[CI] TransformIndexerTests testMaxPageSearchSizeIsResetToConfiguredValue failing #109844

Closed
idegtiarenko opened this issue Jun 18, 2024 · 4 comments · Fixed by #109876
Labels
low-risk An open issue or test failure that is a low risk to future releases :ml/Transform Transform Team:ML Meta label for the ML team >test-failure Triaged test failures from CI

Comments

@idegtiarenko
Contributor

Build scan:
https://gradle-enterprise.elastic.co/s/u6xe6p6it5iwy/tests/:x-pack:plugin:transform:test/org.elasticsearch.xpack.transform.transforms.TransformIndexerTests/testMaxPageSearchSizeIsResetToConfiguredValue

Reproduction line:

./gradlew ':x-pack:plugin:transform:test' --tests "org.elasticsearch.xpack.transform.transforms.TransformIndexerTests.testMaxPageSearchSizeIsResetToConfiguredValue" -Dtests.seed=54112B9845B21E41 -Dtests.locale=fr-FR -Dtests.timezone=Asia/Macao -Druntime.java=22

Applicable branches:
main

Reproduces locally?:
Didn't try

Failure history:
Failure dashboard for org.elasticsearch.xpack.transform.transforms.TransformIndexerTests#testMaxPageSearchSizeIsResetToConfiguredValue

Failure excerpt:

java.lang.AssertionError: (No message provided)

  at org.junit.Assert.fail(Assert.java:87)
  at org.junit.Assert.assertTrue(Assert.java:42)
  at org.junit.Assert.assertTrue(Assert.java:53)
  at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests$MockedTransformIndexer.waitForAfterFinishOrFailureLatch(TransformIndexerTests.java:317)
  at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests.testMaxPageSearchSizeIsResetToConfiguredValue(TransformIndexerTests.java:601)
  at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  at java.lang.reflect.Method.invoke(Method.java:580)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:1570)

@idegtiarenko idegtiarenko added :ml/Transform Transform >test-failure Triaged test failures from CI labels Jun 18, 2024
@elasticsearchmachine elasticsearchmachine added Team:ML Meta label for the ML team needs:risk Requires assignment of a risk label (low, medium, blocker) labels Jun 18, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@idegtiarenko
Contributor Author

Please note that there are two possible failures in this test (not sure if they are related).

The one above is caused by:

Jun 16, 2024 10:08:57 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
WARNING: Uncaught exception in thread: Thread[#97,elasticsearch[generic][generic][T#3],5,TGRP-TransformIndexerTests]	
java.lang.AssertionError: expected:<10000> but was:<250>	
	at __randomizedtesting.SeedInfo.seed([928723993084ECCF]:0)	
	at org.junit.Assert.fail(Assert.java:89)	
	at org.junit.Assert.failNotEquals(Assert.java:835)	
	at org.junit.Assert.assertEquals(Assert.java:647)	
	at org.junit.Assert.assertEquals(Assert.java:633)	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests.lambda$testMaxPageSearchSizeIsResetToConfiguredValue$8(TransformIndexerTests.java:589)	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests$MockedTransformIndexer.onFinish(TransformIndexerTests.java:281)	
	at org.elasticsearch.xpack.core.indexing.AsyncTwoPhaseIndexer.onSearchResponse(AsyncTwoPhaseIndexer.java:542)	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249)	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests$MockedTransformIndexer.lambda$doNextSearch$0(TransformIndexerTests.java:227)	
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917)	
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)	
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)	
	at java.base/java.lang.Thread.run(Thread.java:1570)

Another one:

[2024-06-17T13:14:20,030][WARN ][o.e.x.t.t.TransformFailureHandler] [generic] [tEnbHohzlJ] Transform encountered an exception: [java.lang.InterruptedException]; Will automatically retry [1/10]	
java.lang.IllegalStateException: java.lang.InterruptedException	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests$MockedTransformIndexer.doNextSearch(TransformIndexerTests.java:221) ~[test/:?]	
	at org.elasticsearch.xpack.core.indexing.AsyncTwoPhaseIndexer.triggerNextSearch(AsyncTwoPhaseIndexer.java:643) ~[x-pack-core-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.core.indexing.AsyncTwoPhaseIndexer.nextSearch(AsyncTwoPhaseIndexer.java:630) ~[x-pack-core-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.core.indexing.AsyncTwoPhaseIndexer.lambda$maybeTriggerAsyncJob$2(AsyncTwoPhaseIndexer.java:238) ~[x-pack-core-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$6(TransformIndexer.java:302) ~[main/:?]	
	at org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:245) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$7(TransformIndexer.java:334) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.common.AbstractCompositeAggFunction.getInitialProgressFromResponse(AbstractCompositeAggFunction.java:206) ~[main/:?]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$10(TransformIndexer.java:331) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests$MockedTransformIndexer.doGetInitialProgress(TransformIndexerTests.java:172) ~[test/:?]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$13(TransformIndexer.java:330) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$createCheckpoint$0(TransformIndexer.java:246) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.persistence.InMemoryTransformConfigManager.putTransformCheckpoint(InMemoryTransformConfigManager.java:75) ~[test/:?]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$createCheckpoint$3(TransformIndexer.java:244) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.checkpoint.MockTimebasedCheckpointProvider.createNextCheckpoint(MockTimebasedCheckpointProvider.java:42) ~[test/:?]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.createCheckpoint(TransformIndexer.java:241) ~[main/:?]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$14(TransformIndexer.java:312) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$16(TransformIndexer.java:374) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests$MockedTransformIndexer.doGetFieldMappings(TransformIndexerTests.java:270) ~[test/:?]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$18(TransformIndexer.java:381) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$19(TransformIndexer.java:395) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.persistence.InMemoryTransformConfigManager.getTransformConfiguration(InMemoryTransformConfigManager.java:218) ~[test/:?]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.lambda$onStart$21(TransformIndexer.java:388) ~[main/:?]	
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:249) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexer.onStart(TransformIndexer.java:451) ~[main/:?]	
	at org.elasticsearch.xpack.core.indexing.AsyncTwoPhaseIndexer.lambda$maybeTriggerAsyncJob$3(AsyncTwoPhaseIndexer.java:235) ~[x-pack-core-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917) ~[elasticsearch-8.15.0-SNAPSHOT.jar:8.15.0-SNAPSHOT]	
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]	
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]	
	at java.lang.Thread.run(Thread.java:1570) ~[?:?]	
Caused by: java.lang.InterruptedException	
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1100) ~[?:?]	
	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230) ~[?:?]	
	at org.elasticsearch.xpack.transform.transforms.TransformIndexerTests$MockedTransformIndexer.doNextSearch(TransformIndexerTests.java:219) ~[test/:?]	
	... 39 more
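
For readers unfamiliar with the shape of this second trace: it shows `CountDownLatch.await` throwing `InterruptedException` inside `doNextSearch`, which is then rethrown as an `IllegalStateException`. The following is a minimal sketch of that pattern only. The class and method names are hypothetical and not taken from `TransformIndexerTests`, and what interrupts the worker in the real test is not stated in this thread.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration; not the code from TransformIndexerTests.
final class LatchInterruptSketch {

    /** Blocks on the latch and wraps an interrupt in an unchecked exception. */
    static void awaitOrFail(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Interrupts a worker that waits on a latch nobody counts down, and returns what it threw. */
    static Throwable interruptWhileWaiting() {
        CountDownLatch latch = new CountDownLatch(1);
        Throwable[] seen = new Throwable[1];
        Thread worker = new Thread(() -> {
            try {
                awaitOrFail(latch);
            } catch (IllegalStateException e) {
                seen[0] = e;
            }
        });
        worker.start();
        // If the interrupt lands before await() is reached, await() still throws immediately.
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(interruptWhileWaiting());
    }
}
```

The wrapped exception carries the `InterruptedException` as its cause, which is the same "Caused by" relationship that the log above shows.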

@prwhelan prwhelan self-assigned this Jun 18, 2024
@prwhelan prwhelan added low-risk An open issue or test failure that is a low risk to future releases and removed needs:risk Requires assignment of a risk label (low, medium, blocker) labels Jun 18, 2024
prwhelan added a commit to prwhelan/elasticsearch that referenced this issue Jun 18, 2024
maxPageSize's update functionality has been reprioritized.  If a user
calls the update API to change the max page size, that update will lock
out other threads from changing the max page size.  Once the lock is
released, the other threads will check again if they can reset the max
page size and otherwise keep the value that the user just updated with.

Fix elastic#109844
@prwhelan
Member

This looks like a race condition between the test thread and the indexer thread. The test thread updates the max page size before the indexer thread does, so the indexer thread then overwrites it with null. In a prod scenario this is very unlikely to happen: the 0th checkpoint always runs before the user can make a second call (unless they're in a multithreaded client application), so the update call always runs after the indexer sets the value to null. I'd still call this a bug, since the test thread is simulating the user and we should always prioritize the user's most recent update call over the initial settings.
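
To make the ordering problem concrete, here is a rough sketch of the "user update wins" idea described in the commit message above: the user-facing update and the indexer-side reset take the same lock, and the reset re-checks whether a user update has already happened before it overwrites anything. This is not the actual change in #109876. The class, its fields, and the fallback value of 500 are all assumptions made for illustration; only the value 10000 comes from the assertion message in this issue.

```java
// Hypothetical illustration; not the actual TransformIndexer code.
final class PageSizeSketch {
    // Illustrative fallback only; the real default is not stated in this thread.
    static final int FALLBACK_PAGE_SIZE = 500;

    private final Object lock = new Object();
    private Integer maxPageSize;   // null means "use the fallback"
    private boolean updatedByUser; // set once the user-facing update has run

    /** User-facing update: sets the value and marks it as user-owned. */
    void updateFromUser(int newSize) {
        synchronized (lock) {
            maxPageSize = newSize;
            updatedByUser = true;
        }
    }

    /** Indexer-side reset: re-checks under the lock and backs off after a user update. */
    void resetFromIndexer(Integer configured) {
        synchronized (lock) {
            if (!updatedByUser) {
                maxPageSize = configured;
            }
        }
    }

    int effectivePageSize() {
        synchronized (lock) {
            return maxPageSize != null ? maxPageSize : FALLBACK_PAGE_SIZE;
        }
    }

    /** Runs the user update and the indexer reset on two threads in whatever order they get scheduled. */
    static int raceOnce() {
        PageSizeSketch holder = new PageSizeSketch();
        Thread user = new Thread(() -> holder.updateFromUser(10_000));
        Thread indexer = new Thread(() -> holder.resetFromIndexer(null));
        user.start();
        indexer.start();
        try {
            user.join();
            indexer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder.effectivePageSize();
    }

    public static void main(String[] args) {
        System.out.println(raceOnce()); // prints 10000 in either order
    }
}
```

Without the `updatedByUser` check, the interleaving "user update first, indexer reset second" would leave the value at null, which is the kind of outcome the failing assertion caught.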

elasticsearchmachine added a commit that referenced this issue Jun 21, 2024
…ts testMaxPageSearchSizeIsResetToConfiguredValue #109844
prwhelan added a commit to prwhelan/elasticsearch that referenced this issue Jul 5, 2024
prwhelan added a commit that referenced this issue Jul 9, 2024
…0537)

This was fixed as part of PR#109876.

Relate #109844

Co-authored-by: Elastic Machine <[email protected]>