[CI] TasksIT testGetTaskWaitForCompletionWithoutStoringResult failing #106043

Closed
benwtrent opened this issue Mar 6, 2024 · 3 comments
Labels
:Distributed/Task Management Issues for anything around the Tasks API - both persistent and node level. medium-risk An open issue or test failure that is a medium risk to future releases Team:Distributed Meta label for distributed team >test-failure Triaged test failures from CI

Comments

@benwtrent
Member

This looks like an odd race condition. Either the .tasks index doesn't exist, or we checked its status too soon after it was created, or something along those lines.
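To make the suspected sequence concrete, here is a minimal sketch of one way such a failure could arise, going only by the failure excerpt in this report: the lookup does not find the task running, falls back to stored results, and finds no .tasks index. This is a simplified model, not the actual TransportGetTaskAction logic; `TaskLookupSketch`, `start`, `finish`, and `get` are hypothetical names, and the interleaving is replayed sequentially rather than with real threads.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a node's task bookkeeping; not Elasticsearch code.
class TaskLookupSketch {
    // Tasks that are currently executing, keyed by task id.
    private final Map<String, String> running = new HashMap<>();
    // Stand-in for the .tasks index; null models "index was never created".
    private Map<String, String> tasksIndex;

    void start(String id) {
        running.put(id, "running");
    }

    // Completes a task; the result is persisted only when storeResult is true.
    void finish(String id, boolean storeResult) {
        running.remove(id);
        if (storeResult) {
            if (tasksIndex == null) {
                tasksIndex = new HashMap<>();
            }
            tasksIndex.put(id, "completed");
        }
    }

    // Looks for a running task first, then falls back to stored results.
    String get(String id) {
        String state = running.get(id);
        if (state != null) {
            return state;
        }
        if (tasksIndex == null || !tasksIndex.containsKey(id)) {
            throw new IllegalStateException(
                "task [" + id + "] isn't running and hasn't stored its results");
        }
        return tasksIndex.get(id);
    }

    public static void main(String[] args) {
        TaskLookupSketch node = new TaskLookupSketch();
        node.start("t1");
        System.out.println(node.get("t1"));
        // The task finishes without storing a result before the next lookup.
        node.finish("t1", false);
        try {
            node.get("t1");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, any lookup that arrives after the task has left the running map fails whenever the result was not stored, which matches the shape of the exception chain in the excerpt but does not establish the real root cause.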

Build scan:
https://gradle-enterprise.elastic.co/s/p6l3ddtlwelwk/tests/:server:internalClusterTest/org.elasticsearch.action.admin.cluster.node.tasks.TasksIT/testGetTaskWaitForCompletionWithoutStoringResult

Reproduction line:

./gradlew ':server:internalClusterTest' --tests "org.elasticsearch.action.admin.cluster.node.tasks.TasksIT.testGetTaskWaitForCompletionWithoutStoringResult" -Dtests.seed=D5FAF539943B576A -Dtests.locale=pl-PL -Dtests.timezone=America/Belem -Druntime.java=21

Applicable branches:
main

Reproduces locally?:
No

Failure history:
Failure dashboard for org.elasticsearch.action.admin.cluster.node.tasks.TasksIT#testGetTaskWaitForCompletionWithoutStoringResult

Failure excerpt:

java.util.concurrent.ExecutionException: org.elasticsearch.transport.RemoteTransportException: [node_s2][127.0.0.1:18702][cluster:monitor/task/get]

  at __randomizedtesting.SeedInfo.seed([D5FAF539943B576A:B3ABA602864640A6]:0)
  at org.elasticsearch.action.support.PlainActionFuture$Sync.getValue(PlainActionFuture.java:287)
  at org.elasticsearch.action.support.PlainActionFuture$Sync.get(PlainActionFuture.java:274)
  at org.elasticsearch.action.support.PlainActionFuture.get(PlainActionFuture.java:93)
  at org.elasticsearch.client.internal.support.AbstractClient$RefCountedFuture.get(AbstractClient.java:1534)
  at org.elasticsearch.client.internal.support.AbstractClient$RefCountedFuture.get(AbstractClient.java:1514)
  at org.elasticsearch.action.admin.cluster.node.tasks.TasksIT.waitForCompletionTestCase(TasksIT.java:617)
  at org.elasticsearch.action.admin.cluster.node.tasks.TasksIT.testGetTaskWaitForCompletionWithoutStoringResult(TasksIT.java:565)
  at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  at java.lang.reflect.Method.invoke(Method.java:580)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:1583)

  Caused by: org.elasticsearch.transport.RemoteTransportException: [node_s2][127.0.0.1:18702][cluster:monitor/task/get]


    Caused by: org.elasticsearch.ResourceNotFoundException: task [FzH5E4EYT1a-chDD8aVTxA:363] isn't running and hasn't stored its results

      at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.lambda$getFinishedTaskFromIndex$6(TransportGetTaskAction.java:214)
      at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:62)
      at org.elasticsearch.action.ActionListener$2.onFailure(ActionListener.java:179)
      at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:62)
      at org.elasticsearch.action.ActionListenerImplementations.safeOnFailure(ActionListenerImplementations.java:73)
      at org.elasticsearch.action.DelegatingActionListener.onFailure(DelegatingActionListener.java:31)
      at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:39)
      at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:62)
      at org.elasticsearch.action.ActionListenerImplementations.safeOnFailure(ActionListenerImplementations.java:73)
      at org.elasticsearch.action.ActionListener$3.onFailure(ActionListener.java:324)
      at org.elasticsearch.tasks.TaskManager$1.onFailure(TaskManager.java:214)
      at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:62)
      at org.elasticsearch.action.ActionListenerImplementations.safeOnFailure(ActionListenerImplementations.java:73)
      at org.elasticsearch.action.DelegatingActionListener.onFailure(DelegatingActionListener.java:31)
      at org.elasticsearch.action.ActionListenerImplementations$RunBeforeActionListener.onFailure(ActionListenerImplementations.java:317)
      at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:62)
      at org.elasticsearch.action.ActionListenerImplementations.safeOnFailure(ActionListenerImplementations.java:73)
      at org.elasticsearch.action.ActionListener$3.onFailure(ActionListener.java:324)
      at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:103)
      at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:68)
      at org.elasticsearch.tasks.TaskManager.registerAndExecute(TaskManager.java:196)
      at org.elasticsearch.client.internal.node.NodeClient.executeLocally(NodeClient.java:105)
      at org.elasticsearch.client.internal.node.NodeClient.doExecute(NodeClient.java:83)
      at org.elasticsearch.client.internal.support.AbstractClient.execute(AbstractClient.java:356)
      at org.elasticsearch.client.internal.FilterClient.doExecute(FilterClient.java:54)
      at org.elasticsearch.client.internal.OriginSettingClient.doExecute(OriginSettingClient.java:43)
      at org.elasticsearch.client.internal.support.AbstractClient.execute(AbstractClient.java:356)
      at org.elasticsearch.client.internal.support.AbstractClient.get(AbstractClient.java:456)
      at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.getFinishedTaskFromIndex(TransportGetTaskAction.java:210)
      at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.getRunningTaskFromNode(TransportGetTaskAction.java:138)
      at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.doExecute(TransportGetTaskAction.java:90)
      at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.doExecute(TransportGetTaskAction.java:60)
      at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:96)
      at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:68)
      at org.elasticsearch.action.support.HandledTransportAction.lambda$new$0(HandledTransportAction.java:50)
      at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
      at org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:288)
      at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:273)
      at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:115)
      at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:96)
      at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:821)
      at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:124)
      at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:96)
      at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:61)
      at org.elasticsearch.transport.netty4.Netty4MessageInboundHandler.channelRead(Netty4MessageInboundHandler.java:48)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
      at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
      at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
      at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
      at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
      at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
      at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
      at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
      at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
      at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
      at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
      at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
      at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
      at java.lang.Thread.run(Thread.java:1583)

      Caused by: org.elasticsearch.index.IndexNotFoundException: no such index [.tasks]

        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.notFoundException(IndexNameExpressionResolver.java:553)
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$ExplicitResourceNameFilter.ensureAliasOrIndexExists(IndexNameExpressionResolver.java:1712)
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$ExplicitResourceNameFilter.filterUnavailable(IndexNameExpressionResolver.java:1692)
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.resolveExpressions(IndexNameExpressionResolver.java:252)
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:340)
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:299)
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:285)
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteSingleIndex(IndexNameExpressionResolver.java:632)
        at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction.<init>(TransportSingleShardAction.java:161)
        at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:106)
        at org.elasticsearch.action.support.single.shard.TransportSingleShardAction.doExecute(TransportSingleShardAction.java:53)
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:96)
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:68)
        at org.elasticsearch.tasks.TaskManager.registerAndExecute(TaskManager.java:196)
        at org.elasticsearch.client.internal.node.NodeClient.executeLocally(NodeClient.java:105)
        at org.elasticsearch.client.internal.node.NodeClient.doExecute(NodeClient.java:83)
        at org.elasticsearch.client.internal.support.AbstractClient.execute(AbstractClient.java:356)
        at org.elasticsearch.client.internal.FilterClient.doExecute(FilterClient.java:54)
        at org.elasticsearch.client.internal.OriginSettingClient.doExecute(OriginSettingClient.java:43)
        at org.elasticsearch.client.internal.support.AbstractClient.execute(AbstractClient.java:356)
        at org.elasticsearch.client.internal.support.AbstractClient.get(AbstractClient.java:456)
        at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.getFinishedTaskFromIndex(TransportGetTaskAction.java:210)
        at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.getRunningTaskFromNode(TransportGetTaskAction.java:138)
        at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.doExecute(TransportGetTaskAction.java:90)
        at org.elasticsearch.action.admin.cluster.node.tasks.get.TransportGetTaskAction.doExecute(TransportGetTaskAction.java:60)
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:96)
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:68)
        at org.elasticsearch.action.support.HandledTransportAction.lambda$new$0(HandledTransportAction.java:50)
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
        at org.elasticsearch.transport.InboundHandler.doHandleRequest(InboundHandler.java:288)
        at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:273)
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:115)
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:96)
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:821)
        at org.elasticsearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:124)
        at org.elasticsearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:96)
        at org.elasticsearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:61)
        at org.elasticsearch.transport.netty4.Netty4MessageInboundHandler.channelRead(Netty4MessageInboundHandler.java:48)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at java.lang.Thread.run(Thread.java:1583)
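
The excerpt above nests four exceptions (ExecutionException, RemoteTransportException, ResourceNotFoundException, IndexNotFoundException), and the innermost one is the most informative. As a reading aid, the following sketch walks a cause chain and picks out the root cause. It uses only JDK exception types as stand-ins for the Elasticsearch classes, and `CauseChainSketch`, `causeChain`, and `rootCause` are hypothetical helper names, not part of the test framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;

class CauseChainSketch {
    // Lists "ClassName: message" for each throwable in the chain, outermost first.
    static List<String> causeChain(Throwable t) {
        List<String> chain = new ArrayList<>();
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            chain.add(cur.getClass().getSimpleName() + ": " + cur.getMessage());
        }
        return chain;
    }

    // Follows getCause() until it reaches the innermost throwable.
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null) {
            cur = cur.getCause();
        }
        return cur;
    }

    public static void main(String[] args) {
        // JDK stand-ins mirroring the four levels of the excerpt.
        Throwable indexMissing = new IllegalArgumentException("no such index [.tasks]");
        Throwable notFound = new IllegalStateException(
            "task isn't running and hasn't stored its results", indexMissing);
        Throwable remote = new RuntimeException("[cluster:monitor/task/get]", notFound);
        Throwable top = new ExecutionException(remote);

        causeChain(top).forEach(System.out::println);
        System.out.println("root cause: " + rootCause(top).getMessage());
    }
}
```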

@benwtrent added the :Distributed/Task Management and >test-failure labels on Mar 6, 2024
@elasticsearchmachine added the blocker and Team:Distributed labels on Mar 6, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@idegtiarenko
Contributor

I suspect this is related to the fix for #97923.
There have been no recent production changes around tasks, so I am going to lower the priority of this one.
Feel free to update it if you disagree.

@idegtiarenko added the medium-risk label and removed the blocker label on Mar 12, 2024
@benwtrent
Member Author

@idegtiarenko that fix is two months old, though, and this failed within the last week?
