
[Transform] Ignore errors from existing destination indices #109853

Closed · prwhelan opened this issue on Jun 18, 2024 · 1 comment · Fixed by #109886
Assignees: prwhelan
Labels: >bug, low-risk, :ml/Transform, Team:ML

Comments

@prwhelan (Member)

Description

As a result of this change (#108394), we're seeing a handful of errors like the following:

org.elasticsearch.transport.RemoteTransportException: [<>][<>][indices:admin/create]
Caused by: [.metrics-endpoint.metadata_united_default/<>] org.elasticsearch.ResourceAlreadyExistsException: index [.metrics-endpoint.metadata_united_default/<>] already exists
	at org.elasticsearch.cluster.metadata.MetadataCreateIndexService.validateIndexName(MetadataCreateIndexService.java:177)
	at org.elasticsearch.cluster.metadata.MetadataCreateIndexService.validate(MetadataCreateIndexService.java:1371)
	at org.elasticsearch.cluster.metadata.MetadataCreateIndexService.applyCreateIndexRequest(MetadataCreateIndexService.java:345)
	at org.elasticsearch.cluster.metadata.MetadataCreateIndexService$1.execute(MetadataCreateIndexService.java:300)
	at org.elasticsearch.cluster.service.MasterService$UnbatchedExecutor.execute(MasterService.java:551)
	at org.elasticsearch.cluster.service.MasterService.innerExecuteTasks(MasterService.java:1050)
	at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:1015)
	at org.elasticsearch.cluster.service.MasterService.executeAndPublishBatch(MasterService.java:233)
	at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.lambda$run$2(MasterService.java:1656)
	at org.elasticsearch.action.ActionListener.run(ActionListener.java:433)
	at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.run(MasterService.java:1653)
	at org.elasticsearch.cluster.service.MasterService$5.lambda$doRun$0(MasterService.java:1248)
	at org.elasticsearch.action.ActionListener.run(ActionListener.java:433)
	at org.elasticsearch.cluster.service.MasterService$5.doRun(MasterService.java:1227)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:984)
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.lang.Thread.run(Thread.java:1570)

That should be handled in the transform's destination index creation error handling: we just need to unwrap the RemoteTransportException and handle the nested ResourceAlreadyExistsException.
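To make that concrete, here is a minimal sketch of the unwrapping, assuming a listener-based flow around destination index creation; the class and method names below are hypothetical and not taken from #109886. ExceptionsHelper.unwrapCause() exposes the nested cause because RemoteTransportException implements ElasticsearchWrapperException.

```java
// A minimal sketch, not the actual change from elastic#109886: wrap a creation
// listener so that a ResourceAlreadyExistsException nested inside a
// RemoteTransportException is treated as "destination index already exists"
// rather than as a failure. Class and method names here are hypothetical.
import org.elasticsearch.ExceptionsHelper;
import org.elasticsearch.ResourceAlreadyExistsException;
import org.elasticsearch.action.ActionListener;

final class IgnoreExistingDestIndex {

    /**
     * @param delegate receives true if the destination index was created,
     *                 false if it turned out to already exist.
     */
    static ActionListener<Boolean> ignoreAlreadyExists(ActionListener<Boolean> delegate) {
        return ActionListener.wrap(delegate::onResponse, e -> {
            // RemoteTransportException implements ElasticsearchWrapperException,
            // so unwrapCause() exposes the nested ResourceAlreadyExistsException.
            if (ExceptionsHelper.unwrapCause(e) instanceof ResourceAlreadyExistsException) {
                delegate.onResponse(false); // index already exists: not an error
            } else {
                delegate.onFailure(e);
            }
        });
    }
}
```

The actual fix in #109886 may be shaped differently; the point is simply that the check has to run against the unwrapped cause rather than the top-level RemoteTransportException.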

This has only happened 26 times in the last two weeks, and the checkpoint is retried and succeeds, so this seems low risk.

@prwhelan added the >bug, :ml/Transform, Team:ML, and low-risk labels on Jun 18, 2024
@prwhelan self-assigned this on Jun 18, 2024
@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

prwhelan added a commit to prwhelan/elasticsearch that referenced this issue Jun 18, 2024
Exception Handler now unwraps the ResourceAlreadyExistsException when it
originates from a remote location.

Fix elastic#109853