Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ilm race condition #109230

Closed
wants to merge 3 commits into from
Closed

Conversation

sscimff
Copy link

@sscimff sscimff commented May 31, 2024

Fixes race condition in ILM Shrink action (#109206). Ensures ILM execution state is copied before execution, improving stability.

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v8.15.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels May 31, 2024
@sscimff sscimff closed this May 31, 2024
@sscimff sscimff reopened this May 31, 2024
Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensures ILM execution state is copied before execution, improving stability

I am not sure I understand this, can you elaborate? This does not look like it copies execution state, rather it waits for an always true condition AFAICS.

We also need a test demonstrating the issue, will you add that too?

@@ -30,6 +30,7 @@
*/
public class ClusterStateObserver {

public static org.elasticsearch.xpack.core.ml.MlMetadata.Builder ChangePredicate;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we revert this?

}
private boolean shrunkenIndexAllocated(ClusterState clusterState, String shrunkenIndexName) {
IndexMetadata indexMetadata = clusterState.metadata().index(shrunkenIndexName);
return indexMetadata != null && indexMetadata.getState() == IndexMetadata.State.OPEN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not ensure that the index is allocated, only that it exists and is open.

getClient().admin().indices().resizeIndex(resizeRequest, listener.delegateFailure((l, response) -> {
ClusterStateObserver.Listener clusterStateListener = new ClusterStateObserver.Listener() {
@Override
public void onNewClusterState(ClusterState state) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the right way to use an observer, instead you should pass the condition as a predicate to waitForNextChange.

@nielsbauman nielsbauman added :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team and removed needs:triage Requires assignment of a team area label labels May 31, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@nielsbauman
Copy link
Contributor

Hi @sscimff, thank you for your interest in Elasticsearch and for taking the time to open a PR! I see you haven't addressed @henningandersen's comments yet. Are you still planning to work on this PR? The reason I'm asking is that we've discussed this issue (#109206) internally, and we're inclined to go with a different approach from the one you've taken here.

If you're still interested in fixing this issue, I'm happy to explain our proposed solution to you. Otherwise, I'll close your PR and open one of my own.

@sscimff
Copy link
Author

sscimff commented Jun 20, 2024

Hi @nielsbauman thank you for the message. If you have another approach and decide to implement it. Please go ahead and close this PR.

@nielsbauman
Copy link
Contributor

We've decided internally on a different approach for this issue (see #109206 (comment)). We'll be picking this up internally as well.

I'm going to close this PR, thanks nonetheless for your efforts @sscimff!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/ILM+SLM Index and Snapshot lifecycle management external-contributor Pull request authored by a developer outside the Elasticsearch team Team:Data Management Meta label for data/management team v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants