CASSANDRA-19705: Reconfigure CMS after replacement, bootstrap and move operations #3375

Open: wants to merge 2 commits into base: trunk

krummas
Member

@krummas krummas commented Jun 14, 2024

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

@krummas krummas requested a review from beobal June 14, 2024 11:34
try
{
    if (PrepareCMSReconfiguration.needsReconfiguration(metadata))
        reconfigureCMS(ReplicationParams.meta(metadata));
Contributor

There's a comment on reconfigureCMS saying it's only to be used for interactive purposes due to its lack of retries. I think this comment is simply outdated, as retries are now baked into the processor implementations.

Member Author

Yeah, either way I think we're good here: if the reconfiguration fails, we'll alert the operator via metrics and nodetool cms.

@@ -373,6 +373,20 @@ public void reconfigureCMS(ReplicationParams replicationParams)
    InProgressSequences.finishInProgressSequences(ReconfigureCMS.SequenceKey.instance);
}

public void maybeReconfigureCMS(ClusterMetadata metadata)
Contributor

I think it might actually be better to reconfigure unconditionally, as that removes the window between checking whether we need to and submitting. It's currently safe to do so anyway, as a no-op reconfiguration just adds 2 entries to the log, something like:

20 |  72058140398682507 | 2024-06-20 12:40:49.681000+0000 |         ADVANCE_CMS_RECONFIGURATION | AdvanceCMSReconfiguration{idx=0, current=FinishReconfiguration(), diff=Diff{additions=[], removals=[]}, activeTransition=null}
19 |  72058140398682506 | 2024-06-20 12:40:49.631000+0000 | PREPARE_COMPLEX_CMS_RECONFIGURATION | PrepareCMSReconfiguration#Complex{replicationParams=ReplicationParams{class=org.apache.cassandra.locator.MetaStrategy, datacenter1=3}}

but we could reduce this to just the prepare entry if we move the check to PrepareCMSReconfiguration::execute

if (newCms.equals(currentCms))
{
    logger.info("Proposed CMS reconfiguration resulted in no required modifications at epoch {}", prev.epoch.getEpoch());
    return Transformation.success(prev.transformer(), LockedRanges.AffectedRanges.EMPTY);
}
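
To illustrate why moving the check into execute closes the check-then-submit window, here is a minimal sketch. It is not Cassandra code: `NoOpCheckSketch`, `State`, `Result` and `needsAdvance` are hypothetical stand-ins, and the real `PrepareCMSReconfiguration` works on `ClusterMetadata` rather than a plain set of node names. The point is only that the comparison runs against the state the transformation is actually applied to, so a no-op costs one log entry and no follow-up step.

```java
import java.util.Set;

// Hypothetical stand-ins for illustration only; none of these are Cassandra classes.
final class NoOpCheckSketch
{
    /** Immutable snapshot of metadata: an epoch plus the current CMS members. */
    static final class State
    {
        final long epoch;
        final Set<String> cms;

        State(long epoch, Set<String> cms)
        {
            this.epoch = epoch;
            this.cms = Set.copyOf(cms);
        }
    }

    /** Outcome of the prepare step: the next state and whether an advance step is still needed. */
    static final class Result
    {
        final State next;
        final boolean needsAdvance;

        Result(State next, boolean needsAdvance)
        {
            this.next = next;
            this.needsAdvance = needsAdvance;
        }
    }

    /**
     * Applies a proposed CMS membership to prev. The equality check happens here,
     * against the state being transformed, so the caller never has to check first.
     * Either way the prepare step itself is recorded (the epoch moves forward by one).
     */
    static Result execute(State prev, Set<String> proposedCms)
    {
        if (proposedCms.equals(prev.cms))
            return new Result(new State(prev.epoch + 1, prev.cms), false); // no-op: nothing to advance
        return new Result(new State(prev.epoch + 1, proposedCms), true);
    }
}
```

With this shape the caller can submit every time and let the transformation decide whether any membership change is required.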

catch (Throwable t)
{
    JVMStabilityInspector.inspectThrowable(t);
    logger.error("Could not reconfigure CMS, operator should run `nodetool cms reconfigure` to make sure CMS placement is correct", t);
Contributor

If we do change this to submit the reconfig unconditionally, it would be good to still log something like this in the event that the submission fails, maybe at WARN?
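
A rough sketch of what that could look like follows. Everything in it is an assumption for illustration: `ReconfigureWarnSketch` and `Submitter` are hypothetical names, `java.util.logging` stands in for the project's own logger, and the sketch catches `Exception` rather than `Throwable` and omits the `JVMStabilityInspector` call from the patch.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; not the code from this patch.
final class ReconfigureWarnSketch
{
    private static final Logger logger = Logger.getLogger(ReconfigureWarnSketch.class.getName());

    /** Stand-in for whatever actually submits the CMS reconfiguration. */
    interface Submitter
    {
        void submit() throws Exception;
    }

    /**
     * Submits unconditionally. On failure, logs at WARNING with the same operator
     * hint as the patch and returns false instead of propagating the exception.
     */
    static boolean submitAndWarnOnFailure(Submitter submitter)
    {
        try
        {
            submitter.submit();
            return true;
        }
        catch (Exception e)
        {
            logger.log(Level.WARNING,
                       "Could not reconfigure CMS, operator should run `nodetool cms reconfigure` to make sure CMS placement is correct",
                       e);
            return false;
        }
    }
}
```

Returning a boolean is only a convenience for the sketch; the method in the patch is void and relies on the log line plus metrics to alert the operator.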

2 participants