How to handle majority of nodes getting replaced? #209

Open
kishorenc opened this issue May 4, 2020 · 1 comment


kishorenc commented May 4, 2020

Let's say we have a 3-node cluster consisting of nodes: A (leader), B and C.

  1. A and B are killed and are replaced with new nodes D and E having different IP addresses.
  2. C, which is a follower, keeps trying to reconnect to the old leader A.
  3. In the meantime, the new nodes D and E form a quorum and elect D as leader.
  4. The new leader D tries to connect to C, but C rejects it because C's peering configuration does not yet contain the new IP addresses of D and E.
  5. C never recovers and does not rejoin the cluster, and I see many "reject term_unmatched AppendEntries" log lines on C.
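
To make the quorum arithmetic behind steps 2 and 3 concrete, here is a minimal sketch in plain Python. It does not use any Raft library. The helper names `majority` and `has_quorum` are hypothetical, and it assumes that D and E were started with {C, D, E} as their configuration, which the description above implies but does not state.

```python
from typing import Set


def majority(voters: int) -> int:
    """Smallest number of votes that is more than half of `voters`."""
    return voters // 2 + 1


def has_quorum(config: Set[str], reachable: Set[str]) -> bool:
    """True if enough members of `config` are reachable to win a vote."""
    return len(config & reachable) >= majority(len(config))


# C still believes the cluster is {A, B, C}, but only C itself is alive.
c_view = {"A", "B", "C"}
# Assumption: D and E were started with {C, D, E} as their configuration.
de_view = {"C", "D", "E"}

c_can_elect = has_quorum(c_view, reachable={"C"})
de_can_elect = has_quorum(de_view, reachable={"D", "E"})
print(c_can_elect, de_can_elect)  # False True
```

Under these assumptions, D and E can elect a leader without C, while C alone can never reach a majority of its old configuration, which matches C staying stuck.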

How should I handle this scenario in my application? Assuming I already have a service that I can query for information about the new instances D and E, how do I dynamically update the peering configuration of C (a follower) so that it rejoins the cluster?

I tried restarting the C process with the new peers, but that did not work, even when I deleted the state directory. So I'm currently terminating node C so that it is replaced by a fresh node with a new IP address, and that works.


PFZheng commented May 22, 2020

You should use the add_peer/remove_peer/change_peers interfaces to change the configuration.
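
As a rough illustration of how an application might decide which membership changes to request, here is a minimal Python sketch. It does not call any Raft library: the strings "add_peer" and "remove_peer" are only labels that mirror the interface names in the reply above, `plan_membership_changes` is a hypothetical helper, and the addresses are made up. In Raft-style systems such changes are generally proposed through the current leader and need a working quorum, so a plan like this would be applied from the leader's side rather than from an isolated follower.

```python
from typing import List, Set, Tuple


def plan_membership_changes(
    current: Set[str], desired: Set[str]
) -> List[Tuple[str, str]]:
    """Return ordered (operation, peer) steps to move from current to desired.

    Adds are listed before removes so the voter set does not shrink first.
    """
    steps = [("add_peer", peer) for peer in sorted(desired - current)]
    steps += [("remove_peer", peer) for peer in sorted(current - desired)]
    return steps


# Hypothetical addresses: the old peer set, and the set a discovery service reports.
current = {"10.0.0.1:8107", "10.0.0.2:8107", "10.0.0.3:8107"}
desired = {"10.0.0.3:8107", "10.0.0.4:8107", "10.0.0.5:8107"}

plan = plan_membership_changes(current, desired)
for operation, peer in plan:
    print(operation, peer)
```

Each step is a single-peer change, so it can be requested one at a time, waiting for the previous one to complete before issuing the next.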
