Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When handling appendentries_response, where do the non-raft fields come from? #13

Open
pmembrey opened this issue Aug 7, 2015 · 7 comments

Comments

@pmembrey
Copy link
Contributor

pmembrey commented Aug 7, 2015

Hi,

I've got my implementation to the point where the cluster is now stable, in that a leader can be elected and maintain its authority over the cluster. So far so good :)

However now that I'm implementing raft_recv_appendentries_response I've hit something of a snag. There are two fields (highlighted in the header file as being non-raft) that I don't know how to provide:

/* Non-Raft fields follow: */

/* Non-Raft fields follow: */
/* Having the following fields allows us to do less book keeping in
 * regards to full fledged RPC */
/* This is the highest log IDX we've received and appended to our log */
int current_idx;
/* The first idx that we received within the appendentries message */
int first_idx;

What should I be putting in current_idx and first_idx and where should I get it from?

@pmembrey
Copy link
Contributor Author

pmembrey commented Aug 7, 2015

If I leave them as zero (there are no log entries at this stage committed or otherwise as it's a fresh start), it core dumps with:

raftxmpp: src/raft_log.c:97: log_get_from_idx: Assertion `0 <= idx - 1' failed.

@willemt
Copy link
Owner

willemt commented Aug 7, 2015

Hi Pete

You don't need to populate those two fields. Those fields are populated by the raft_recv_appendentries function

I'll have a look, but I don't think the raft library will result in an operation that results in that assertion firing. It might be caused by the way you're interfacing with the library (this happened to me, when my deserializer wasn't populating the msg_appendentries_response_t fields correctly!)

@pmembrey
Copy link
Contributor Author

pmembrey commented Aug 7, 2015

Also just got this after a period of time running:

raftxmpp: src/raft_server.c:183: raft_recv_appendentries_response: Assertion `0 <= raft_node_get_next_idx(p)' failed.
Aborted (core dumped)

Perfectly willing to believe it's something I'm doing :)

However this is what I'm putting in:

msg_appendentries_response_t  *res = calloc(1,sizeof(msg_appendentries_response_t));
res->success = success; // either 1 or 0 - but the cluster doens't send data so always 1
res->term = term; // Doesn't change often - but in small digits - doesn't die right away
raft_recv_appendentries_response(data->raft_server, other_node, res);

Basically, the whole thing is initialized to zero and if I don't set something in those other fields, it is guaranteed to fail that assert....

@pmembrey
Copy link
Contributor Author

pmembrey commented Aug 7, 2015

I've created a unit test for it here:

https://gist.github.com/pmembrey/f581e503ae303c06098a

Hopefully this will make it obvious what I'm doing wrong :)

I'm running this on Ubuntu...

@pmembrey
Copy link
Contributor Author

pmembrey commented Aug 7, 2015

For the second issue I mentioned, I think it's being caused by a race condition. I'm going to try and replicate it tonight in a unit test. What seems to happen is this:

  • Node A is leader and sends out append entries to Node B
  • Node A receives append entries from Node B with higher term
  • Node A becomes a follower
  • Node A receives an append entries reply
  • Assert fails

Here's the assert again:

raftxmpp: src/raft_server.c:183: raft_recv_appendentries_response: Assertion `0 <= raft_node_get_next_idx(p)' failed.

@willemt
Copy link
Owner

willemt commented Aug 7, 2015

Re: the raft_recv_appendentries_response issue, you should never hand populate the fields. You should use raft_recv_appendentries for this. That's why it's failing in your unit test. This is how to populate it:

msg_appendentries_response_t res;
int e = raft_recv_appendentries(sv->raft, conn->node_idx, &ae, &res);

I will release the example app tonight, so that should help explain things :)

Re: the 2nd issue, that looks interesting, I will have a look at that.

@willemt
Copy link
Owner

willemt commented Aug 10, 2015

Re: the `0 <= raft_node_get_next_idx(p)' bug. I was able to reproduce this on ticketd. I've made the fix here: f297457

Can you please check if you're still experiencing the issue?

Cheers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants