A few months back the request lifecycle looked something like this:
1. User sends a request to the routing layer.
2. The routing layer forwards the request to the server.
3. The server responds to the routing layer.
4. The routing layer responds to the user.
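To make the two-hop return path concrete, here is a minimal sketch of that proxy-style flow; all names (`route_for`, `routing_layer_handle`, etc.) are hypothetical and not from the actual codebase.

```python
# A toy sketch (hypothetical names) of the old proxy-style flow, where the
# routing layer relays both the request and the response, so the value
# travels two hops on the way back: server -> routing -> user.

SERVERS = {"server-0": {"foo": "large-value"}}  # toy per-server storage


def route_for(key: str) -> str:
    """Toy placement function: every key lives on server-0."""
    return "server-0"


def server_handle(server_id: str, key: str) -> str:
    return SERVERS[server_id][key]


def routing_layer_handle(key: str) -> str:
    # routing -> server, then the value comes back server -> routing ...
    value = server_handle(route_for(key), key)
    # ... and is forwarded a second time, routing -> user.
    return value


def user_get(key: str) -> str:
    return routing_layer_handle(key)


print(user_get("foo"))  # the value makes two hops on the way back
```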
We concluded that this was bad because the (potentially large) value made two hops on the way back: server to routing, then routing to user. The simple solution at the time was to have the routing layer respond only with the addresses of the servers responsible for the key and let the user communicate with a server directly. The result was a request lifecycle that looks like this:
1. The user sends a key to the routing layer.
2. The routing layer responds with the addresses of the set of servers responsible for that key.
3. The user caches these addresses and sends the request directly to a server.
4. The server responds.
5. The user relies on its cache for future requests until a server receives a message for a key it's not responsible for; the server then tells the user to invalidate its cache for that key and return to step 1.
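A rough sketch of this cached-address flow, again with hypothetical names; it assumes the routing layer's placement map is up to date by the time the user retries after an invalidation.

```python
# Toy sketch (hypothetical names): the routing layer only hands out server
# addresses, the user talks to servers directly, and a misdirected request
# invalidates the user's cached entry for that key.

PLACEMENT = {"foo": ["server-0"]}          # routing layer's placement view
SERVER_DATA = {"server-0": {"foo": "v1"}}  # each server's local store


def routing_lookup(key: str) -> list[str]:
    """Steps 1-2: the user asks routing for the servers responsible for `key`."""
    return PLACEMENT[key]


def server_get(server_id: str, key: str):
    """Steps 3-4: the user talks to the server directly. Returns (ok, value)."""
    store = SERVER_DATA[server_id]
    if key not in store:       # server no longer owns this key
        return False, None     # signals the user to invalidate its cache
    return True, store[key]


class User:
    def __init__(self):
        self.cache: dict[str, list[str]] = {}

    def get(self, key: str):
        if key not in self.cache:
            self.cache[key] = routing_lookup(key)        # steps 1-2
        ok, value = server_get(self.cache[key][0], key)  # steps 3-4
        if not ok:
            # Step 5: invalidate and go back to step 1. Assumes the routing
            # layer's placement has been updated by the time we retry.
            del self.cache[key]
            return self.get(key)
        return value


user = User()
print(user.get("foo"))  # first call: 4 messages; later calls hit the cache: 2
```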
However, we’ve since changed the architecture of the user & routing components to make them asynchronous. When a user sends a request, it includes its own IP address in the request so that the routing layer knows where to respond, because the routing layer might have to make a request of its own to determine the correct replication factor for the key. In light of that change, we should change the request pattern:
1. The user sends a request to the routing layer.
2. The routing layer forwards the request to one of the responsible servers.
3. The server responds directly to the user.
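A minimal sketch of the proposed asynchronous pattern, with the reply address piggybacked on the request; the in-memory `MAILBOXES` dict stands in for the user's socket, and all names are hypothetical.

```python
# Toy sketch (hypothetical names) of the proposed asynchronous flow: the user
# attaches its own address to the request, the routing layer forwards the
# request to a responsible server, and the server replies straight to the user.

SERVER_DATA = {"server-0": {"foo": "v1"}}
PLACEMENT = {"foo": ["server-0"]}
MAILBOXES: dict[str, list] = {"user-1": []}   # stand-in for per-node sockets


def user_send(user_addr: str, key: str) -> None:
    # Hop 1: user -> routing; the request carries the user's reply address.
    routing_forward({"key": key, "reply_to": user_addr})


def routing_forward(request: dict) -> None:
    # Hop 2: routing -> one of the responsible servers (request only, no value).
    server_id = PLACEMENT[request["key"]][0]
    server_handle(server_id, request)


def server_handle(server_id: str, request: dict) -> None:
    # Hop 3: server -> user, bypassing the routing layer on the return path.
    value = SERVER_DATA[server_id][request["key"]]
    MAILBOXES[request["reply_to"]].append(value)


user_send("user-1", "foo")
print(MAILBOXES["user-1"])  # ['v1'] delivered directly by the server
```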
Pros:
- Reduces the maximum number of roundtrips for a request from 4 to 3.

Cons:
- Forces the routing layer to be invoked on every single request, i.e., the minimum number of requests goes from 2 to 3. Whether this is a net negative or positive depends on the workload.
cc @cw75