Build Your Own Redis: Concurrent Clients [3/4]
This is the third article in a series where we’ll build a toy Redis clone in Ruby. If you’d like to code-along, try the Build your own Redis challenge!
Previous article: Ping <-> Pong
Next article: ECHO
Sections in this article:
- Bug: Multiple commands from the same client
- Using threads to serve multiple clients
- Event Loops: an analogy
- Blocking vs. Non-blocking calls
- IO#Select
- Implementing the event loop
Bug: Multiple commands from the same client
In the previous article, we worked on replying to the PING
command.
Here’s what the code for our server looked like:
require "socket"
class RedisServer
def initialize(port)
@server = TCPServer.new(port)
end
def listen
loop do
client = @server.accept
# TODO: Handle commands other than PING
client.write("+PONG\r\n")
end
end
end
Re-read the code above, and try to answer this: What’ll happen when a client
sends their second PING
command?
Here’s an integrated test for that question:
require "redis" require "minitest/autorun" # 6379 for official redis, 6380 for ours SERVER_PORT = ENV["SERVER_PORT"] class TestRedisServer < Minitest::Test def test_responds_to_ping r = Redis.new(port: SERVER_PORT) assert_equal "PONG", r.ping end+ + def test_multiple_commands_from_same_client+ r = Redis.new(port: SERVER_PORT)+ + # The Redis client re-connects on timeout by default, without_reconnect+ # prevents that.+ r.without_reconnect do+ assert_equal "PONG", r.ping+ assert_equal "PONG", r.ping+ end+ end end
Ready? (If you haven’t taken time to think about the answer, do so now!)
Here’re the test results:
➜ SERVER_PORT=6380 ruby _files/redis/integrated_test_2.rb
Run options: --seed 32401
# Running:
E.
1) Error:
TestRedisServer#test_multiple_commands_from_same_client:
Redis::TimeoutError: Connection timed out
/home ... /ruby.rb:71:in `rescue in _read_from_socket'
...
_files/redis/integrated_test_2.rb:18:in `test_multiple_commands_from_same_client'
2 runs, 2 assertions, 0 failures, 1 errors, 0 skips
The client doesn’t receive the second reply!
This happens because after the first @server.accept
call, we send PONG
and
then forget about the client! When the client sends their next PING
command, there’s no one listening on the other end.
Using threads to serve multiple clients
Let’s try another approach where we don’t abandon the client after sending
PONG
. Instead, we’ll keep listening to the client for further commands.
require "socket" class RedisServer def initialize(port) @server = TCPServer.new(port) end def listen loop do client = @server.accept+ handle_client(client)+ end+ end+ + def handle_client(client)+ loop do+ client.gets+ # TODO: Handle commands other than PING client.write("+PONG\r\n") end end end
This satisfies the use case of a single client that wants to send multiple commands.
There’s still a huge problem here though. When we’re busy servicing one client,
we never end up calling @server.accept
, so we don’t accept any new clients.
At a given point in time, we’ve got two things we need to be doing:
- Accept new clients
- Wait on commands from existing clients
Using multiple threads would solve this. Every time we receive a new client, we’ll spawn a new thread that’ll keep on listening to commands from the client until it disconnects.
Let’s quickly write up a test for this:
require "redis" require "minitest/autorun" # 6379 for official redis, 6380 for ours SERVER_PORT = ENV["SERVER_PORT"] class TestRedisServer < Minitest::Test def test_responds_to_ping r = Redis.new(port: SERVER_PORT) assert_equal "PONG", r.ping end def test_multiple_commands_from_same_client r = Redis.new(port: SERVER_PORT) # The Redis client re-connects on timeout by default, without_reconnect # prevents that. r.without_reconnect do assert_equal "PONG", r.ping assert_equal "PONG", r.ping end end+ + def test_multiple_clients+ r1 = Redis.new(port: SERVER_PORT)+ r2 = Redis.new(port: SERVER_PORT)+ + assert_equal "PONG", r1.ping+ assert_equal "PONG", r2.ping+ end end
Followed by the implementation that uses threads:
require "socket" class RedisServer def initialize(port) @server = TCPServer.new(port) end def listen loop do client = @server.accept handle_client(client) end end def handle_client(client)- loop do- client.gets+ Thread.new do+ loop do+ client.gets - # TODO: Handle commands other than PING- client.write("+PONG\r\n")+ # TODO: Handle commands other than PING+ client.write("+PONG\r\n")+ end end end end
This implementation satisfies all our tests:
➜ SERVER_PORT=6380 ruby _files/redis/integrated_test_3.rb
Run options: --seed 21893
# Running:
...
3 runs, 5 assertions, 0 failures, 0 errors, 0 skips
Event Loops: an analogy
Unlike our threaded implementation, the official Redis implementation serves multiple clients using a single thread, not multiple.1
It achieves this by using an event loop.
Think of your program as a bartender who has to serve multiple clients. When you ask your operating system to start up another thread, you’re essentially asking it to create another bartender. As long as you have as many bartenders as clients, every client will be served.
There’s a certain amount of overhead involved here. Each thread (i.e. bartender) isn’t cheap to spawn, and more threads leads to more time spent switching between threads.
What if, instead of reserving one bartender per client, we tried to serve all clients with just one bartender? This should work, as long as the bartender manages to accept new customers and to serve existing ones in time. That’s what an event loop does.
Your program reacts to multiple events in a loop - each action tends to be so quick that you give away the ‘illusion’ of serving multiple clients at a time. What you really did though, is split up the work into chunks and executed them one-by-one.
Going back to the bartender analogy: Let’s say we decide to go with the single bartender approach. What happens if the bartender ends up chatting with a single customer for a long period of time?
Other customers won’t be serviced, and new customers won’t be accepted either. For this to run well, it is imperative that the bartender doesn’t spend too much time servicing a single customer.
This brings us to the cardinal rule of event loops: never block the event loop.
To run on an event loop, we’ll have to make sure that our program is capable of servicing each event quickly.
Blocking vs. Non-blocking calls
Before we get into implementing an event loop, let’s take a while to understand blocking vs. non-blocking calls.
➜ rohitpaulk.com git:(master) ✗ irb
irb(main):001:0> q = Queue.new
=> #<Thread::Queue:0x000055db24365740>
irb(main):002:0> q.push(1)
=> #<Thread::Queue:0x000055db24365740>
irb(main):003:0> q.pop
=> 1
irb(main):004:0> q.pop
.
..
... Blocked, Indefinitely!
A blocking call is one that only returns until the result is ready. The amount of time spent here can be indefinite, although in most cases it is likely that a timeout is configured.
If a blocking call ends up in our event loop, we’ll end up breaking the ‘never block the event loop’ rule!
Now that we know that blocking calls are deadly, let’s re-visit our server code:
require "socket" class RedisServer def initialize(port) @server = TCPServer.new(port) end def listen loop do+ # DANGER: blocks until a new client is ready to connect! client = @server.accept handle_client(client) end end def handle_client(client) Thread.new do loop do+ # DANGER: blocks until a newline character is read! client.gets # TODO: Handle commands other than PING client.write("+PONG\r\n") end end end end
Here are the calls we’ve used so far that are blocking:
TCPServer#accept
- blocks until a new client is ready to connect
IO#gets
- blocks until a newline character is read
For each of these blocking calls, we’ll need to do one of the following:
- Replace it with a non-blocking variant (like a method that returns
nil
when the result isn’t ready) - Only make the blocking call if we know (through some other mechanism) that the result will be ready (and hence the call won’t have to block)
IO#Select
Ruby’s IO#select
(backed by the
select
syscall) can help
in reducing these two calls into non-blocking versions.
IO#select
takes in a list of file descriptors, and blocks until one or more of
them is ready to be read from. It then returns the subset of file descriptors
that are ready to be read from. Thanks to Unix’s “everything is a file”
architecture, we can use this mechanism to wait on both a new client wanting to
connect and an established socket that has new data to be read from.
# This call blocks until one or more of `fds_to_watch` is ready to read from
ready_to_read, _, _ = IO.select(fds_to_watch, _, _)
# Once the call returns, `ready_to_read` will contain a subset of `fds_to_watch`
Let’s see this in context of our TCP server:
serv = TCPServer.new(6380)
client1 = serv.accept
client2 = serv.accept
fds_to_watch = [serv, client1, client2]
# This blocks till either:
#
# - `serv` has another client ready to connect
# - `client1` has data ready to be read
# - `client2` has data ready to be read
ready_to_read, _, _ = IO.select(fds_to_watch, _, _)
ready_to_read.each do { |ready|
if ready == serv
# if the server is ready to read from,
# that means that a new client is ready
# to connect.
else
# If not the server, this must be either
# `client1` or `client2`
end
}
Implementing the event loop
First, let’s look at the events we want to listen on:
- A new client is ready to connect
- An existing client has new information to send
Whenever one of these events happen, we want to perform an action, which should be non-blocking.
require "socket" class RedisServer def initialize(port) @server = TCPServer.new(port)+ @clients = [] end def listen loop do- client = @server.accept- handle_client(client)+ fds_to_watch = [@server, *@clients]+ ready_to_read, _, _ = IO.select(fds_to_watch)+ ready_to_read.each do |ready|+ if ready == @server+ @clients << @server.accept+ else+ # If not the server, this must be one of the existing clients+ handle_client(client)+ end+ end end end def handle_client(client)- Thread.new do- loop do- client.gets+ client.readpartial(1024) # TODO: Read actual command - # TODO: Handle commands other than PING- client.write("+PONG\r\n")- end- end+ # TODO: Handle commands other than PING+ client.write("+PONG\r\n") end end
We’ve kept server.accept
as-is, but we only call it when we know it is ready
to read - so that shouldn’t result in a non-blocking call.
We’ve replaced
IO.gets
with
IO.readpartial
,
which is a non-blocking variant. We’re still not parsing input though - we’ll
re-visit this in the next article, when we actually decode Redis commands.
Running our tests to make sure they pass against this implementation…
➜ rohitpaulk.com git:(master) ✗ SERVER_PORT=6380 ruby _files/redis/integrated_test_3.rb
Run options: --seed 21893
# Running:
...
Finished in 0.006090s, 492.6094 runs/s, 821.0156 assertions/s.
3 runs, 5 assertions, 0 failures, 0 errors, 0 skips
And they do!
In the next article, we’ll delve into parsing RESP and
implement the ECHO
command.
-
Not entirely accurate, Redis does use threads for some background tasks ↩