Introduce RDMA transport #11182
base: unstable
Conversation
Force-pushed from 9da176c to fafff83
Hi Redis committee members @yossigo @oranagra. I'm a Ph.D. student at Stanford focusing on cloud computing. In my recent research project, I found this RDMA implementation for Redis to be both impressively organized and performant.
Sorry for the lack of response here, @pizhenwei, my sincere apologies. This PR is still marked in my followup list, but I never got to look at it.
Hi @oranagra
Either is fine to me; I volunteer to contribute/maintain this driver.
Hi @xiezhq-hermann
I understand that this part must be official in order for libraries to be able to adapt to it, but I don't think I have the time or knowledge to make any informed decision about it.
I agree there should be a formal specification of Redis over RDMA, but I also don't have the knowledge necessary to approach this. @pizhenwei Do you have a rough outline of how such a spec should look? For example, are there major decisions we'd need to make first? Are there other notable examples of similar protocols that have an RDMA-based spec?
Let me enumerate several RDMA-based specs here (as far as I know):
In our proposal, we (with my colleagues @zhuojiang123 @zhangyiming1201 from the RDMA team and our friend @sigempty from Duke University) introduce a ring-buffer-like mechanism for Redis (similar to the papers above, but not the same). I created a new PR to introduce the protocol only. Cc @xiezhq-hermann @Spartee, suggestions/feedback are welcome!
As a key technology for high-speed networks, RDMA has been deployed at large scale within ByteDance and is used in multiple services. We have been following the significant performance improvement RDMA brings to Redis, and we are willing to provide more supplementary information about RDMA if necessary. We will work with @pizhenwei to keep improving the Redis over RDMA protocol, and we look forward to its final realization. We believe it will play a meaningful role in various related applications.
Force-pushed from 49c0d06 to 5966253
I am wondering whether performance is the only criterion for evaluation.
Yes, the performance is NOT good enough. But I'm not against SMC-R for the Redis socket. However, this is not the entire reason. Generally, the closer the protocol gets to the upper layer, the higher the performance the upper layer can reach.
Hi @pizhenwei I work with the OFA on maintaining the FSDP cluster, and we were looking to get this patch tested on the appropriate hardware. When working with this patch, it still applies on the unstable branch, but it seems like it uses the variable
Main changes in this patch:

* introduce the *Redis Over RDMA* protocol, see the *Protocol* section in RDMA.md
* implement the server side of the connection module only; this means we can *NOT* compile RDMA support as built-in
* add necessary information in RDMA.md
* support 'CONFIG SET/GET', for example, 'CONFIG SET rdma.port 6380', then check this by 'rdma res show cm_id' and redis-cli (with RDMA support, but not implemented in this patch)
* the full listeners show like:

```
listener0:name=tcp,bind=*,bind=-::*,port=6379
listener1:name=unix,bind=/var/run/redis.sock
listener2:name=rdma,bind=xx.xx.xx.xx,bind=yy.yy.yy.yy,port=6379
listener3:name=tls,bind=*,bind=-::*,port=16379
```

valgrind test works fine:

```
valgrind --track-origins=yes --suppressions=./src/valgrind.sup \
    --show-reachable=no --show-possibly-lost=no --leak-check=full \
    --log-file=err.txt \
    ./src/redis-server --port 6379 \
    --loadmodule src/redis-rdma.so port=6379 bind=xx.xx.xx.xx \
    --loglevel verbose --protected-mode no --server_cpulist 2 \
    --bio_cpulist 3 --aof_rewrite_cpulist 3 --bgsave_cpulist 3 \
    --appendonly no
```

performance test, server side (the TCP port 6379 has no conflict with RDMA port 6379):

```
./src/redis-server --port 6379 \
    --loadmodule src/redis-rdma.so port=6379 bind=xx.xx.xx.xx bind=yy.yy.yy.yy \
    --loglevel verbose --protected-mode no --server_cpulist 2 \
    --bio_cpulist 3 --aof_rewrite_cpulist 3 --bgsave_cpulist 3 \
    --appendonly no
```

build a redis-benchmark with RDMA support (not implemented in this patch), run on x86 (Intel Platinum 8260) with a RoCEv2 interface (Mellanox ConnectX-5), client side:

```
./src/redis-benchmark -h xx.xx.xx.xx -p 6379 -c 30 -n 10000000 \
    --threads 4 -d 1024 -t ping,get,set --rdma
```

```
====== PING_INLINE ======
480561.28 requests per second, 0.060 msec avg latency.

====== PING_MBULK ======
540482.06 requests per second, 0.053 msec avg latency.

====== SET ======
399952.00 requests per second, 0.073 msec avg latency.

====== GET ======
443498.31 requests per second, 0.065 msec avg latency.
```
Signed-off-by: zhenwei pi <[email protected]>
Hi @JSpewock I also tested this patch; it applies on top of commit c3f8b54 (Manage number of new connections per cycle (#12178)). So if you want to apply this patch against a released version, I guess these commands would work successfully:
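The command list this comment refers to did not survive the capture. A plausible sketch, assuming the standard GitHub pull-request ref convention (`pull/<N>/head`); the local branch name `rdma-transport` is made up for illustration:

```shell
# Fetch the PR head into a local branch. PR number 11182 is this PR;
# the branch name "rdma-transport" is arbitrary.
git clone https://github.com/redis/redis.git
cd redis
git fetch origin pull/11182/head:rdma-transport
git checkout rdma-transport

# The patch applies on top of commit c3f8b54 ("Manage number of new
# connections per cycle (#12178)"), so verify your base already
# contains that commit before rebasing onto a released version.
git merge-base --is-ancestor c3f8b54 HEAD && echo "base contains c3f8b54"
```

This is a sketch of one possible workflow, not necessarily the exact commands the commenter had in mind.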
Any suggestion & feedback is welcome! Have fun!
Hi @pizhenwei Out of curiosity, how have you been testing the server when you update the PR? When do a
Hi, I only test this on RoCE/RXE (soft RoCE of Linux); fixes for other platforms are welcome!
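For anyone who wants to reproduce the RXE (soft RoCE) environment mentioned above without RDMA hardware, here is a minimal setup sketch using the iproute2 `rdma` tool; the interface name `eth0` is an assumption, substitute your own:

```shell
# Load the Linux soft-RoCE driver (requires a kernel built with rdma_rxe).
sudo modprobe rdma_rxe

# Attach a software RDMA device to an existing Ethernet interface.
# "eth0" is an assumed interface name -- replace it with yours.
sudo rdma link add rxe0 type rxe netdev eth0

# Confirm the soft-RoCE device is visible.
rdma link show
```

With `rxe0` in place, the server and benchmark commands from this PR can be exercised on ordinary Ethernet, at reduced performance compared to real RoCE hardware.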
Many cloud providers offer RDMA acceleration on their cloud platforms, and I think there is a foundational basis for applying Redis over RDMA. We ran some performance tests of this PR on 8th-generation ECS instances (g8ae.2xlarge, 16 vCPUs, 32G DDR) provided by Alibaba Cloud. The results indicate that, compared to TCP sockets, using RDMA can significantly enhance performance. Test command of server side:
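The server-side command itself was not preserved in this capture; a hypothetical reconstruction, modeled on the server invocation posted earlier in this PR (the bind address is a placeholder):

```shell
# Hypothetical reconstruction -- the exact Alibaba Cloud test command
# was not preserved. Modeled on the server command shown earlier in
# this PR; xx.xx.xx.xx stands in for the instance's RDMA-capable IP.
./src/redis-server --port 6379 \
    --loadmodule src/redis-rdma.so port=6379 bind=xx.xx.xx.xx \
    --loglevel verbose --protected-mode no --appendonly no
```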
Test command of client side:
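The client-side command was likewise lost; a hypothetical reconstruction, modeled on the redis-benchmark invocation posted earlier in this PR (connection count, request count, and payload size are assumptions):

```shell
# Hypothetical reconstruction of the client-side benchmark; modeled on
# the redis-benchmark invocation shown earlier in this PR. The --rdma
# flag comes from the RDMA-enabled redis-benchmark build this PR
# discusses, which is not part of stock redis-benchmark.
./src/redis-benchmark -h xx.xx.xx.xx -p 6379 -c 30 -n 10000000 \
    --threads 4 -d 1024 -t ping,get,set --rdma
```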
The performance test results are shown in the following table. Throughput can be increased by at least 74%, and the average (AVG) and P99 latencies can be reduced by at least 33%.
Besides, I saw the comment from @yossigo that RDMA networks are not easily accessible. If necessary, I could try reaching out to relevant colleagues to see if we can offer some Alibaba Cloud ECS instances to the community for free, so that you can use and test Redis over RDMA, as well as for future CI/CD purposes.
Hi @hz-cheng I'm happy to see feedback from the cloud-vendor side; this means lots of end users will enjoy the improvement easily.
MR & Issues in history:
ISSUE Support RDMA as transport layer protocol
MR Support RDMA as transport layer protocol
MR Support RDMA as transport layer protocol by rsocket
MR Fully abstract connection and make TLS dynamically loadable