In-Place Reduction for NCCL #259

Open
cat-state opened this issue Jun 13, 2024 · 1 comment
Comments

@cat-state

NCCL supports in-place all-reduce. However, Comm::all_reduce takes a &CudaSlice to read from and a &mut CudaSlice to write into, so the same buffer can't be passed as both input and output, which rules out in-place reduction.

@coreylowman
Owner

Ah yeah, I see that (NCCL docs: https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/api/colls.html#c.ncclReduce)

I think in this case, due to Rust's borrow rules, it'd probably be easiest to just add Comm::all_reduce_in_place that takes a &mut CudaSlice. Fairly easy to add if anyone wants to contribute a PR for this! Otherwise I can add it later this week.
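
For illustration, here is a minimal CPU-only sketch of the shape such a method could take. Everything in it is an assumption made for the example: the `Comm` struct is a hypothetical stand-in rather than the crate's real type, plain `f32` slices replace CudaSlice, and `raw_all_reduce_sum` is a made-up substitute for the underlying NCCL call that pretends every rank contributes identical data. It only shows the borrow-friendly idea of deriving both the send and receive pointers from a single `&mut` borrow.

```rust
// Hypothetical stand-in for a NCCL communicator; not the crate's actual type.
struct Comm {
    world_size: usize,
}

// Made-up CPU substitute for the raw collective. Assumption: every rank holds
// the same data, so a sum across ranks equals value * world_size.
// Like an in-place collective, it tolerates `send` and `recv` being the same pointer.
unsafe fn raw_all_reduce_sum(send: *const f32, recv: *mut f32, count: usize, world_size: usize) {
    for i in 0..count {
        unsafe {
            // Read element i before writing it, so aliasing is harmless.
            let v = *send.add(i);
            *recv.add(i) = v * world_size as f32;
        }
    }
}

impl Comm {
    // Out-of-place shape, mirroring the two-buffer signature described in the issue.
    fn all_reduce(&self, send: &[f32], recv: &mut [f32]) {
        assert_eq!(send.len(), recv.len());
        unsafe {
            raw_all_reduce_sum(send.as_ptr(), recv.as_mut_ptr(), send.len(), self.world_size)
        }
    }

    // In-place shape: one exclusive borrow, and the same pointer is used
    // as both the send buffer and the receive buffer.
    fn all_reduce_in_place(&self, buf: &mut [f32]) {
        let ptr = buf.as_mut_ptr();
        unsafe { raw_all_reduce_sum(ptr as *const f32, ptr, buf.len(), self.world_size) }
    }
}
```

With the two-buffer signature, a call like `comm.all_reduce(&buf, &mut buf)` is rejected by the borrow checker because `buf` would be borrowed immutably and mutably at once, whereas `comm.all_reduce_in_place(&mut buf)` needs only the single mutable borrow.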
