forked from ray-project/ray
Commit
[core][experimental] Accelerated DAG NCCL-based p2p channels for torch.Tensors (ray-project#45092)

## Why are these changes needed?

This adds a NCCL-based transport option for torch.tensors. Here is an example of the API:

```python
with InputNode() as inp:
    dag = sender.send.bind(inp)
    dag = dag.with_type_hint(TorchTensorType(SHAPE, DTYPE, transport="nccl"))
    dag = receiver.recv.bind(dag)

compiled_dag = dag.experimental_compile()
```

When `transport="nccl"` is specified, upon compile(), Ray will initialize a NCCL group with the actors involved. The reading actor(s) will `recv` on the NCCL communicator instead of reading from the default shared-memory channel.

This PR also refactors channel types so that we now create `ChannelInterfaces` based on the type hints that appear in the DAG, either a `TorchTensorType` or the default `SharedMemoryType`.

Current limitations:
- p2p only, no collectives
- Synchronizes CUDA stream after receiving data. This is because kernels following the NCCL op have no guarantee that the op succeeded, so it is not safe to read the received buffer unless we know that the op succeeded.
- Shape and dtype of the tensor must be declared at compile time.

---------

Signed-off-by: Stephanie Wang <[email protected]>
Co-authored-by: SangBin Cho <[email protected]>
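The type-hint-driven channel selection described in the commit message can be illustrated with a small, self-contained sketch. This is not Ray's implementation: `SharedMemoryHint`, `TensorHint`, `FakeTensor`, `ListChannel`, `FakeNcclChannel`, and `create_channel` are all hypothetical names invented for this example, and the "NCCL" channel is only an in-process buffer. It shows two ideas from the description, assuming a simplified model: the channel class is picked from the type hint, and the tensor channel rejects values whose shape or dtype differ from what was declared up front.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class SharedMemoryHint:
    """Hypothetical stand-in for the default shared-memory type hint."""


@dataclass(frozen=True)
class TensorHint:
    """Hypothetical stand-in for a tensor hint with a declared shape and dtype."""
    shape: Tuple[int, ...]
    dtype: str
    transport: str = "shm"  # "nccl" selects the toy p2p channel below


@dataclass(frozen=True)
class FakeTensor:
    """Toy value carrying only the metadata this sketch checks."""
    shape: Tuple[int, ...]
    dtype: str


class ListChannel:
    """Toy FIFO channel standing in for the default shared-memory transport."""
    kind = "shared_memory"

    def __init__(self) -> None:
        self._buf: List[Any] = []

    def write(self, value: Any) -> None:
        self._buf.append(value)

    def read(self) -> Any:
        return self._buf.pop(0)


class FakeNcclChannel(ListChannel):
    """Toy channel that enforces the shape and dtype declared at creation."""
    kind = "nccl"

    def __init__(self, shape: Tuple[int, ...], dtype: str) -> None:
        super().__init__()
        self._shape, self._dtype = shape, dtype

    def write(self, value: FakeTensor) -> None:
        # Assumption of this sketch: a mismatch is an error rather than a resize.
        if value.shape != self._shape or value.dtype != self._dtype:
            raise ValueError(
                f"expected {self._shape}/{self._dtype}, "
                f"got {value.shape}/{value.dtype}"
            )
        super().write(value)


def create_channel(hint: Any) -> ListChannel:
    """Pick a channel implementation from the type hint (hypothetical helper)."""
    if isinstance(hint, TensorHint) and hint.transport == "nccl":
        return FakeNcclChannel(hint.shape, hint.dtype)
    return ListChannel()


# Usage: the hint, not the caller, decides which channel is built.
nccl_like = create_channel(TensorHint((2, 3), "float16", transport="nccl"))
nccl_like.write(FakeTensor((2, 3), "float16"))
received = nccl_like.read()
default = create_channel(SharedMemoryHint())
```

Dispatching on the hint keeps the DAG-building code independent of the transport, which is the design choice the refactor in this commit describes; a real channel would additionally need the CUDA stream synchronization noted under the limitations.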
1 parent 094748e, commit 79f3995
Showing 19 changed files with 1,563 additions and 420 deletions.