Question about ib_write_lat with CUDA #231

Open
yzygitzh opened this issue Nov 27, 2023 · 2 comments

Comments

@yzygitzh

I've found that ib_write_lat doesn't support CUDA mode.
I wonder whether there is any intrinsic issue that prevents supporting it.
I don't think it is a CUDA limitation, because the NCCL library uses IB write with GPUs.
If there isn't a big obstacle, I can help draft a PR to fix this.

@elevenxiang

> I've found that ib_write_lat doesn't support CUDA mode. Wonder whether there is any intrinsic issue that prevents supporting this? I think it should not be CUDA issue because NCCL library is using IB write with GPU. If there isn't a big obstacle, I can help draft a PR to fix this.

Can you share your PR link?
I removed the error exit and tried running it on an A100, but it crashes.
gdb showed that the memory involved is not host memory, so it could be a CUDA memory issue.

Thanks
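
A possible explanation for the crash, offered as an assumption rather than a confirmed diagnosis: write-latency ping-pong tests commonly detect an incoming RDMA write by having the CPU spin on the last byte of the receive buffer, since an RDMA write generates no completion on the receiving side. If that buffer were device-only memory (for example from `cudaMalloc`), the CPU load itself would fault, which would be consistent with gdb pointing at non-host memory. The sketch below simulates that polling pattern with host memory only. `fake_remote_write`, `wait_for_write`, and `run_one_iteration` are hypothetical names, and a thread stands in for the NIC.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Arguments for the thread that stands in for the remote NIC (hypothetical). */
struct writer_args {
    _Atomic unsigned char *poll_byte; /* last byte of the receive buffer */
    unsigned char value;              /* iteration counter to deliver */
};

/* Simulates the arrival of an RDMA write by storing the counter. */
static void *fake_remote_write(void *p)
{
    struct writer_args *a = p;
    atomic_store_explicit(a->poll_byte, a->value, memory_order_release);
    return NULL;
}

/* Spins until the polled byte equals `expected`.
 * The load here is a plain CPU read of the buffer: it works for host
 * memory, but would fault if poll_byte pointed at device-only memory. */
static void wait_for_write(_Atomic unsigned char *poll_byte,
                           unsigned char expected)
{
    while (atomic_load_explicit(poll_byte, memory_order_acquire) != expected)
        ; /* busy-wait instead of sleeping, to keep latency low */
}

/* Runs one simulated iteration. Returns 0 on success, -1 on bad arguments
 * or on allocation/thread failure. `counter` must be nonzero because the
 * buffer starts zeroed. */
int run_one_iteration(size_t size, unsigned char counter)
{
    if (size == 0 || counter == 0)
        return -1;

    _Atomic unsigned char *buf = calloc(size, sizeof *buf);
    if (buf == NULL)
        return -1;

    _Atomic unsigned char *poll_byte = &buf[size - 1];
    atomic_init(poll_byte, 0);

    struct writer_args args = { poll_byte, counter };
    pthread_t writer;
    if (pthread_create(&writer, NULL, fake_remote_write, &args) != 0) {
        free((void *)buf);
        return -1;
    }

    wait_for_write(poll_byte, counter); /* CPU dereferences the buffer here */
    pthread_join(writer, NULL);

    int ok = atomic_load(poll_byte) == counter;
    free((void *)buf);
    return ok ? 0 : -1;
}
```

Calling `run_one_iteration(64, 1)` returns 0 once the poller observes the value. The load inside `wait_for_write` is the step that needs CPU-accessible memory, so a CUDA variant would presumably need either a host-visible mapping of the buffer or a different way to detect arrival.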

@yzygitzh

Hi, sorry for the confusion. I meant that I don't know what the key obstacle to supporting write latency with CUDA is, either.
