Running CUDA samples with multiple GPUs fails #7852
Comments
@yes89929 your CUDA apps are not seeing the GPU. What is the output of This is sample
|
Thanks for the reply.
|
@yes89929 you're using multiple GPUs, therefore it might help troubleshooting by isolating a specific GPU. Please try Also |
@elsaco
|
Same issue here |
I am also having this issue. I'm glad I found this thread, I thought I was going crazy. I'm using 4x A6000, WSL Ubuntu 20.04, CUDA 11.7.1. Any combination of |
Same issue here. I'm using 4x 2080 Ti, WSL Ubuntu 20.04, kernel version 5.10.102.1-microsoft-standard-WSL2, CUDA 11.3. Combining devices 0 and 1 results in a "cudaGetDeviceCount returned 2 -> out of memory" failure. |
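The error quoted above comes from the runtime call `cudaGetDeviceCount`. As a rough way to check device enumeration without building the samples, here is a minimal sketch that calls the driver API (`cuInit` and `cuDeviceGetCount`) through `ctypes`. It uses the driver API rather than the runtime API that the samples use, so results may not match exactly. The function name `probe_device_count` and the default library name `libcuda.so.1` are assumptions for illustration; the library may be installed under a different name or path on your system.

```python
import ctypes


def probe_device_count(lib_name="libcuda.so.1"):
    """Return (status, value) describing what the CUDA driver reports.

    lib_name is an assumption; adjust it if the driver library
    is installed under a different name or path.
    """
    try:
        cuda = ctypes.CDLL(lib_name)
    except OSError as exc:
        # The driver library could not be loaded at all.
        return ("load_failed", str(exc))

    # cuInit must be called before any other driver API function.
    rc = cuda.cuInit(0)
    if rc != 0:
        return ("cuInit_failed", rc)

    count = ctypes.c_int(0)
    rc = cuda.cuDeviceGetCount(ctypes.byref(count))
    if rc != 0:
        return ("cuDeviceGetCount_failed", rc)

    return ("ok", count.value)
```

Running this under different `CUDA_VISIBLE_DEVICES` values shows whether initialization fails for a given device set. In the driver API, a return code of 2 is `CUDA_ERROR_OUT_OF_MEMORY`.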
Same issue here: 4x A10, WSL Ubuntu 20.04, Linux version 5.10.16.3-microsoft-standard-WSL2 (oe-user@oe-host) (x86_64-msft-linux-gcc (GCC) 9.3.0, CUDA 12.0. The "cudaGetDeviceCount returned 2 -> out of memory" failure occurs when I set |
I have the same issue with 4x 3090s on Windows 11 and Ubuntu 22.04.2, CUDA 12.2. I can run with CUDA_VISIBLE_DEVICES=0,2,3, but if I include device 1, I get an out of memory exception. Linux version: 5.15.90.1-microsoft-standard-WSL2 |
Running the latest WSL2, Ubuntu 22.04, CUDA 12.2, with 4x RTX 6000 Ada. Similar to the above, any combination of 1 and 3 fails. |
My kernel: Linux AICADS 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux. Devices 0,1,2 succeed, but 0,1,2,3 fail. |
Oh my god! Dude, you solved a long problem with not being able to use GPUs in docker! Thank you so much! After I isolate a GPU, I can use GPU in docker. Why is that??? What is the reason? |
I am having this issue now, almost exactly as described. Any combination of devices that doesn't include both 1 and 2 works. |
I had the same issue with a multi-GPU setup in WSL. I guess the cause was NVLINK. I set the SLI configuration in the NVIDIA Control Panel (in Windows) to 'Maximize 3D performance'. Now it finally works! Hope this helps someone; this thread helped me figure it out. |
Same problem here: 4 GPUs on WSL2 cannot work together. |
Same problem: 4x 2080 Ti, the latest Windows 11, the latest WSL2 version, the latest Windows-side NVIDIA driver, and correctly installed cudatoolkit and cuDNN. A similar problem still occurs: the sample tests pass on Windows, but fail in both WSL2 and Docker. |
Same issue: 3x 3090 fail to work together, with the same error. "0,2" and "0,1,2" are fine. "0,1" is not fine. How does this make sense? |
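Since several reports above depend on exactly which device indices are visible together, it may help to test every combination systematically instead of by hand. The following is a minimal sketch, assuming the test command (for example a built CUDA sample binary) exits with a non-zero status when it fails. The helper names `device_subsets`, `run_with_devices`, and `sweep` are hypothetical and not part of any CUDA tooling.

```python
import itertools
import os
import subprocess


def device_subsets(num_gpus):
    """Yield every non-empty subset of GPU indices as a CUDA_VISIBLE_DEVICES string."""
    for size in range(1, num_gpus + 1):
        for combo in itertools.combinations(range(num_gpus), size):
            yield ",".join(str(i) for i in combo)


def run_with_devices(visible, command):
    """Run command with CUDA_VISIBLE_DEVICES set to visible; return its exit code."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=visible)
    result = subprocess.run(
        command,
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def sweep(num_gpus, command):
    """Map each device subset to the exit code of command (0 usually means success)."""
    return {
        visible: run_with_devices(visible, command)
        for visible in device_subsets(num_gpus)
    }
```

For example, `sweep(4, ["./deviceQuery"])` would return a dictionary from strings such as `"0,1"` to exit codes. The `./deviceQuery` path is an assumption; substitute whatever sample binary you have built. Listing the failing keys makes it easier to see which device pairs trigger the error.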
Setting the SLI configuration in the NVIDIA Control Panel (in Windows) to 'Maximize 3D performance' worked for me! However, it only partially solves the problem, because the problem with monitoring memory usage in nvidia-smi remains. |
Version
Microsoft Windows [Version 10.0.22000.376]
WSL Version
Kernel Version
5.10.60.1
Distro Version
Ubuntu 20.04 and Ubuntu 18.04
Other Software
CPU: Intel(R) Core(TM) i9-9900X
GPU: Nvidia Titan RTX * 4 (driver 510.06)
RAM: 128GB
Repro Steps
Install CUDA on WSL
Run samples
Expected Behavior
Return success
Actual Behavior
Diagnostic Logs
No response