Acknowledge CUDA_DEVICE_QUERY during GPU selection #216

Merged
merged 4 commits into from
Jul 14, 2021

Conversation

denisalevi
Member

@denisalevi denisalevi commented Jul 14, 2021

Before, CUDA_VISIBLE_DEVICES had no effect on GPU selection in brian2cuda. This PR should fix that.

This requires reconciling two sets of IDs: the ones reported by nvidia-smi (which ignores CUDA_VISIBLE_DEVICES) and the ones defined by CUDA_VISIBLE_DEVICES (which e.g. deviceQuery takes into account).

Here is the updated doc for the gpu_id preference, which describes how it should work. What is still missing: with prefs....gpu_id = None and CUDA_VISIBLE_DEVICES set, CUDA_VISIBLE_DEVICES is currently still ignored.

    The ID of the GPU that should be used for code execution. Default value is
    `None`, in which case the GPU with the highest compute capability and lowest ID
    is used.

    If this preference is set, it has to be the ID reported by `nvidia-smi`, which
    ignores the environment variable `CUDA_VISIBLE_DEVICES`.

    If this preference isn't set, `CUDA_VISIBLE_DEVICES` is not ignored. E.g. with
    `CUDA_VISIBLE_DEVICES=1,2` only GPUs 1 and 2 will be considered during GPU
    detection.

We use `nvidia-smi -L` to detect all available GPUs. `nvidia-smi`
displays all GPUs independent of `CUDA_VISIBLE_DEVICES`, hence setting it
had no effect. Now, `CUDA_VISIBLE_DEVICES=1` will detect GPU 1 as GPU 0.
This is not implemented yet, but shows how it should work.
CUDA_VISIBLE_DEVICES will now take precedence over any other options. That means if
`gpu_id` is set in `prefs`, it will be chosen from the visible devices.
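
To illustrate the renumbering mentioned above (`CUDA_VISIBLE_DEVICES=1` makes GPU 1 show up as GPU 0), here is a minimal Python sketch. It is not the code from this PR; the function name `visible_device_map` is hypothetical. It assumes the variable contains only integer indices (no UUIDs) and that `nvidia-smi` and the CUDA runtime enumerate devices in the same order, which may require `CUDA_DEVICE_ORDER=PCI_BUS_ID`.

```python
import os


def visible_device_map(num_physical, env=None):
    """Map CUDA runtime device index -> index as reported by nvidia-smi.

    Hypothetical helper for illustration only. `num_physical` is the number
    of GPUs listed by nvidia-smi; `env` is the value of CUDA_VISIBLE_DEVICES
    (read from the environment if not given).
    """
    if env is None:
        env = os.environ.get("CUDA_VISIBLE_DEVICES")
    if env is None:
        # Variable not set: every GPU is visible and both IDs coincide.
        return {i: i for i in range(num_physical)}

    mapping = {}
    for token in env.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= num_physical:
            # Assumption: like the CUDA runtime, stop at the first invalid
            # entry; only the devices listed before it stay visible.
            break
        # The next free runtime index points at this physical GPU.
        mapping[len(mapping)] = int(token)
    return mapping


if __name__ == "__main__":
    print(visible_device_map(4, "1,2"))
    print(visible_device_map(3, "1"))
```

With this mapping, an ID passed to `cudaSetDevice` is a key of the dictionary, while the ID shown by `nvidia-smi` is the corresponding value.
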
@denisalevi
Member Author

I thought cudaSetDevice would ignore CUDA_VISIBLE_DEVICES. But it doesn't. Hence the most obvious and easiest solution here is to never ignore CUDA_VISIBLE_DEVICES. This is implemented now. The only place where this was failing was when detecting all available GPUs using nvidia-smi, which ignores CUDA_VISIBLE_DEVICES. Now, detection of all devices checks CUDA_VISIBLE_DEVICES as well.
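
The idea of making detection honor CUDA_VISIBLE_DEVICES can be sketched roughly as follows. This is not the brian2cuda implementation; the function names and the sample `nvidia-smi -L` output (GPU names and UUIDs) are made up for illustration, and only integer indices in CUDA_VISIBLE_DEVICES are handled.

```python
import re

# Made-up output in the format printed by `nvidia-smi -L`.
SAMPLE_OUTPUT = """\
GPU 0: Example GPU A (UUID: GPU-00000000-aaaa)
GPU 1: Example GPU B (UUID: GPU-00000000-bbbb)
GPU 2: Example GPU C (UUID: GPU-00000000-cccc)
"""

_GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (.+)\)$")


def parse_gpu_list(output):
    """Return {nvidia-smi ID: GPU name} parsed from `nvidia-smi -L` output."""
    gpus = {}
    for line in output.splitlines():
        match = _GPU_LINE.match(line.strip())
        if match:
            gpus[int(match.group(1))] = match.group(2)
    return gpus


def visible_gpus(output, visible_devices):
    """Return [(nvidia-smi ID, name)] for the GPUs that CUDA would see.

    `visible_devices` is the value of CUDA_VISIBLE_DEVICES, or None if the
    variable is unset. The position in the returned list corresponds to the
    device index seen by the CUDA runtime.
    """
    all_gpus = parse_gpu_list(output)
    if visible_devices is None:
        return sorted(all_gpus.items())

    selected = []
    for token in visible_devices.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) not in all_gpus:
            break  # assumption: entries after an invalid one are ignored
        selected.append((int(token), all_gpus[int(token)]))
    return selected


if __name__ == "__main__":
    # In real use the text would come from running `nvidia-smi -L`,
    # e.g. via subprocess.check_output(["nvidia-smi", "-L"], text=True).
    print(visible_gpus(SAMPLE_OUTPUT, "1"))
```
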

@denisalevi denisalevi merged commit ad6b820 into master Jul 14, 2021
@denisalevi denisalevi deleted the fix-gpu-detection branch July 14, 2021 16:45