Implement AtomicFAddEXT for the CUDA BE #2853
I think this is now implemented. It looks like @AGindinson did the meat of this work in 37a9a2a. Additionally, relevant libclc support went in in the following PRs. @AGindinson, is there anything missing? Perhaps we can close this?
@AlexeySachkov, could you please help with evaluating this one?
@AlexeySachkov @AGindinson, any updates on this?
Not really. Neither of us is directly working on CUDA, so this item is a lower priority for both of us. Feel free to pick it up. I'm also fine with closing it if we believe that everything is already implemented.
From a quick look into the headers I don't see any

> The `OpSizeOf` instruction was added in SPIR-V 1.1, but not supported yet. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9aeb7eb92d7c0cb
After 4fdbfae, there are preparations to switch the atomic `fetch_add`/`fetch_sub` FP implementations to using the new SPIR-V operand. Providing a "native" implementation in the CUDA BE would enable us to use the leveraged function for NVPTX targets as well (the `#if !defined(__NVPTX__)` macros would have to be removed to achieve this).