Describe the bug
This is an isolated CUDA.jl broadcast example that fails when the two arrays have different memory types (plain device memory vs. unified memory).
To reproduce
The Minimal Working Example (MWE) for this bug:
```julia
using CUDA
CUDA.rand(Float32, 30, 30) .+ cu(rand(Float32, 30, 30, 1), unified=true)
```
Also initially posted at FluxML/NNlib.jl#568, as I had misunderstood how dispatch behaves for `.+` when the dimensions are the same. Tested with CUDA 5.2.0.
Expected behavior
Broadcasting should produce an array, as it does when both operands are CPU arrays:
```julia
rand(Float32, 30, 30) .+ rand(Float32, 30, 30, 1)
```
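For reference, here is a minimal CPU-only sketch of the broadcast semantics the GPU case should match: Julia pads the lower-dimensional operand with trailing singleton dimensions, so a `(30, 30)` array broadcasts against a `(30, 30, 1)` array and yields a `(30, 30, 1)` result. (The variable names here are illustrative, not from the original report.)

```julia
# CPU broadcast between arrays of different dimensionality.
a = rand(Float32, 30, 30)     # 2-D operand
b = rand(Float32, 30, 30, 1)  # 3-D operand with a trailing singleton dimension
c = a .+ b                    # broadcast succeeds on the CPU

size(c)  # (30, 30, 1): result takes the larger number of dimensions
```

The GPU broadcast in the MWE above should follow the same shape rules; the failure only appears when one operand is a unified-memory `CuArray`.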
Version info
Details on Julia:
```
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × 13th Gen Intel(R) Core(TM) i9-13980HX
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, goldmont)
Threads: 1 default, 0 interactive, 1 GC (on 32 virtual cores)
```
Details on CUDA:
```
CUDA runtime 12.3, artifact installation
CUDA driver 12.0
NVIDIA driver 525.147.5

CUDA libraries:
- CUBLAS: 12.3.4
- CURAND: 10.3.4
- CUFFT: 11.0.12
- CUSOLVER: 11.5.4
- CUSPARSE: 12.2.0
- CUPTI: 21.0.0
- NVML: 12.0.0+525.147.5

Julia packages:
- CUDA: 5.2.0
- CUDA_Driver_jll: 0.7.0+1
- CUDA_Runtime_jll: 0.11.1+0

Toolchain:
- Julia: 1.10.2
- LLVM: 15.0.7

1 device:
  0: NVIDIA GeForce RTX 4090 Laptop GPU (sm_89, 14.758 GiB / 15.992 GiB available)
```