Releases: JuliaGPU/CUDA.jl

v2.2.1

13 Nov 11:43
1596a2c

v2.2.0

13 Nov 09:00

CUDA v2.2.0

Diff since v2.1.0

Closed issues:

  • cudnn missing after downloading artifact (#521)
  • Downloading artifact: CUDA110 when using DiffEqFlux (#542)

Merged pull requests:

v2.1.0

30 Oct 12:11
602c549

CUDA v2.1.0

Diff since v2.0.2

Closed issues:

  • CUDNN convolution with Float16 always returns zeros (#92)
  • axp(b)y! and mul! (scalar multiplication) with mixed argument types (#144)
  • Dispatching to generic matmul instead of CUBLAS (#164)
  • Support for Ints and Float16? (#165)
  • Subarrays/views support (#172)
  • Easy way to pick among multiple GPUs (#174)
  • More prominently document JULIA_CUDA_USE_BINARYBUILDER (#204)
  • ERROR_COOPERATIVE_LAUNCH_TOO_LARGE during tests (#247)
  • Pkg.test error for cutensor test on Windows (#422)
  • Runtime build improvements (#456)
  • Fusing Wrappers (#467)
  • Could not find nvToolsExt (libnvToolsExt.dylib.1.0 or libnvToolsExt.dylib.1) in /Users/imac/.julia/artifacts/b502baf54095dff4a69fd6aba8667124583f6929/lib (#482)
  • mapreduce assumes commutative op (#484)
  • SubArray Broadcast Bug in 2.0 (#488)
  • Nested SubArray Scalar Indexing (#490)
  • Sparse matrix * view(vector) regression in 2.0 (#493)
  • Error transforming a reshaped 0-dimensional GPU array to a CPU array (#494)
  • test cuda FAILURE (#496)
  • Reshaped CuArray is not DenseCuArray (#511)
  • assignment failure when using array slicing. (#516)

Merged pull requests:

v2.0.2

15 Oct 14:14

CUDA v2.0.2

Diff since v2.0.1

Closed issues:

  • cu() behavior for complex floating point numbers (#91)
  • Error when following example on using multiple GPUs on multiple processes (#468)
  • MacOS without nvidia GPU is trying to download CUDA111 on julia nightly (#469)
  • Drop BinaryProvider? (#474)
  • Latest version of master doesn't work on Windows (#477)
  • sum(CUDA.rand(3,3)) broken (#480)
  • copyto!() between cpu and gpu with subarrays (#491)

Merged pull requests:

v2.0.1

05 Oct 08:12

CUDA v2.0.1

Diff since v2.0.0

Closed issues:

  • Can't update (#462)

Merged pull requests:

  • Remove duplicate comment (#464) (@blegat)
  • Add functionality to precompile the runtime library. (#465) (@maleadt)
  • Update manifest (#470) (@github-actions[bot])

v2.0.0

02 Oct 07:12
70d93cc

CUDA v2.0.0

Diff since v1.3.3

Closed issues:

  • Test failure during threading tests (#15)
  • Bad allocations in memory pool after device_reset! (#16)
  • CuArrays can lose Blas on reshaped views (#78)
  • allowscalar performance (#87)
  • Indexing with a CuArrays causes a 'scalar indexing disallowed' error from checkbounds (#90)
  • 5-arg mul! for CUSPARSE (#98)
  • copyto!(Device, Host) uses scalar iteration in case of type mismatch (#105)
  • Array primitives broken for CUSPARSE arrays (#113)
  • SplittingPool: CPU allocations (#117)
  • error while concatenating to an empty CuArray (#139)
  • Showing sparse arrays goes wrong (#146)
  • Improve test coverage (#147)
  • CuArrays allocates a lot of memory on the default GPU (#153)
  • [Feature Request] Indexing CuArray with CuArray (#155)
  • Reshaping CuArray throws error during backpropagation (#162)
  • Match syntax and APIs against Julia 1.0 standard libraries (#163)
  • CURAND_STATUS_PREEXISTING_FAILURE when setting seed multiple times. (#212)
  • RFC: converts SparseMatrixCSC to CuSparseMatrixCSR via cu by default (#216)
  • Add a CuSparseMatrixCOO type (#220)
  • Test runner stumbles over path separators (#236)
  • Error: Invalid bitcode signature when loading CUDA.jl after precompilation (#293)
  • Atomic operations only work on global memory (#311)
  • Performance: cudnn algorithm selection (#318)
  • CUSPARSE is broken in CUDA.jl 1.2 (#322)
  • Device-side broadcast regression on 1.5 (#350)
  • API for fast math-like mode (#354)
  • CUDA 11.0 Update 1: cublasSetWorkspace (#365)
  • Can't precompile CUDA.jl on Kubuntu 20.04 (#396)
  • CuPtr should be Ptr in cudnnGetDropoutDescriptor (#397)
  • CUDA throws OOM error when initializing API on multiple devices (#398)
  • Cannot launch kernel with > 5 args using Dynamic Parallelism (#401)
  • Reverse performance regression (#410)
  • Tag for LLVM 3? (#412)
  • CUDA not working (#415)
  • StatsBase.transform fails on CuArray (#426)
  • Further unification of CUBLAS.axpy! and LinearAlgebra.BLAS.axpy! (#432)
  • size(range), length(range) and range[end] fail inside CUDA kernels (#434)
  • InitError: Cannot use memory pool 'binned' when CUDA.jl was precompiled for memory pool 'split'. (#446)
  • Missing dispatch for matrix multiplication with views? (#448)
  • New version not available yet? (#452)
  • using CUDA or CUArray, output: UndefVarError: AddrSpacePtr not defined (#457)
  • Unable to upgrade to the latest version (#459)

Merged pull requests:

v1.3.3

25 Aug 11:08
be21077

CUDA v1.3.3

Diff since v1.3.2

Closed issues:

  • Type changing Array conversions give error when allowscalar(false) (#344)
  • getindex(::CuArray, ::Adjoint, ::Colon) fails (#345)
  • View with array indices causes memory copy before broadcast (#384)
  • Regression with Julia 1.5 (#390)

Merged pull requests:

v1.3.2

24 Aug 07:09

CUDA v1.3.2

Diff since v1.3.1

Closed issues:

  • LLVM WMMA errors (#380)

Merged pull requests:

  • Fix handling of tests to skip. (#386) (@maleadt)
  • Update manifest (#387) (@github-actions[bot])

v1.3.1

22 Aug 07:11

CUDA v1.3.1

Diff since v1.3.0

Closed issues:

  • Element-wise conversion fails (#378)
  • atomic_min fails for Int32 in global CuDeviceArrays (#379)
  • Segmentation fault from @cuprint on char (#381)
  • error in versioninfo(), name not defined (#385)

Merged pull requests:

v1.3.0

19 Aug 13:09
e48d0dc

CUDA v1.3.0

Diff since v1.2.1

Closed issues:

  • Trouble with the @. macro (#346)
  • NVMLError: Not Supported (code 3) (#348)
  • Nvidia Xavier devices: exception thrown during kernel execution on device Xavier (#349)
  • Could not load CUTENSOR artifact dll on Windows 10 (#355)
  • CuTextureArray for 3D array (#357)
  • Bug in julia 1.5.0 I have CUDA 11.0 installed in Ubuntu 18.04 (#360)
  • Callback-based logging (#366)
  • Artifact download timeout (#369)
  • sum! accumulates when called multiple times (#370)
  • nvprof does not detect kernel launches (#371)
  • KernelError: passing and using non-bitstype argument (#372)
  • CUDA.jl fails to find libcudadevrt.a on a cluster install with multi-arch target (#376)

Merged pull requests: