Releases: JuliaGPU/CUDA.jl

v1.2.1

31 Jul 08:08
527d364

CUDA v1.2.1

Diff since v1.2.0

Closed issues:

  • CuArrays.zeros(T, 0) fails (#81)
  • CUDAnative.cos calls the base cos function in nested broadcast (#102)
  • CuSparseMatrixHYB * CuMatrix = nothing (#256)
  • Strange reordering of struct fields with dynamic parallelism (#263)
  • Performance: bias add (#298)
  • CUDA 11 libraries incorrectly looked up in artifact (#300)
  • CUTENSOR for windows (#301)
  • Performance: sum (#302)
  • Performance: getindex(a, i::Array{Int}) (#303)
  • Display for CuArray within Tuples does not respect :limit=>true (#305)
  • Performance: elementwise operations (#307)
  • Performance: perceptron (#312)
  • windows install error: isfile(__libcupti[]) (#324)
  • std with dims is not type stable (#336)

Merged pull requests:

  • Re-enable threading tests. (#25) (@maleadt)
  • Reorganize and simplify some includes (#296) (@maleadt)
  • Only run benchmarks on the master branch. (#297) (@maleadt)
  • Optimizations for broadcast (#299) (@maleadt)
  • Update manifest (#304) (@github-actions[bot])
  • Test runner improvements for multigpu mode (#309) (@maleadt)
  • Artifact improvements for CUDA 11 on Windows (#310) (@maleadt)
  • Optimize element-wise operations (#313) (@maleadt)
  • Check if reported GPU memory use is available. (#314) (@maleadt)
  • Update artifacts: include cusolverMg, and use Yggdrasil binaries. (#315) (@maleadt)
  • Specialization fixes for mapreducedim. (#316) (@maleadt)
  • Fix invalid conversion of pointer to signed integer. (#317) (@maleadt)
  • Work around (presumed) Windows driver bug in exception test. (#319) (@maleadt)
  • Update manifest (#323) (@github-actions[bot])
  • Bump CUDNN and CUTENSOR (#325) (@maleadt)
  • Simplify NVML discovery. (#326) (@maleadt)
  • Separate CURAND wrappers from Random impl. (#327) (@maleadt)
  • Simplify discovering binaries by using Sys.which. (#328) (@maleadt)
  • Add wrapper for NVML utilization rates. (#329) (@maleadt)
  • Attach CUSPARSE docstrings to bare methods, not empty functions. (#331) (@maleadt)
  • Eagerly reduce the amount of worker threads. (#332) (@maleadt)
  • Bump dependencies. (#333) (@maleadt)
  • Clean-up library wrappers [NFC] (#334) (@maleadt)
  • Fix CUDNN v8 discovery and loading on Windows (#335) (@maleadt)
  • Fix type stability of Statistics.var with dims. (#337) (@maleadt)
  • Fix parameter alignment for dynamic parallelism. (#338) (@maleadt)
  • Micro-optimize Base.fill. (#339) (@maleadt)

v1.2.0

15 Jul 11:07
1c44d7b

CUDA v1.2.0

Diff since v1.1.0

Closed issues:

  • Segmentation fault when creating CuArray of CuArray (#133)
  • CUDNN tests fail with CUDNN 6.0.20 (#134)
  • CURAND fail to initialize, code 203 (#255)
  • Deprecation warnings (#277)
  • Can we pleeeeeeeease make cu(x) eltype preserving? (#278)
  • On the use of @sync during benchmarking in the documentation (#279)
  • Example in Multiple GPUs doc fails (#282)
  • LLVM error: Cannot cast between two non-generic address spaces (#286)

Merged pull requests:

v1.1.0

07 Jul 09:07
1c399bf

CUDA v1.1.0

Diff since v1.0.2

Closed issues:

  • Fix NSight detection (#29)
  • versioninfo() (#34)
  • throw_... messages: invalid call to jl_alloc_string (#54)
  • INTERNAL_ERROR during CUDNN handle creation (#183)
  • Improve benchmarking suite (#222)
  • How to load CUDA.jl conditional on the computer having a CUDA-compatible GPU? (#237)
  • CUSOLVER.heevd! returning Float and not Complex (#238)
  • Broadcasting fails with Float64 -> Int conversion (#240)
  • Running ] test CUDA with OhMyREPL in startup.jl causes some tests to fail (#246)
  • ERROR: Your LLVM does not support the NVPTX back-end. in local project environment (#249)
  • CUDAnative: UndefVarError: AddrSpacePtr not defined on julia master (#250)
  • Error while freeing CUDA.CuPtr (#254)
  • Non-artifact initialization of CUDA.jl using CUDA 11 fails on Windows (#262)
  • Library handle creation close to OOM fails with ERROR_NOT_INITIALIZED (#264)
  • has(::TargetIterator, name::String) deprecation warning (#271)

Merged pull requests:

  • Add texture support from CuTextures.jl (#209) (@maleadt)
  • Memory pinning with interval trees (#233) (@maleadt)
  • Better nsys detection. (#234) (@maleadt)
  • CompatHelper: add new compat entry for "IntervalTrees" at version "1.0" (#235) (@github-actions[bot])
  • Update manifest (#239) (@github-actions[bot])
  • Replace slash by path separator to properly skip tests on Windows. (#241) (@maleadt)
  • Retry cudnnCreate on CUDNN_STATUS_INTERNAL_ERROR and CUDNN_STATUS_NOT_INITIALIZED (#244) (@maleadt)
  • Add issue templates (#245) (@maleadt)
  • Import wrapper tooling, wrap NVML (#248) (@maleadt)
  • Ignore some potentially unsupported NVML features. (#251) (@maleadt)
  • Assert NVPTX availability by just calling the initializer. (#252) (@maleadt)
  • Update manifest (#257) (@github-actions[bot])
  • Adapt to AddrSpacePtr rename. (#258) (@maleadt)
  • Typo in installation overview docs (#260) (@clintonTE)
  • Update GPUCompiler.jl (#266) (@maleadt)
  • Retry library initialization failure due to (badly reported) OOM. (#268) (@maleadt)
  • Upgrade CUTENSOR to v1.1.0. (#269) (@maleadt)
  • Use CUDNN from Yggdrasil. (#272) (@maleadt)
  • Update manifest (#273) (@github-actions[bot])
  • Improve local CUDA discovery for CUDA 11 (#274) (@maleadt)
  • Compatibility with latest LLVM and GPUCompiler (#275) (@maleadt)

v1.0.2

19 Jun 09:05
3f1e800

CUDA v1.0.2

Diff since v1.0.1

Closed issues:

  • Dynamic generation of docs including benchmarking timings can make the numbers "weird" (#11)

Merged pull requests:

  • Documentation updates (#227) (@maleadt)
  • Don't extend Base.findfirst with an unrelated method. (#230) (@maleadt)
  • CompatHelper: bump compat for "NNlib" to "0.7" (#232) (@github-actions[bot])

v1.0.1

18 Jun 15:05

v1.0.0

17 Jun 11:05

CUDA v1.0.0

Diff since v0.1.0

Closed issues:

  • unsafe_copy3d!: srcPos and dstPos handling (#27)
  • Test failure on Windows (#37)
  • Texture memory? (#46)
  • Tests for the LLVM passes (#52)
  • Bugged Sparse Matrix-Dense matrix multiplication, where dense matrix is transposed (#77)
  • Stack overflow when broadcasting over empty view in CuArrays 2.x (#82)
  • Sparse CSC gemm wrappers actually call CSR routines (#181)
  • Testsuite calls startup.jl (#182)
  • LLVM error: Cannot cast between two non-generic address spaces (#190)
  • Error running CUDA in Jupyter (#195)
  • Floating-point Inf causes an error (#205)
  • mul! issue (#213)

Merged pull requests:

v0.1.0

26 May 16:07

CUDA v0.1.0

Closed issues:

  • Documentation: installation instructions (#1)
  • Faced some errors while testing cuda in Julia (#3)
  • facing unknown errors while compiling exact similar code for parallelization on CPU (#7)

Merged pull requests: