
OpenBLAS version 0.3.25

martin-frbg released this 12 Nov 21:58
· 1039 commits to release-0.3.0 since this release
5e1a429

general:

  • improved the error message shown on exceeding the maximum thread count
  • improved the code to add supplementary thread buffers in case of overflow
  • fixed a potential division by zero in ?ROTG
  • improved the ?MATCOPY functions to accept zero-sized rows or columns
  • corrected empty prototypes in function declarations
  • cleaned up unused declarations in the f2c-converted versions of the LAPACK sources
  • fixed compilation with the Cray CCE Compiler suite
  • improved link line rewriting to avoid mixed libgomp/libomp builds with clang and gfortran
  • worked around hangs in OpenMP builds that use LLVM 14's libomp on FreeBSD
  • improved the Makefiles to require less option duplication on "make install"
  • imported the following changes from the upcoming release 3.12 of Reference-LAPACK:
    • deprecate utility functions ?GELQS and ?GEQRS (LAPACK PR 900)
    • apply rounding up to workspace calculations done in floating point (LAPACK PR 904)
    • avoid overflow in STGEX2/DTGEX2 (LAPACK PR 907)
    • fix accumulation in ?LASSQ (LAPACK PR 909)
    • fix handling of NaN values in ?GECON (LAPACK PR 926)
    • avoid overflow in CBDSQR/ZBDSQR (LAPACK PR 927)
    • fix poor vector orthogonalizations in ?ORBDB5/?UNBDB5 (LAPACK PR 928 & 930)
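
For readers unfamiliar with ?ROTG: it builds a Givens rotation (c, s) that maps the pair (a, b) to (r, 0). The note above does not say where the division by zero occurred, so the following is only a minimal Python sketch of a textbook formulation with an explicit zero guard. The function name `rotg` and its return order are assumptions made for illustration, not the OpenBLAS code or its API.

```python
import math

def rotg(a: float, b: float):
    """Illustrative Givens rotation: returns (r, c, s) such that
    c*a + s*b == r and -s*a + c*b == 0 (up to rounding)."""
    scale = abs(a) + abs(b)
    if scale == 0.0:
        # Both inputs are zero: return the identity rotation
        # instead of dividing by zero further down.
        return 0.0, 1.0, 0.0
    # Dividing by scale first keeps the intermediate squares in range.
    r = scale * math.hypot(a / scale, b / scale)
    # Give r the sign of the larger-magnitude input (assumed convention,
    # following the reference BLAS routine).
    roe = a if abs(a) > abs(b) else b
    r = math.copysign(r, roe)
    return r, a / r, b / r
```

Calling `rotg(3.0, 4.0)` yields a rotation with r close to 5, and `rotg(0.0, 0.0)` takes the guarded branch.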

x86-64:

  • fixed compile-time autodetection of AMD Ryzen3 and Ryzen4 cpus
  • fixed capability-based fallback selection for unknown cpus in DYNAMIC_ARCH
  • added AVX512 optimizations for ?ASUM on Sapphire Rapids and Cooper Lake
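
As a reminder of what the ?ASUM kernels compute, here is a plain Python reference of the operation itself. It is a sketch of the mathematical definition only: the helper names `asum` and `casum` are hypothetical, the stride handling is simplified to positive `incx`, and nothing here reflects the AVX512 implementation.

```python
def asum(x, incx=1):
    """Real case (SASUM/DASUM): sum of |x[i]| over every incx-th element."""
    return sum(abs(v) for v in x[::incx])

def casum(x, incx=1):
    """Complex case (SCASUM/DZASUM): sum of |re| + |im|, not the modulus."""
    return sum(abs(v.real) + abs(v.imag) for v in x[::incx])
```

Note that the complex variant adds the absolute values of the real and imaginary parts, so `casum([3 - 4j])` is 7, not 5.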

ARM64:

  • fixed building on Apple with homebrew gcc
  • fixed building with Xcode 15
  • fixed building on A64FX and Cortex A710/X1/X2
  • increased the default buffer size for recent ARM server cpus

POWER:

  • fixed building with the IBM xlf 16.1.1 compiler
  • fixed building with IBM XL C
  • added support for DYNAMIC_ARCH builds with clang
  • fixed union declaration in the BFLOAT16 test case
  • enabled optimizations for the AIX assembler on POWER10

LOONGARCH64:

  • added an optimized SGEMV kernel
  • added an optimized DTRSM kernel

md5sums:
db39b32181b10ec2d1572e81e3dc869c OpenBLAS-0.3.25.zip
48384e324cd1cdcfbdb0d2e16ca55327 OpenBLAS-0.3.25.tar.gz
cc93916bd780a13429b65eb9c05527f2 OpenBLAS-0.3.25-x64.zip
58bb5dfc626d3af86aab7fab409c192d OpenBLAS-0.3.25-x64-64.zip
07a19abeac6c67595ec447315244ccd3 OpenBLAS-0.3.25-x86.zip
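
A downloaded archive can be checked against the sums listed above with a few lines of Python. This is a minimal sketch using the standard `hashlib` module; the helper names `md5_of_file` and `matches` are assumptions made for illustration, and the archive is assumed to sit in the current directory.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """Compare a file's MD5 digest with a published value, ignoring case and spaces."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, `matches("OpenBLAS-0.3.25.tar.gz", "48384e324cd1cdcfbdb0d2e16ca55327")` should return `True` for an intact copy of the source tarball.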
