wjc404

  • SGEMM and DGEMM subroutines using AVX512F instructions.

    C 12 1 GNU General Public License v3.0 Updated May 22, 2022
  • Top-K selection with K = 16 or 32, based on the bitonic sort algorithm, using Intel AVX instructions.

    C++ MIT License Updated Jan 12, 2022
  • OpenBLAS Public

    Forked from OpenMathLib/OpenBLAS

    OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

    Fortran BSD 3-Clause "New" or "Revised" License Updated Dec 12, 2021
  • SGEMM kernel function for NVIDIA Pascal GPUs, able to achieve 60% of theoretical peak performance.

    Cuda 5 1 GNU General Public License v3.0 Updated Aug 2, 2020
  • 1 GNU General Public License v3.0 Updated Mar 23, 2020
  • GEMM_AVX2 Public

    Fast AVX2/FMA3 dgemm and sgemm subroutines for medium to large matrices (>2000*2000) on Haswell/Skylake/Zen processors, with performance comparable to MKL.

    C 6 1 GNU General Public License v3.0 Updated Mar 23, 2020
  • cgemm and zgemm subroutines for large matrices, using AVX2 and FMA3 instructions, with performance comparable to MKL 2018.

    C GNU General Public License v3.0 Updated Feb 27, 2020
  • cgemm3m and zgemm3m subroutines for large matrices, using AVX2 and FMA3 instructions.

    C GNU General Public License v3.0 Updated Sep 8, 2019
  • GEMM_AVX2_FMA3 Public archive

    sgemm and dgemm subroutines for large matrices that slightly outperform Intel MKL.

    C 1 1 GNU General Public License v3.0 Updated Aug 27, 2019
  • How to design a CPU GEMM on x86 with 256-bit AVX that can beat OpenBLAS.

    C++ MIT License Updated Apr 15, 2019