cublasgemm-benchmark

A simple, repeatable benchmark for validating GPU performance using cuBLAS matrix multiplication.

How to run

Make sure your CUDA toolkit is set up (nvcc is on $PATH, shared libraries are on $LD_LIBRARY_PATH, headers are on $CPATH). Then execute the following command to start the test:

$ ./run.sh
  • The code computes C = alpha*A*B + beta*C with square matrices A, B and C, and repeats it 2 times (adjustable: more repetitions give a more stable result).

  • The sizes of A, B and C go up to (16384, 16384) in the default test (also adjustable to fit your GPU memory size).

  • By default the benchmark is built for GeForce GTX TITAN BLACK (sm_35) (adjustable) and tests cublasSgemm (cublasHgemm can be used instead on Pascal GPUs).
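To make the measured operation concrete, here is a minimal host-side sketch of what one GEMM call computes and how a throughput figure could be derived from a timing. It is not the code in gemm.cu: the function names `reference_sgemm`, `gemm_flops` and `gemm_gflops` are hypothetical, and the FLOP count uses the usual 2*n^3 approximation, which ignores the O(n^2) alpha/beta scaling.

```cpp
#include <cstddef>
#include <vector>

// Reference GEMM on the CPU: C = alpha * A * B + beta * C for n x n matrices.
// Matrices are stored column-major (the layout cuBLAS uses), so element
// (row, col) lives at index row + col * n.
// Hypothetical helper for illustration; far too slow for benchmark-sized n.
void reference_sgemm(std::size_t n, float alpha, const std::vector<float>& a,
                     const std::vector<float>& b, float beta,
                     std::vector<float>& c) {
    for (std::size_t col = 0; col < n; ++col) {
        for (std::size_t row = 0; row < n; ++row) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                acc += a[row + k * n] * b[k + col * n];
            }
            c[row + col * n] = alpha * acc + beta * c[row + col * n];
        }
    }
}

// Approximate floating-point operations in one n x n GEMM:
// n^3 multiplications plus n^3 additions.
double gemm_flops(std::size_t n) {
    const double d = static_cast<double>(n);
    return 2.0 * d * d * d;
}

// Throughput in GFLOP/s for `repeats` GEMM calls that took `seconds` in total.
double gemm_gflops(std::size_t n, int repeats, double seconds) {
    return gemm_flops(n) * repeats / seconds / 1e9;
}
```

For the default maximum size, `gemm_flops(16384)` is 2^43, about 8.8e12 operations per call, so the wall-clock time of the repeated calls converts directly into GFLOP/s through `gemm_gflops`.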

Uncomment line 11 in gemm.cu and line 4 in run.sh to test float16 matrix multiplication (cublasHgemm) on Tesla P100 GPU. This needs CUDA 8.0.
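When adjusting the matrix size to a particular GPU, a quick footprint estimate helps. The sketch below assumes the benchmark keeps exactly three n x n matrices (A, B and C) on the device and nothing else; the helper names `gemm_bytes` and `max_square_size` are hypothetical and do not appear in gemm.cu. The usable budget is also somewhat smaller than the card's total memory, since the CUDA context takes a share.

```cpp
#include <cstdint>

// Bytes needed for A, B and C, each n x n, with the given element size
// (4 for float32 / cublasSgemm, 2 for float16 / cublasHgemm).
std::uint64_t gemm_bytes(std::uint64_t n, std::uint64_t element_size) {
    return 3 * n * n * element_size;
}

// Largest n, as a multiple of `step` (must be > 0), whose three matrices
// fit within `budget_bytes`. Returns 0 if even one step does not fit.
std::uint64_t max_square_size(std::uint64_t budget_bytes,
                              std::uint64_t element_size,
                              std::uint64_t step) {
    std::uint64_t n = 0;
    while (gemm_bytes(n + step, element_size) <= budget_bytes) {
        n += step;
    }
    return n;
}
```

With the default size of 16384, the three float32 matrices take 3 GiB in total, and the float16 variant takes half of that, 1.5 GiB.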

Example Testing Result

An example testing result can be found here.

The "pstate" ranges from P0 to P12 where P0 is the maximum performance and P12 is the minimum performance.

See also