cublasgemm-benchmark

A simple, repeatable benchmark for validating GPU performance using cuBLAS matrix multiplication (GEMM).

How to run

Make sure your CUDA toolkit is set up (nvcc on $PATH, shared libraries on $LD_LIBRARY_PATH, headers on $CPATH). Then run the following command to start the test:

$ ./run.sh
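
Before launching, it can help to confirm the environment assumptions listed above. The following is a minimal sketch and not part of this repository: the helper names `have_cmd` and `preflight` are hypothetical. It only checks that nvcc is reachable and that the two variables are non-empty, not that they point at a working CUDA installation. An empty LD_LIBRARY_PATH or CPATH is treated as a warning only, since some installations locate libraries and headers by other means.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of this repository.

# Succeeds if the given command can be found on $PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Fails (non-zero) if nvcc is missing; only warns about empty variables.
preflight() {
    local status=0
    if ! have_cmd nvcc; then
        echo "error: nvcc not found on PATH" >&2
        status=1
    fi
    if [ -z "${LD_LIBRARY_PATH:-}" ]; then
        echo "warning: LD_LIBRARY_PATH is empty; the cuBLAS shared library may not be found" >&2
    fi
    if [ -z "${CPATH:-}" ]; then
        echo "warning: CPATH is empty; CUDA headers may not be found" >&2
    fi
    return "$status"
}
```

With these functions sourced into the current shell, `preflight && ./run.sh` starts the benchmark only when nvcc is available.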

The code computes C = alpha*A*B + beta*C with square matrices A, B, and C and repeats it 2 times (adjustable; more repetitions give a more stable result).
In the default test, the sizes of A, B, and C go up to (16384, 16384) (also adjustable to fit your GPU memory size).
By default the benchmark targets the GeForce GTX TITAN Black (sm_35), which is adjustable, and tests cublasSgemm (cublasHgemm can be used instead on Pascal GPUs).
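
When choosing a matrix size that fits your GPU, a back-of-the-envelope calculation can be useful. The sketch below assumes that the three N x N matrices dominate memory use and uses the conventional 2*N^3 operation count for GEMM. The function names are illustrative and not taken from gemm.cu or run.sh.

```bash
#!/usr/bin/env bash
# Illustrative sizing helpers; the names are hypothetical, not from gemm.cu.

# Bytes needed for A, B and C: three N x N matrices of elem_size bytes each
# (4 for float32 / cublasSgemm, 2 for float16 / cublasHgemm).
gemm_bytes() {
    local n=$1 elem_size=$2
    echo $(( 3 * n * n * elem_size ))
}

# Approximate floating-point operations in one N x N x N GEMM (2*N^3).
gemm_flops() {
    local n=$1
    echo $(( 2 * n * n * n ))
}

# Achieved GFLOP/s given N, the repetition count and the total elapsed seconds.
gemm_gflops() {
    local n=$1 reps=$2 seconds=$3
    awk -v n="$n" -v r="$reps" -v s="$seconds" \
        'BEGIN { printf "%.1f\n", 2 * n * n * n * r / s / 1e9 }'
}

gemm_bytes 16384 4   # prints 3221225472 (3 GiB at the largest default size, float32)
gemm_bytes 16384 2   # prints 1610612736 (1.5 GiB for float16)
```

The real footprint is somewhat higher because of the CUDA context and any cuBLAS workspace, so it is safer to leave some headroom below the total memory of the card.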

Uncomment line 11 in gemm.cu and line 4 in run.sh to test float16 matrix multiplication (cublasHgemm) on a Tesla P100 GPU. This requires CUDA 8.0.

Example Testing Result

An example testing result can be found here.

The "pstate" ranges from P0 to P12 where P0 is the maximum performance and P12 is the minimum performance.
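
A GPU that is not in P0 during the run is likely to report lower throughput, so it can be worth checking the state while the benchmark is running. The sketch below assumes nvidia-smi is installed and uses its `--query-gpu=index,name,pstate` and `--format=csv,noheader` options. The helper name `report_pstates` is hypothetical and not part of this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper; reads "index, name, pstate" CSV lines on stdin
# and flags every GPU that is not in P0.
report_pstates() {
    awk -F', *' '{
        if ($3 == "P0")
            printf "GPU %s (%s): P0, maximum performance\n", $1, $2
        else
            printf "GPU %s (%s): %s, NOT at maximum performance\n", $1, $2, $3
    }'
}

# Typical use while ./run.sh is running in another terminal:
#   nvidia-smi --query-gpu=index,name,pstate --format=csv,noheader | report_pstates
```

Parsing stdin rather than calling nvidia-smi inside the function keeps the helper usable on saved logs as well as on live output.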
