Ginkgo is a high-performance linear algebra library for manycore systems, with a focus on the solution of sparse linear systems. It is implemented using modern C++ (you will need at least a C++11-compliant compiler to build it), with GPU kernels implemented in CUDA.


An extensive database of up-to-date benchmark results is available in the performance data repository. Visualizations of the database can be interactively generated using the Ginkgo Performance Explorer web application. The benchmark results are automatically updated using the CI system to always reflect the current state of the library.


Linux and Mac OS

For Ginkgo core library:

  • cmake 3.9+
  • C++11 compliant compiler, one of:
    • gcc 5.3+, 6.3+, 7.3+, 8.1+
    • clang 3.9+
    • Apple LLVM 8.0+ (TODO: verify)

The Ginkgo CUDA module has the following additional requirements:

In addition, if you want to contribute code to Ginkgo, you will also need the following:

  • clang-format 5.0.1+ (ships as part of clang)


Windows is currently not supported, but we are working on porting the library there. If you are interested in helping us with this effort, feel free to contact one of the developers. (The library itself doesn't use any non-standard C++ features, so most of the effort here is in modifying the build system.)

TODO: Some restrictions will also apply on the version of C and C++ standard libraries installed on the system. We need to investigate this further.


Use the standard cmake build procedure:

mkdir build; cd build
cmake -G "Unix Makefiles" [OPTIONS] .. && make

Replace [OPTIONS] with desired cmake options for your build. Ginkgo adds the following additional switches to control what is being built:

  • -DGINKGO_DEVEL_TOOLS={ON, OFF} sets up the build system for development (requires clang-format, will also download git-cmake-format), default is ON

  • -DGINKGO_BUILD_TESTS={ON, OFF} builds Ginkgo's tests (will download googletest), default is ON

  • -DGINKGO_BUILD_BENCHMARKS={ON, OFF} builds Ginkgo's benchmarks (will download gflags and rapidjson), default is ON

  • -DGINKGO_BUILD_EXAMPLES={ON, OFF} builds Ginkgo's examples, default is ON

  • -DGINKGO_BUILD_REFERENCE={ON, OFF} builds reference implementations of the kernels, useful for testing, default is OFF

  • -DGINKGO_BUILD_OMP={ON, OFF} builds optimized OpenMP versions of the kernels, default is OFF

  • -DGINKGO_BUILD_CUDA={ON, OFF} builds optimized CUDA versions of the kernels (requires CUDA), default is OFF

  • -DGINKGO_BUILD_DOC={ON, OFF} creates an HTML version of Ginkgo's documentation from inline comments in the code. The default is OFF.

  • -DGINKGO_DOC_GENERATE_PDF={ON, OFF} generates a PDF version of Ginkgo's documentation from inline comments in the code. The default is OFF.

  • -DGINKGO_DOC_GENERATE_DEV={ON, OFF} generates the developer version of Ginkgo's documentation. The default is OFF.

  • -DGINKGO_SET_CUDA_HOST_COMPILER={ON, OFF} instructs the build system to explicitly set CUDA's host compiler to match the compiler used to build the rest of the library (otherwise the nvcc toolchain uses its default host compiler). Setting this option may help if you're experiencing linking errors due to ABI incompatibilities. The default is OFF.

  • -DGINKGO_EXPORT_BUILD_DIR={ON, OFF} adds the Ginkgo build directory to the CMake package registry. The default is OFF.

  • -DCMAKE_INSTALL_PREFIX=path sets the installation path for make install. The default value is usually something like /usr/local

  • -DGINKGO_VERBOSE_LEVEL=integer sets the verbosity of Ginkgo.

    • 0 disables all output in the main libraries,
    • 1 enables a few important messages related to unexpected behavior (default).
  • -DBUILD_SHARED_LIBS={ON, OFF} builds ginkgo as static libraries (OFF) or as dynamic libraries (ON), default is ON

  • -DGINKGO_CUDA_ARCHITECTURES=<list> where <list> is a semicolon (;) separated list of architectures. Supported values are:

    • Auto
    • Kepler, Maxwell, Pascal, Volta

    Auto will automatically detect the present CUDA-enabled GPU architectures in the system. Kepler, Maxwell, Pascal and Volta will add flags for all architectures of that particular NVIDIA GPU generation. COMPUTE and CODE are placeholders that should be replaced with compute and code numbers (e.g. for compute_70 and sm_70, COMPUTE and CODE should be replaced with 70). Default is Auto. For a more detailed explanation of this option see the ARCHITECTURES specification list section in the documentation of the CudaArchitectureSelector CMake module.

For example, to build everything (in debug mode), use:

mkdir build; cd build
cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Debug -DGINKGO_DEVEL_TOOLS=ON \

NOTE: Currently, the only verified CMake generator is Unix Makefiles. Other generators may work, but are not officially supported.
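
As a further illustration of combining the switches listed above, the following sketch prints a configure command for an optimized build with the OpenMP and CUDA kernels enabled. The helper name ginkgo_configure_cmd and the particular selection of options are assumptions made for this example, and CMAKE_BUILD_TYPE=Release is a standard CMake setting rather than a Ginkgo switch. The point worth noting is that a semicolon-separated architecture list has to be quoted, because an unquoted ; would end the shell command.

```shell
# Sketch: print a configure command built from the switches documented above.
# The selection of options is an example, not a recommended configuration.
ginkgo_configure_cmd() {
    archs=$1  # e.g. "Auto" or "Pascal;Volta"
    # The architecture option is wrapped in double quotes so that the shell
    # does not treat ';' as a command separator when the command is run.
    printf 'cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release -DGINKGO_BUILD_OMP=ON -DGINKGO_BUILD_CUDA=ON "-DGINKGO_CUDA_ARCHITECTURES=%s" ..\n' "$archs"
}

# Print the command for inspection; nothing is configured or built here.
ginkgo_configure_cmd "Pascal;Volta"
```

To actually configure, the printed line can be pasted into a shell inside the build directory, or run with eval "$(ginkgo_configure_cmd "Pascal;Volta")" followed by make.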

Running the unit tests

You need to compile ginkgo with the -DGINKGO_BUILD_TESTS=ON option to be able to run the tests. Use the following command inside the build folder to run all tests:

make test

The output should contain several lines of the form:

     Start  1: path/to/test
 1/13 Test  #1: path/to/test .............................   Passed    0.01 sec

To run only a specific test and see more detailed results (e.g. if a test failed) run the following from the build folder:


where path/to/test is the path returned by make test.
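
The output format shown above is produced by CTest, which is what make test invokes, so a single test can presumably also be selected through ctest directly. The sketch below assumes that the tests are registered with CTest under the names printed by make test; the helper name run_single_test is hypothetical. The -R option selects tests whose name matches a regular expression, and --output-on-failure prints the output of any failing test.

```shell
# Hypothetical helper: run only the tests whose CTest name matches a pattern.
# It has to be called from the build folder.
run_single_test() {
    if [ -z "$1" ]; then
        echo "usage: run_single_test <name-or-regex>" >&2
        return 2
    fi
    # -R filters tests by regular expression; --output-on-failure shows the
    # full output of a test only if it fails.
    ctest -R "$1" --output-on-failure
}
```

For example, run_single_test path/to/test runs every test whose name contains path/to/test.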

Running the benchmarks

In addition to the unit tests designed to verify correctness, Ginkgo also includes a benchmark suite for checking its performance on the system. To compile the benchmarks, the flag -DGINKGO_BUILD_BENCHMARKS=ON has to be set during the cmake step. In addition, the ssget command-line utility has to be installed on the system.

The benchmark suite tests Ginkgo's performance using the SuiteSparse matrix collection and artificially generated matrices. The SuiteSparse collection will be downloaded automatically when the benchmarks are run. Please note that the entire collection requires roughly 100GB of disk storage in its compressed format, and roughly 25GB of additional disk space for intermediate data (such as uncompressing the archive). Additionally, the benchmark runs usually take a long time (SpMV benchmarks on the complete collection take roughly 24h using the K20 GPU), and will stress the system.

The benchmark suite is invoked using the make benchmark command in the build directory. The behavior of the suite can be modified using environment variables. Assuming the bash shell is used, these can either be specified via the export command to persist between multiple runs:

export VARIABLE="value"
make benchmark

or specified on the fly, on the same line as the make benchmark command:

env VARIABLE="value" ... make benchmark

Since make sets any variables passed to it as temporary environment variables, the following shorthand can also be used:

make benchmark VARIABLE="value" ...

A combination of the above approaches is also possible (e.g. it may be useful to export the SYSTEM_NAME variable, and specify the others at every benchmark run).

Supported environment variables are described in the following list:

  • BENCHMARK={spmv, solver, preconditioner} - The benchmark set to run. Default is spmv.
    • spmv - Runs the sparse matrix-vector product benchmarks on the SuiteSparse collection.
    • solver - Runs the solver benchmarks on the SuiteSparse collection. The matrix format is determined by running the spmv benchmarks first, and using the fastest format determined by that benchmark. The maximum number of iterations for the iterative solvers is set to 10,000 and the requested residual reduction factor to 1e-6.
    • preconditioner - Runs the preconditioner benchmarks on artificially generated block-diagonal matrices.
  • DRY_RUN={true, false} - If set to true, prepares the system for the benchmark runs (downloads the collections, creates the result structure, etc.) and outputs the list of commands that would normally be run, but does not run the benchmarks themselves. Default is false.
  • EXECUTOR={reference,cuda,omp} - The executor used for running the benchmarks. Default is cuda.
  • SEGMENTS=<N> - Splits the benchmark suite into <N> segments. This option is useful for running the benchmarks on an HPC system with a batch scheduler, as it enables partitioning of the benchmark suite and running it concurrently on multiple nodes of the system. If specified, SEGMENT_ID also has to be set. Default is 1.
  • SEGMENT_ID=<I> - used in combination with the SEGMENTS variable. <I> should be an integer between 1 and <N>. If specified, only the <I>-th segment of the benchmark suite will be run. Default is 1.
  • SYSTEM_NAME=<name> - the name of the system where the benchmarks are being run. This option only changes the directory where the benchmark results are stored. It can be used to avoid overwriting the benchmarks if multiple systems share the same filesystem, or when copying the results between systems. Default is unknown.
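
To illustrate how SEGMENTS and SEGMENT_ID fit together, the following sketch prints one make benchmark command per segment, for example to paste into the job scripts of a batch scheduler. The helper name segment_commands and the system name my_cluster are hypothetical; the variables and their values come from the list above. Nothing is executed by the sketch itself.

```shell
# Sketch: print one benchmark command per segment. Each printed command runs
# only its own part of the suite, so the commands can be submitted as
# separate jobs.
segment_commands() {
    segments=$1      # total number of segments (SEGMENTS)
    system_name=$2   # directory name for the results (SYSTEM_NAME)
    id=1
    while [ "$id" -le "$segments" ]; do
        echo "make benchmark BENCHMARK=spmv EXECUTOR=cuda SYSTEM_NAME=$system_name SEGMENTS=$segments SEGMENT_ID=$id"
        id=$((id + 1))
    done
}

# Four segments, e.g. one per node of a hypothetical system "my_cluster".
segment_commands 4 my_cluster
```

Each printed command has to be run in the build directory of the node that handles that segment.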

Once make benchmark completes, the results can be found in <Ginkgo build directory>/benchmark/results/${SYSTEM_NAME}/. The files are written in the JSON format, and can be analyzed using any of the data analysis tools that support JSON. Alternatively, they can be uploaded to an online repository, and analyzed using Ginkgo's free web tool Ginkgo Performance Explorer (GPE). (Make sure to change the "Performance data URL" to your repository if using GPE.)
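
As a starting point for inspecting the results, the sketch below lists the result files of one system. It assumes that it is called from the Ginkgo build directory and that the result files carry a .json extension, which is not stated above; the helper name list_benchmark_results is hypothetical. The layout of the JSON data itself is not described here, so the sketch only locates the files.

```shell
# Sketch: list the JSON result files stored for one system.
# Assumes the current directory is the Ginkgo build directory and that
# result files end in .json (an assumption).
list_benchmark_results() {
    # Fall back to "unknown", the documented default of SYSTEM_NAME.
    results_dir="benchmark/results/${1:-unknown}"
    find "$results_dir" -type f -name '*.json' | sort
}
```

A single file from that list can then be pretty-printed with a generic tool such as python3 -m json.tool <file>.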

Installing Ginkgo

To install Ginkgo into the specified folder, execute the following command in the build folder:

make install

If the installation prefix (see CMAKE_INSTALL_PREFIX) is not writable for your user, e.g. when installing Ginkgo system-wide, it might be necessary to prefix the call with sudo.

After the installation, CMake can find ginkgo with find_package(Ginkgo). An example can be found in the install_test.

Note: If the installed ginkgo was built statically and with CUDA, CUDA needs to be specified as a language in order for CMake to work properly.
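
A minimal sketch of a consuming project is shown below as a shell helper that writes a CMakeLists.txt. The project name my_app, the source file main.cpp and the helper name write_example_cmakelists are hypothetical, and the imported target name Ginkgo::ginkgo is an assumption that should be checked against the install_test example. CUDA is listed as a language to cover the statically built CUDA case from the note above.

```shell
# Sketch: write a minimal CMakeLists.txt for a project that uses an
# installed Ginkgo into the directory given as the first argument.
write_example_cmakelists() {
    cat > "$1/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.9)
# CUDA is listed because a static Ginkgo built with CUDA requires it;
# LANGUAGES CXX alone should be enough otherwise.
project(my_app LANGUAGES CXX CUDA)
find_package(Ginkgo REQUIRED)
add_executable(my_app main.cpp)
# Assumption: the installed package exports the target Ginkgo::ginkgo.
target_link_libraries(my_app PRIVATE Ginkgo::ginkgo)
EOF
}
```

If Ginkgo was installed to a non-default prefix, that prefix can be passed to the consuming project through the standard CMake variable CMAKE_PREFIX_PATH.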


Refer to for details.