This code defines a memory-aligned 196,883 x 196,833 matrix of complex numbers on Linux (577.6 GiB):
real = static_cast<double*>(aligned_alloc(64, 196883 * 196883 * sizeof(double)));
imag = static_cast<double*>(aligned_alloc(64, 196883 * 196883 * sizeof(double)));
The matrix represents a linear transformation associated with an element of the monster group in its 196,883-dimensional representation. This code currently generates two of these random matrices and multiplies them using an array of CPU optimizations.
- OS: Ubuntu Server 22.04 LTS - Gen2, Windows 10/11/Server
- Architecture: x64 (w/ optional AVX-512)
- Memory: >1,734 GiB
- Cores: max available
sudo apt update && sudo apt install g++ make -y
git clone https://github.com/MattMcL4475/MonsterGroup.git
cd MonsterGroup/MonsterGroup
make
./monster
git clone https://github.com/MattMcL4475/MonsterGroup.git
cd MonsterGroup
"C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\Bin\MSBuild.exe" MonsterGroup.sln /p:Configuration=Release
.\Release\monster.exe
- Memory Alignment
64-byte aligned memory allocations for real and imaginary arrays optimized for Intel® AVX-512. - SIMD Vectorization (with AVX-512)
Processes eight doubles at a time using AVX-512 instructions in the inner loop, including aligned data load/store and fused multiply-add (FMA) operations - Multithreading
Usesstd::thread
to maximize concurrency on all available hyperthreads. - Loop Unrolling
In the AVX-512 code section, the k-loop is unrolled in groups of four to maximize parallelism. - Prefetching
Includes pre-fetch hints so the CPU pre-fetches data into the cache before it's accessed. - Cache Blocking / Tiling (in non-AVX-512 code)
The non-AVX-512 code organizes loops into blocks or "tiles" to fit the working set of data into the CPU cache. The three nested loops are divided into blocks of sizetileSize
, which is dynamically calculated based on the cache line size of the hardware. - Conditional platform compilation
Platform-specific implementations for both Linux and Windows.
"It's got too many properties for it to be an accident. It's never been explained why it's even there" -John Horton Conway FRS (26 December 1937 – 11 April 2020)
Monster group: https://en.wikipedia.org/wiki/Monster_group
Monsterous moonshine: https://en.wikipedia.org/wiki/Monstrous_moonshine