Stars
OneDiff: An out-of-the-box acceleration library for diffusion models.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Test suite for probing the numerical behavior of NVIDIA tensor cores
A high-throughput and memory-efficient inference and serving engine for LLMs
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Optimized primitives for collective multi-GPU communication
cppreference.com HTML archive converter to Microsoft Help (for Visual Studio 2012+) and CHM help (for Windows)
A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures
First-Class GPU Resource Management: Device Drivers, Runtimes, and CUDA Compilers for Nouveau.
Python bindings for FFmpeg - with complex filtering support
This repository is a home to Intel® Deep Learning Streamer (Intel® DL Streamer) Pipeline Framework. Pipeline Framework is a streaming media analytics framework, based on GStreamer* multimedia frame…
A library for efficient similarity search and clustering of dense vectors.
Khronos Vulkan, OpenGL, and OpenGL ES Conformance Tests
filecoin-project / fil-ocl
Forked from cogciprocate/ocl
OpenCL for Rust
Rust tools for OpenCL and GPU management.
filecoin-project / bellperson
Forked from zkcrypto/bellman
zk-SNARK library