Stars
Code for QuaRot, an end-to-end 4-bit inference scheme for large language models.
A Native-PyTorch Library for LLM Fine-tuning
Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians
Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology
An open source, standard data file format for graph data storage and retrieval.
pix2tex: Using a ViT to convert images of equations into LaTeX code.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
MSCCL++: A GPU-driven communication stack for scalable AI applications
Fault-tolerant, highly scalable GPU orchestration, and a machine learning framework designed for training models with billions to trillions of parameters
A high-throughput and memory-efficient inference and serving engine for LLMs
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
(Asyncio OR Threadsafe) Google Cloud Client Library for Python
Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.
An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
A permissively licensed C and C++ Task Scheduler for creating parallel programs. Requires C++11 support.
Author's implementation of SIGGRAPH 2023 paper, "A Practical Walk-on-Boundary Method for Boundary Value Problems."
VRS is a file format optimized to record & playback streams of sensor data, such as images, audio samples, and any other discrete sensors (IMU, temperature, etc.), stored in per-device streams of timestamped records.
3D Gaussian Splatting, reimagined: Unleashing unmatched speed with C++ and CUDA from the ground up!
An open benchmarking platform for medical artificial intelligence using Federated Evaluation.
A scalable inference server for models optimized with OpenVINO™