Starred repositories
TORAX: Tokamak transport simulation in JAX
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Experimental q[X]ora kernel development code
Scalable toolkit for efficient model alignment
BM25S is an ultra-fast lexical search library that implements BM25 using scipy
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization and sparsity. It compresses deep learning models for downstream deployment frame…
Proof-of-concept of global switching between numpy/jax/pytorch in a library.
MambaOut: Do We Really Need Mamba for Vision?
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
Collective communications library with various primitives for multi-machine training.
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Neural Networks: Zero to Hero
This project is a PyTorch implementation of ZO-AdaMU optimization: Adapting Perturbation with the Momentum and Uncertainty in Zeroth-order Optimization.
Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models l…
CUDA accelerated rasterization of gaussian splatting
Zero-effort CLI interfaces & config objects, from types
Evaluate your LLM's response with Prometheus and GPT4 💯