- New York, NY
- @vgoklani_ai
Starred repositories
Utilities intended for use with Llama models.
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
Large Language Model Text Generation Inference
A distilled version of llama-68m, intended only for MLSys research use.
Custom data types and layouts for training and inference
Collection of the latest and greatest deep learning optimizers for PyTorch, suitable for CNN and NLP workloads.
Port of Andrej Karpathy's nanoGPT to Apple MLX framework.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
llama3.cuda is a pure C/CUDA implementation of the Llama 3 model.
A native PyTorch Library for large model training
Deep Reinforcement Learning: Zero to Hero!
Extract Deflate64 ZIP archives with Python's `zipfile` API.
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Applied AI experiments and examples for PyTorch
A Native-PyTorch Library for LLM Fine-tuning
The easiest way to run the fastest MLX-based LLMs locally