Starred repositories

Utilities intended for use with Llama models.

Python 2,934 408 Updated Jul 31, 2024

Fast Matrix Multiplications for Lookup Table-Quantized LLMs

Cuda 103 3 Updated Jul 27, 2024

Large Language Model Text Generation Inference

Python 8,501 971 Updated Jul 31, 2024

Distillation version of llama-68m only for MLsys research use.

Python 1 Updated Apr 25, 2024

Custom data types and layouts for training and inference

Python 449 57 Updated Jul 31, 2024

Collection of the latest, greatest deep learning optimizers (for PyTorch) - CNN, NLP suitable

Jupyter Notebook 211 41 Updated Apr 4, 2021

Port of Andrej Karpathy's nanoGPT to Apple MLX framework.

Python 91 8 Updated Feb 12, 2024

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Python 19,092 2,895 Updated Jul 25, 2024

LLM101n: Let's build a Storyteller

26,183 1,401 Updated Jul 29, 2024
Python 107 13 Updated Jul 23, 2024

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python 634 49 Updated Jul 24, 2024

Prune transformer layers

Python 57 9 Updated May 30, 2024

llama3.cuda is a pure C/CUDA implementation of the Llama 3 model.

Cuda 276 18 Updated Jun 4, 2024

A native PyTorch Library for large model training

Python 1,380 125 Updated Jul 31, 2024
Python 3 Updated Apr 1, 2024

Deep Reinforcement Learning: Zero to Hero!

Jupyter Notebook 1,976 69 Updated Jul 6, 2024

Extract Deflate64 ZIP archives with Python's `zipfile` API.

Python 16 6 Updated Dec 5, 2023
Python 66 76 Updated Jun 20, 2024

The official Meta Llama 3 GitHub site

Python 24,990 2,733 Updated Jul 28, 2024

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Python 425 18 Updated Jul 22, 2024
Python 474 37 Updated Jul 29, 2024

LLM training in simple, raw C/CUDA

Cuda 22,399 2,484 Updated Jul 30, 2024

An introduction to programming in Triton

Python 1 Updated Jan 2, 2024
Python 67 17 Updated Jun 18, 2024

Applied AI experiments and examples for PyTorch

Python 93 7 Updated Jul 2, 2024
Python 451 38 Updated Apr 1, 2024

A Native-PyTorch Library for LLM Fine-tuning

Python 3,694 311 Updated Jul 31, 2024

The easiest way to run the fastest MLX-based LLMs locally

Swift 195 13 Updated Jul 10, 2024
Python 31 Updated Jan 1, 2024