Stars
Efficient Triton Kernels for LLM Training
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an implementation of kNN-LM and kNN-MT
Machine Learning Engineering Open Book
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
SGLang is a fast serving framework for large language models and vision language models.
Morph an input dataset of 2D points into select shapes, while preserving the summary statistics to a given number of decimal points through simulated annealing. It is intended to be used as a teach…
Code for the paper "DiGress: Discrete Denoising diffusion for graph generation"
[WIP] A 🔥 interface for running code in the cloud
dpfried / apps
Forked from hendrycks/apps
APPS: Automated Programming Progress Standard (NeurIPS 2021)
A just-in-time C99 compiler embeddable in Python, with no external compiler dependencies, so you can seamlessly use fast/existing C code from Python
A dynamic control flow graph (CFG) reconstruction plugin for valgrind.
Train transformer language models with reinforcement learning.
Native implementation of 'Numerical Recipes in C'
A collection of modern/faster/saner alternatives to common Unix commands.
Vector (and Scalar) Quantization, in PyTorch
An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols
Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with…
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment