Stars
Steering Llama 2 with Contrastive Activation Addition
The nnsight package enables interpreting and manipulating the internals of deep learning models.
Code for reproducing our paper "Not All Language Model Features Are Linear"
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Training Sparse Autoencoders on Language Models
Using sparse coding to find distributed representations used by neural networks.
Full code for the sparse probing paper.
Accessible large language models via k-bit quantization for PyTorch.
Convenience functions for working with PyTorch hooks.
Solve puzzles. Improve your PyTorch.
Interpreting how transformers simulate agents performing RL tasks
Graph-based LLM power tool for exploring many completions in parallel.
LlamaIndex is a data framework for your LLM applications
High-speed download of LLaMA, Facebook's 65B parameter GPT model
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
You like pytorch? You like micrograd? You love tinygrad! ❤️
Mechanistic Interpretability for Transformer Models
Model parallel transformers in JAX and Haiku
A library for mechanistic interpretability of GPT-style language models
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations
Running large language models on a single GPU for throughput-oriented scenarios.
Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline