Stars
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
LLM finetuning with PEFT
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Run Mixtral-8x7B models in Colab or on consumer desktops
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
[ICLR 2024] Code for the paper "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning"
[ICML 2024] Selecting High-Quality Data for Training Language Models
[ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models"
What would you do with 1000 H100s...
Official implementation of the paper "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers"
Source code for ACL 2021 paper "CLEVE: Contrastive Pre-training for Event Extraction"
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Source code for the TMLR paper "Black-Box Prompt Learning for Pre-trained Language Models"
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
Modeling, training, eval, and inference code for OLMo
🔥🔥🔥 Latest papers, code, and datasets on Vid-LLMs.
[TMLR 2024] Efficient Large Language Models: A Survey
Awesome LLM Plaza: daily tracking of awesome LLM topics, e.g., LLMs for coding, robotics, reasoning, multimodality, etc.
A comparison of the Lion and Adam optimizers
A Data Streaming Library for Efficient Neural Network Training
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
GradAttack is a Python library for easily evaluating the privacy risks of public gradients in federated learning, as well as corresponding mitigation strategies.