Stars
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Making large AI models cheaper, faster and more accessible
[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs
The hub for EleutherAI's work on interpretability and learning dynamics
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
The official code of the EMNLP 2022 paper "SCROLLS: Standardized CompaRison Over Long Language Sequences".
Doing simple retrieval from LLMs at various context lengths to measure accuracy
[ACL 2024] LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
EleutherAI / DeeperSpeed
Forked from microsoft/DeepSpeed. DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
A MAD laboratory to improve AI architecture designs 🧪
CoreNet: A library for training deep neural networks
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Understand and test language model architectures on synthetic tasks.
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
A framework for few-shot evaluation of language models.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference,…
Official implementation of "Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips" (ICLR 2024)
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (V…
Official implementation of "Spike-driven Transformer" (NeurIPS 2023)
Implementation of "SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks"
SpikingJelly is an open-source deep learning framework for Spiking Neural Networks (SNNs), based on PyTorch.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Datasets, Transforms and Models specific to Computer Vision