Tsinghua University
Beijing, China

Stars
Tools for merging pretrained large language models.
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training
Lightning Training strategy for HiveMind
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
DSIR large-scale data selection framework for language model training
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
This is a list of peer-reviewed representative papers on deep learning dynamics (the optimization dynamics of neural networks). The success of deep learning is attributed to both network architecture and …
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
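The core of BPE training is simple enough to sketch in a few lines: start from raw bytes, repeatedly find the most frequent adjacent token pair, and merge it into a new token. The snippet below is a toy illustration of that idea under simplified assumptions (greedy left-to-right merging, `train_bpe` is a name made up here), not the code from the repository above.

```python
from collections import Counter

def train_bpe(text: str, num_merges: int):
    """Toy BPE trainer: repeatedly merge the most frequent adjacent pair.
    A simplified sketch for illustration only."""
    ids = list(text.encode("utf-8"))  # start from raw bytes, as byte-level LLM tokenizers do
    merges = {}                       # (left_id, right_id) -> new token id
    next_id = 256                     # ids 0..255 are reserved for raw bytes
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges[pair] = next_id
        # replace every occurrence of the pair with the new token id
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
        next_id += 1
    return ids, merges

# classic example: "aaabdaaabac" compresses as merges accumulate
ids, merges = train_bpe("aaabdaaabac", 2)
```

After two merges, the 11-byte input is represented by 7 tokens; the `merges` table is what a tokenizer would later replay to encode new text.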
Making large AI models cheaper, faster and more accessible
LLaMa Tuning with Stanford Alpaca Dataset using Deepspeed and Transformers
A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com
newspaper3k is a news, full-text, and article metadata extraction library in Python 3.
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Instruct-tune LLaMA on consumer hardware
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…