😀
May reply slowly.
  • Tsinghua University
  • Beijing, China

Tools for merging pretrained large language models.

Python 4,640 420 Updated Oct 10, 2024
Jupyter Notebook 88 6 Updated Sep 30, 2024

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

Python 688 35 Updated Sep 24, 2024

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 716 83 Updated Oct 12, 2024

OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training

Python 251 18 Updated Sep 26, 2024

Lightning Training strategy for HiveMind

Python 16 5 Updated Oct 8, 2024

Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.

Python 2,007 162 Updated Jul 16, 2024

Torch Distributed Experimental

Python 116 31 Updated Aug 5, 2024
Python 1,607 139 Updated Sep 27, 2024

The official Meta Llama 3 GitHub site

Python 26,606 3,013 Updated Aug 12, 2024
Python 210 15 Updated Apr 10, 2024

DSIR large-scale data selection framework for language model training

Python 223 19 Updated Apr 7, 2024

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Python 342 18 Updated Aug 19, 2024

This is a list of peer-reviewed representative papers on deep learning dynamics (optimization dynamics of neural networks). The success of deep learning attributes to both network architecture and …

243 23 Updated Apr 10, 2024
Python 7,103 549 Updated Aug 12, 2024

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch

Python 462 27 Updated Aug 15, 2024

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,090 843 Updated Jul 1, 2024

Making large AI models cheaper, faster and more accessible

Python 38,724 4,341 Updated Oct 11, 2024

leaked prompts of GPTs

28,501 3,851 Updated Sep 27, 2024
Python 8 Updated Aug 21, 2023

LLaMa Tuning with Stanford Alpaca Dataset using Deepspeed and Transformers

Python 51 7 Updated Mar 15, 2023

A Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. www.jionlp.com

Python 3,307 400 Updated Oct 8, 2024

newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:

Python 14,105 2,113 Updated Jul 23, 2024

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 2,680 168 Updated Oct 12, 2024
Python 936 130 Updated Oct 10, 2024

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 18,583 2,214 Updated Jul 29, 2024

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook 1,469 234 Updated Oct 9, 2024

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…

Python 1,214 111 Updated Apr 3, 2024