Starred repositories
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
A fast communication-overlapping library for tensor parallelism on GPUs.
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…
A state-of-the-art open visual language model | multimodal pre-trained model
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
A work in progress: writing about the interesting and necessary pieces in the current development of LLMs and generative AI, with more topics added gradually.
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
Fast and memory-efficient exact attention
A high-throughput and memory-efficient inference and serving engine for LLMs
Transformer related optimization, including BERT, GPT
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A practical interactive interface for LLMs such as GPT/GLM, specially optimized for paper reading/polishing/writing. Modular design with support for custom shortcut buttons & function plugins; project analysis & self-translation for Python, C++, and other codebases; PDF/LaTeX paper translation & summarization; parallel queries across multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
MLNLP: This repository is a collection of papers from top AI conferences (e.g. ACL, EMNLP, NAACL, COLING, AAAI, IJCAI, ICLR, NeurIPS, and ICML) with open-source code
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.