Starred repositories

FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation

Python 44 2 Updated Jul 11, 2024

A fast communication-overlapping library for tensor parallelism on GPUs.

C++ 196 13 Updated Sep 18, 2024

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 573 45 Updated Sep 4, 2024

The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction

Python 361 26 Updated Jul 9, 2024

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,897 406 Updated Sep 6, 2024

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,110 64 Updated Feb 14, 2024

Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…

Python 1,212 111 Updated Apr 3, 2024

A state-of-the-art open visual language model | multimodal pre-trained model

Python 5,904 405 Updated May 29, 2024

Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

Python 700 91 Updated Aug 20, 2024

A work in progress. Trying to write about all interesting or necessary pieces in the current development of LLMs and generative AI. Gradually adding more topics.

Jupyter Notebook 184 10 Updated Sep 14, 2023

Python 56 15 Updated Sep 26, 2024

A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer

C++ 84 21 Updated Feb 28, 2024

Fast and memory-efficient exact attention

Python 13,566 1,243 Updated Sep 28, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 27,526 4,050 Updated Sep 28, 2024

Transformer-related optimization, including BERT, GPT

C++ 5,798 886 Updated Mar 27, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 34,898 4,056 Updated Sep 27, 2024

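Provides a practical interactive interface for large language models such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons and function plugins; supports project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…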

Python 64,332 7,962 Updated Sep 23, 2024

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Python 167,001 44,175 Updated Sep 28, 2024

Inference code for Llama models

Python 55,693 9,498 Updated Aug 18, 2024

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,673 668 Updated Jan 14, 2024

MLNLP: This repository is a collection of papers from top AI conferences (e.g. ACL, EMNLP, NAACL, COLING, AAAI, IJCAI, ICLR, NeurIPS, and ICML) with open-source code

2,584 600 Updated Oct 18, 2022

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Python 22,561 3,615 Updated Jul 28, 2024

A Python-based open-source framework for developing quantitative trading platforms

Python 24,546 8,608 Updated Sep 15, 2024

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 132,671 26,437 Updated Sep 27, 2024