niyunsheng
🤡 Talk is cheap. Show me the code.

Starred repositories

The official GitHub page for the survey paper "A Survey of Large Language Models".

Python 10,115 797 Updated Aug 20, 2024

Ongoing research training transformer models at scale

Python 10,148 2,282 Updated Oct 1, 2024

arxiv preprint: https://arxiv.org/abs/2405.07542

C++ 5 Updated Jul 29, 2024

Implementation of Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting

Python 41 5 Updated Jun 26, 2024

📰 Must-read papers and blogs on Speculative Decoding ⚡️

374 14 Updated Sep 26, 2024

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Python 166 16 Updated May 29, 2024

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 8,125 1,459 Updated Oct 2, 2024

scalable and robust tree-based speculative decoding algorithm

Python 305 31 Updated Aug 13, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,312 931 Updated Oct 1, 2024

[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”

Python 114 6 Updated Jul 8, 2024

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 5,561 507 Updated Sep 26, 2024

maximal update parametrization (µP)

Jupyter Notebook 1,364 93 Updated Jul 17, 2024

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

Python 780 80 Updated Sep 27, 2024

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,111 65 Updated Feb 14, 2024

An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"

Python 1,160 100 Updated Oct 22, 2023

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Python 2,495 447 Updated Sep 30, 2024

General technology for enabling AI capabilities w/ LLMs and MLLMs

Python 3,591 273 Updated Sep 30, 2024

Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**

Jupyter Notebook 131 8 Updated May 24, 2024

REST: Retrieval-Based Speculative Decoding, NAACL 2024

C 163 10 Updated Sep 25, 2024

Simple Dynamic Batching Inference

Python 145 17 Updated Mar 8, 2022

Supercharge Your Model Training

Python 5,125 415 Updated Oct 1, 2024

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Python 544 42 Updated Mar 4, 2024

FC game emulator (supports FC (NES), GG, and GBC)

C++ 88 38 Updated Jun 24, 2021

Official release of InternLM2.5 base and chat models. 1M context support

Python 6,289 441 Updated Sep 6, 2024

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,229 153 Updated Jun 25, 2024

Fast and memory-efficient exact attention

Python 13,612 1,246 Updated Oct 2, 2024

LLM inference in C/C++

C++ 65,707 9,432 Updated Oct 2, 2024

An easy to use PyTorch to TensorRT converter

Python 4,575 675 Updated Aug 17, 2024

Large Language Model Text Generation Inference

Python 8,848 1,036 Updated Oct 2, 2024

Zhejiang University course guide sharing project

HTML 37,065 9,416 Updated Sep 29, 2024