- Chinese Academy of Sciences
- Beijing
- https://www.yichen.ink/
Highlights
- Pro
Stars
Official PyTorch Implementation of EMoE: Unlocking Emergent Modularity in Large Language Models [main conference @ NAACL2024]
Continual Learning of Large Language Models: A Comprehensive Survey
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
Honkai: Star Rail auto bot (supports Simplified Chinese, Traditional Chinese, English, and Spanish)
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
An awesome GPU task scheduler: a lightweight, easy-to-use tool for scheduling jobs on GPU clusters. Star it if you find it useful.
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
LaTeX Proposal Template for the University of Chinese Academy of Sciences
LaTeX Thesis Template for the University of Chinese Academy of Sciences
[TMLR 2024] Efficient Large Language Models: A Survey
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
MiniSora: A community project that aims to explore the implementation path and future development direction of Sora.
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters.
Notebooks for training universal 0-shot classifiers on many different tasks
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
fastHan is a Chinese NLP toolkit built on fastNLP and PyTorch, as convenient to call as spaCy.
Codebase for Merging Language Models (ICML 2024)
Pretrained language models and related optimization techniques developed by Huawei Noah's Ark Lab.
Code and checkpoints for the paper "Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules"
Framework for BLOOM probing
A curated list of neural network pruning resources.