Tencent
- Beijing, China
Stars
Recipes to train reward models for RLHF.
A fast inference library for running LLMs locally on modern consumer-class GPUs
How to optimize algorithms in CUDA.
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
CVPR and NeurIPS poster examples and templates. May we have in-person poster sessions soon!
A high-throughput and memory-efficient inference and serving engine for LLMs
A large-scale, fine-grained, diverse preference dataset (and models).
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search…
GLM-4 series: Open Multilingual Multimodal Chat LMs (open-source multilingual multimodal dialogue models)
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
"A Guide to Consuming Open-Source LLMs": quickly deploy open-source large models in a Linux environment; a deployment tutorial tailored for Chinese beginners.
Code for the paper Fine-Tuning Language Models from Human Preferences
A library for efficient similarity search and clustering of dense vectors.
🦜🔗 Build context-aware reasoning applications
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Chinese Llama phase-3 project (Chinese Llama-3 LLMs) developed from Meta Llama 3
Official code of Remote Sensing Mamba
sunkx109/llama
Forked from meta-llama/llama. Inference code for LLaMA models.
Python code for handling the Clotho dataset.
Source code for the paper 'Audio Captioning Transformer'
Semantic segmentation in PyTorch.