Stars
General technology for enabling AI capabilities w/ LLMs and MLLMs
[ACL 2024] Progressive LLaMA with Block Expansion.
Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension
🩺 The first Chinese multimodal medical LLM that can read chest X-rays and summarize chest radiographs.
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
CMMLU: Measuring massive multitask language understanding in Chinese
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
Alpaca dataset from Stanford, cleaned and curated
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer language models at scale, including: BERT & GPT-2
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
A PyTorch extension: tools for easy mixed-precision and distributed training in PyTorch
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Ongoing research training transformer models at scale
FastAPI framework, high performance, easy to learn, fast to code, ready for production
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
Official code for ICLR 2022 paper: "PoNet: Pooling Network for Efficient Token Mixing in Long Sequences".
EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models
This repo contains our ACL 2017 paper data and source code
A wide variety of research projects developed by the SpokenNLP team of Speech Lab, Alibaba Group.
A repository for individuals to experiment with and reproduce the LLM pre-training process.
Zstandard - Fast real-time compression algorithm
The official repo of Pai-Megatron-Patch, developed by Alibaba Cloud for large-scale LLM & VLM training.