Starred repositories
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
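For illustration, a minimal sketch of wrapping a plain PyTorch model with DeepSpeed's engine; the config values below are illustrative assumptions, not a tuned recipe:

```python
# Minimal DeepSpeed sketch: deepspeed.initialize wraps a plain PyTorch model
# in a distributed training engine (typically launched via the `deepspeed` CLI).
import torch
import deepspeed

model = torch.nn.Linear(512, 512)  # any torch.nn.Module

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},                # assumes a CUDA GPU
    "zero_optimization": {"stage": 2},        # ZeRO-2: shard optimizer state + gradients
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# Returns the wrapped engine plus the optimizer DeepSpeed built from the config.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

# Training step: the engine handles loss scaling, gradient accumulation, etc.
# x = batch.to(model_engine.device)
# loss = model_engine(x).pow(2).mean()
# model_engine.backward(loss)
# model_engine.step()
```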
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
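As a sketch of the idea (not the repo's actual code): BPE repeatedly merges the most frequent adjacent pair of tokens into a new token.

```python
# Toy BPE sketch: start from raw bytes, then repeatedly merge the most
# frequent adjacent pair of ids into a fresh id.
from collections import Counter

def get_pair_counts(ids):
    """Count occurrences of each adjacent pair in the id sequence."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

ids = list("aaabdaaabac".encode("utf-8"))  # byte-level start, as LLM tokenizers do
for new_id in range(256, 256 + 3):         # 3 merges for the toy example
    counts = get_pair_counts(ids)
    pair = max(counts, key=counts.get)
    ids = merge(ids, pair, new_id)
```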
A llama3 implementation, one matrix multiplication at a time
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
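A minimal Accelerate sketch: `prepare` moves the model, optimizer, and data to whatever device/distributed configuration `accelerate launch` set up, so the loop itself stays plain PyTorch.

```python
# Minimal 🤗 Accelerate training loop sketch.
import torch
from accelerate import Accelerator

accelerator = Accelerator()  # picks up device / DDP / mixed-precision config

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataloader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=8,
)

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for x, y in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    accelerator.backward(loss)  # replaces loss.backward() so AMP/DDP work
    optimizer.step()
```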
Fast and memory-efficient exact attention
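A minimal call sketch, assuming the flash-attn 2.x Python API on a CUDA GPU:

```python
# Sketch of calling FlashAttention directly. Inputs must be fp16/bf16 tensors
# of shape [batch, seqlen, nheads, headdim] on a supported CUDA GPU.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 8, 64
q = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Exact attention, computed without materializing the seqlen x seqlen matrix.
out = flash_attn_func(q, k, v, causal=True)  # [batch, seqlen, nheads, headdim]
```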
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
An easy-to-use library for skin tone classification
Utilities intended for use with Llama models.
Firefly: a training tool for large language models, supporting Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
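A tiktoken round trip: encode text to BPE token ids and decode it back.

```python
# tiktoken round trip with a named encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by gpt-4 / gpt-3.5-turbo
ids = enc.encode("tiktoken is a fast BPE tokeniser")
assert enc.decode(ids) == "tiktoken is a fast BPE tokeniser"
print(len(ids), ids)
```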
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
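A sketch of training a byte-pair tokenizer from scratch with 🤗 Tokenizers; `corpus.txt` is a placeholder for your own corpus files.

```python
# Train a BPE tokenizer from scratch, then encode a sentence.
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer
from tokenizers.pre_tokenizers import Whitespace

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

trainer = BpeTrainer(vocab_size=30000, special_tokens=["[UNK]", "[PAD]"])
tokenizer.train(files=["corpus.txt"], trainer=trainer)  # hypothetical corpus file

encoding = tokenizer.encode("Fast state-of-the-art tokenizers")
print(encoding.tokens, encoding.ids)
```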
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
GLM-4 series: Open Multilingual Multimodal Chat LMs
ChatGLM3 series: Open Bilingual Chat LLMs
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Fine-tuning of ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B for specific downstream tasks, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
The official Python client for the Hugging Face Hub.
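For example, downloading a single file from a repo on the Hub; the repo and filename here are just examples.

```python
# huggingface_hub sketch: fetch (and cache) one file from a Hub repo.
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="gpt2", filename="config.json")
print(path)  # local path inside the Hugging Face cache
```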
YOLOv10: Real-Time End-to-End Object Detection
The official GitHub page for the survey paper "A Survey of Large Language Models".
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models