Starred repositories
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
How to learn PyTorch and OneFlow
How to optimize some algorithms in CUDA.
《Machine Learning Systems: Design and Implementation》- Chinese Version
Curated list of resources on testing distributed systems
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
Pax is a JAX-based machine learning framework for training large-scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry lead…
Named Tensors for Legible Deep Learning in JAX
Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
A simple, performant and scalable Jax LLM!
Two implementations of ZeRO-1 optimizer sharding in JAX
JAX - A curated list of resources https://github.com/google/jax
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes
Model parallel transformers in JAX and Haiku
A high-throughput and memory-efficient inference and serving engine for LLMs
Simple, light-weight and easy-to-use asynchronous components
A course on building a distributed key-value service based on the TiKV model
Alluxio, data orchestration for analytics and machine learning in the cloud