Stars
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
LLM training code for Databricks foundation models
Implementation of "SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks"
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…
General technology for enabling AI capabilities w/ LLMs and MLLMs
🦜🔗 Build context-aware reasoning applications
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
Code and documentation to train Stanford's Alpaca models, and generate the data.
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI
Visualizer for neural network, deep learning and machine learning models
Library for reading and writing large multi-dimensional arrays.
NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate terabyte-scale datasets used to train deep-learning-based recommender systems.
PerfKit Benchmarker (PKB) contains a set of benchmarks to measure and compare cloud offerings. The benchmarks use default settings to reflect what most users will see. PerfKit Benchmarker is licens…
Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs
Large batch training of CTR models based on DeepCTR with CowClip.
A tensor-aware point-to-point communication primitive for machine learning
An industrial deep learning framework for high-dimension sparse data
Development repository for the Triton language and compiler