Stars
The official repo of Qwen (通义千问), the chat and pretrained large language model proposed by Alibaba Cloud.
DSPy: The framework for programming—not prompting—foundation models
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.
Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
Example models using DeepSpeed
A comprehensive paper list on Vision Transformers/Attention, including papers, code, and related websites
Source code for Twitter's Recommendation Algorithm
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
PyTorch package for the discrete VAE used for DALL·E.
CLIP (Contrastive Language-Image Pretraining): predicts the most relevant text snippet given an image
Implementation of Graph Convolutional Networks in TensorFlow
Kaggle: Quora Question Pairs, 4th place of 3,396 teams (https://www.kaggle.com/c/quora-question-pairs)
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
TensorFlow code and pre-trained models for BERT
C++11 implementation of Socket.IO client
Softlearning is a reinforcement learning framework for training maximum-entropy policies in continuous domains. It includes the official implementation of the Soft Actor-Critic algorithm.