Starred repositories
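MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.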
Paper Writing Tips: a repository curated by the MLNLP community to help authors avoid small mistakes in paper submissions.
Enhance Your English Writing for Science Research: English writing material for papers.
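A curated collection of open-source Chinese large language models, focusing on models that are relatively small, can be deployed privately, and are cheaper to train. Covers base models, domain-specific fine-tuning and applications, datasets, and tutorials.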
TryOnDiffusion: A Tale of Two UNets Implementation
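A collection of industry practice articles on search, recommendation, advertising, user growth, and related topics (sources: Zhihu, Datafuntalk, technical WeChat official accounts).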
Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
Official implementation code of the paper "AnyText: Multilingual Visual Text Generation And Editing"
High-Resolution Image Synthesis with Latent Diffusion Models
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning
A programmer's guide to cooking at home (Simplified Chinese only).
EasyTransfer is designed to make the development of transfer learning in NLP applications easier.
Source code for AAAI 2019 paper "Hyperbolic Heterogeneous Information Network Embedding"
An elegant PyTorch deep reinforcement learning library.
A curated list of awesome self-supervised methods
Computational advertising / recommender systems / machine learning / click-through rate (CTR) and conversion rate (CVR) prediction
Pre-trained Chinese ELECTRA models
A curated list of visual relationship detection and related area resources
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
Deep Reinforcement Learning Lab, a platform designed to make DRL technology fun for everyone
Rethinking the Value of Network Pruning (PyTorch) (ICLR 2019)
A curated list of image captioning and related area resources. :-)
Original CTR prediction papers, shared together with personal study notes
This is the code for our ACL 2019 paper "Open Domain Event Extraction Using Neural Latent Variable Models"