Stars
DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models
[NeurIPS 2024] Can LLMs Learn by Teaching? A Preliminary Study
DQNSuite is a revolutionary tool that brings the power of Reinforcement Learning models into the palm of the user's hand.
Create an open-source toy dataset for fine-tuning LLMs with reasoning abilities
Keeping my personal experiments separate from the main repo
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.