Starred repositories
Train transformer language models with reinforcement learning.
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
DSPy: The framework for programming—not prompting—foundation models
Reference implementation for DPO (Direct Preference Optimization)
bdok23 / puppersim
Forked from jietan/puppersim
Simulation for DJI Pupper v2 robot