Stars
A quick guide, especially for trending instruction finetuning datasets
RewardBench: the first evaluation tool for reward models.
Reading notes on a wide range of research topics: domain generalization, domain adaptation, causality, robustness, prompting, optimization, and generative models
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Recipes to train reward models for RLHF.
A recipe for online RLHF and online iterative DPO.
An index of algorithms for reinforcement learning from human feedback (RLHF)
Awesome papers in LLM interpretability
Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizXgXU)
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
A high-throughput and memory-efficient inference and serving engine for LLMs
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
🦜🔗 Build context-aware reasoning applications
Collection of training data management explorations for large language models
Code and documentation to train Stanford's Alpaca models, and generate the data.
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…
Let ChatGPT teach your own chatbot in hours with a single GPU!
The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
The official repository for the paper "LLMaAA: Making Large Language Models as Active Annotators"
The official Python library for the OpenAI API
An unofficial Python package that returns responses from Google Bard via a cookie value.
A series of large language models developed by Baichuan Intelligent Technology