RLHF Training for LLMs

This repository contains implementations for Reinforcement Learning from Human Feedback (RLHF) training of Large Language Models (LLMs) using Supervised Fine-Tuning (SFT), Reward Modeling, and Proximal Policy Optimization (PPO). The goal is a modular, maintainable codebase for replicating RLHF training on LLMs such as LLaMA. The codebase targets LLaMA 2 specifically: while most components are model-agnostic, the data-related components (such as special-token formatting) must be adapted for other models.

Table of Contents

  1. Installation
  2. Usage
  3. Configuration
  4. Evaluation
  5. Inference

Installation

  1. Clone the repository:

    git clone https://github.com/lightmatmul/rlhf_training.git
    cd rlhf_training
  2. Create and activate a virtual environment:

    python -m venv env
    source env/bin/activate  # On Windows, use `env\Scripts\activate`
  3. Install the required packages:

    pip install -r requirements.txt

Usage

Supervised Fine-Tuning

To train a model using Supervised Fine-Tuning (SFT), run the following script:

python scripts/train_sft.py
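
Conceptually, the SFT stage fine-tunes the LLaMA 2 base model on formatted prompt/response text using LoRA adapters. The sketch below is illustrative only, not the repo's exact script: the model name, placeholder data, target modules, and hyperparameters are assumptions.

    # Minimal SFT sketch (not the repo's exact script): LoRA fine-tuning of a
    # LLaMA 2 base model on already-formatted prompt/response text.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_name = "meta-llama/Llama-2-7b-hf"           # assumed base model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
    lora_config = LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],          # assumed adapter targets
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)

    # Placeholder data: strings already wrapped in LLaMA 2 chat formatting.
    texts = ["[INST] What does RLHF stand for? [/INST] Reinforcement learning from human feedback."]
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)
    labels = batch["input_ids"].masked_fill(batch["attention_mask"] == 0, -100)  # ignore padding

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
    model.train()
    outputs = model(**batch, labels=labels)           # standard causal-LM loss
    outputs.loss.backward()
    optimizer.step()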

Reward Modeling

To train a reward model, run the following script:

python scripts/train_reward.py

Proximal Policy Optimization

To train a model using Proximal Policy Optimization (PPO), run the following script:

python scripts/train_ppo.py
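
PPO fine-tunes the SFT policy to maximize the reward model's score while penalizing divergence from the frozen SFT reference model. The sketch below illustrates only the core objective (clipped surrogate loss plus a KL-shaped reward) with toy tensors; it is a conceptual example, not the repo's training loop.

    # Conceptual PPO sketch (not the repo's training loop): clipped surrogate
    # loss on response tokens, with the reward shaped by a KL penalty against
    # the frozen SFT reference policy.
    import torch

    def ppo_policy_loss(logprobs_new, logprobs_old, advantages, clip_ratio=0.2):
        """Clipped PPO surrogate loss over sampled response tokens."""
        ratio = torch.exp(logprobs_new - logprobs_old)               # pi_new / pi_old
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1 - clip_ratio, 1 + clip_ratio) * advantages
        return -torch.min(unclipped, clipped).mean()

    def kl_shaped_reward(rm_score, logprobs_policy, logprobs_reference, kl_coef=0.1):
        """Sequence reward = reward-model score minus a KL penalty to the reference."""
        kl_penalty = (logprobs_policy - logprobs_reference).sum(dim=-1)
        return rm_score - kl_coef * kl_penalty

    # Toy per-token log-probabilities for one sampled response.
    logprobs_old = torch.tensor([-1.2, -0.8, -2.0])
    logprobs_new = logprobs_old + 0.05                               # after one update
    advantages = torch.tensor([0.5, 0.1, -0.2])
    print(ppo_policy_loss(logprobs_new, logprobs_old, advantages))
    print(kl_shaped_reward(torch.tensor(1.3), logprobs_new, logprobs_old))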

Configuration

The configuration files are located in the configs/ directory. Here’s a brief description of each:

  1. lora_config.py: Contains the configuration for LoRA (Low-Rank Adaptation); a hedged example of such a config appears after this list.
  2. reward_config.py: Contains the constants and configurations specific to Reward Modeling.
  3. ppo_config.py: Contains the constants and configurations specific to PPO.
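
As a rough illustration of the kind of settings lora_config.py might hold, here is a sketch built on peft's LoraConfig; the field values are assumptions, not the repo's actual configuration.

    # Hypothetical sketch of configs/lora_config.py built on peft's LoraConfig;
    # the field values here are assumptions, not the repo's actual settings.
    from peft import LoraConfig

    lora_config = LoraConfig(
        r=8,                                  # rank of the low-rank update matrices
        lora_alpha=16,                        # scaling applied to the update
        lora_dropout=0.05,                    # dropout on the adapter layers
        target_modules=["q_proj", "v_proj"],  # LLaMA attention projections to adapt
        bias="none",
        task_type="CAUSAL_LM",
    )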

Evaluation

GPT is used as an AI evaluator to assess the impact of alignment tuning compared to the original supervised fine-tuned model:

python eval/gpt_evaluator.py
python eval/count_wins.py
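
Here, count_wins.py presumably tallies how often the aligned model is preferred by the GPT judge. As a rough illustration, a minimal tally could look like the following; the judgments file name and JSON schema are assumptions, not the repo's actual format.

    # Hypothetical tallying sketch; the judgments file name and the "winner"
    # field are assumptions about the evaluator's output, not the repo's schema.
    import json
    from collections import Counter

    with open("gpt_judgments.json") as f:       # assumed output of gpt_evaluator.py
        judgments = json.load(f)                # e.g. [{"winner": "ppo"}, {"winner": "sft"}, ...]

    counts = Counter(j["winner"] for j in judgments)
    total = sum(counts.values())
    for model, wins in counts.items():
        print(f"{model}: {wins}/{total} wins ({100 * wins / total:.1f}%)")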

Inference

To interact with the trained models, run the following script:

python scripts/inference.py
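
For context, interacting with a LoRA-tuned checkpoint typically means loading the base model, attaching the trained adapter, and generating. The sketch below assumes a hypothetical adapter path and is not the repo's exact script.

    # Illustrative inference sketch (the adapter path is hypothetical): load the
    # LLaMA 2 base model, attach the trained LoRA adapter, and generate.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_name = "meta-llama/Llama-2-7b-hf"      # assumed base model
    adapter_path = "outputs/ppo_adapter"        # hypothetical adapter directory

    tokenizer = AutoTokenizer.from_pretrained(base_name)
    base = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.bfloat16)
    model = PeftModel.from_pretrained(base, adapter_path)
    model.eval()

    prompt = "[INST] Summarize what PPO does in one sentence. [/INST]"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
    print(tokenizer.decode(output[0], skip_special_tokens=True))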
