AIRL: AI-aligned Reinforcement Learning and Dialogue Generation

This project aims to create an AI model capable of emulating Socrates' character from Plato's works using AI-aligned Reinforcement Learning (AIRL) and pre-trained generative models. We combine Low Rank Adaptation (LoRA) and Proximal Policy Optimization (PPO) with GPT-Neo-1.3B to generate realistic dialogue in virtual environments with predefined characters. The PPO-trained LoRA model generates more concise and coherent responses, but further training and refinements are needed for improved alignment with Socrates' character.

Installation and Usage

Training

Download the files in ./TrainingData/ and upload them to your Google Colab runtime.
Input your OpenAI API token in the appropriate cell.
Run the training cells as instructed in the code.

Testing

Run the dependency installation and import cells.
Provide your OpenAI API token in the appropriate cell.
Run the testing cells as instructed in the code.
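Both the training and testing notebooks need an OpenAI API token. The helper below is a minimal sketch of one way to supply it without pasting the token into a cell that might later be shared. The function name `load_openai_key` and the default variable name `OPENAI_API_KEY` are assumptions for illustration; the notebook may read the token differently.

```python
import os
from getpass import getpass


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API token from the environment, prompting only if it is missing.

    `env_var` is an assumed variable name, not something the notebook mandates.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed token so it does not end up in cell output.
        key = getpass(f"Enter your OpenAI API token ({env_var}): ").strip()
    if not key:
        raise ValueError("No OpenAI API token provided.")
    return key
```

In a Colab cell this could be called once, for example `openai_key = load_openai_key()`, and the result passed to whichever cell expects the token.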

Methodology

Our approach consists of the following steps:

Fine-tuning: Fine-tune a LoRA adapter for GPT-Neo-1.3B on a manually cleaned dataset of Plato's dialogues featuring Socrates. Custom callbacks handle version control and saving of the LoRA weights.
Pipeline Evaluator: Construct a pipeline evaluator that uses OpenAI's gpt-3.5-turbo chat model as a reward function. The chat model reasons about each response and then expresses a positive or negative judgment. A sentiment classification model converts that output into a scalar reward of 1.0, 0.0, or -1.0.
Reinforcement Learning: Use the TRL library to train a new LoRA adapter on a hybrid dataset that combines pre-generated synthetic responses with online results from the pipeline evaluator. Several strategies are explored, including single and multiple evaluators and different evaluation criteria.
Model Adaptation: LoRA lets us swap the adapted attention-layer weights efficiently depending on which character is being generated, so the writing style or "personality" of Socrates is stored in a compact 13MB adapter.
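The last stage of the pipeline evaluator, turning a sentiment classification into a scalar reward, can be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: the label names (`positive`, `neutral`, `negative`), the `min_confidence` threshold, and the function names are hypothetical. Only the three reward values 1.0, 0.0, and -1.0 come from the description above.

```python
from typing import Iterable, List, Mapping

# Assumed label names; the classifier used in the notebook may emit different ones.
LABEL_TO_REWARD: Mapping[str, float] = {
    "positive": 1.0,
    "neutral": 0.0,
    "negative": -1.0,
}


def sentiment_to_reward(label: str, score: float = 1.0,
                        min_confidence: float = 0.5) -> float:
    """Map one sentiment label to a scalar reward.

    Unknown labels and low-confidence predictions fall back to 0.0.
    The confidence threshold is an assumption added for this sketch.
    """
    if score < min_confidence:
        return 0.0
    return LABEL_TO_REWARD.get(label.strip().lower(), 0.0)


def rewards_for_batch(classifications: Iterable[Mapping[str, object]]) -> List[float]:
    """Convert a batch of {"label": ..., "score": ...} dicts into rewards."""
    return [
        sentiment_to_reward(str(c["label"]), float(c.get("score", 1.0)))
        for c in classifications
    ]
```

Falling back to 0.0 for uncertain or unrecognized outputs keeps a noisy evaluator from pushing the policy in either direction, which is one reasonable design choice when the reward comes from another model.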

Testing the Model

The testing phase involves generating responses for a set of predefined prompts using the trained model. A custom stopping criterion is used in the generator pipeline to ensure that the generated text remains coherent and concise.
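The notebook's stopping criterion is not reproduced here. As a library-independent sketch of the idea, the functions below stop or truncate a generation at the earliest occurrence of any stop sequence, for example the point where the model starts writing the next speaker's turn. The function names and the example stop strings are hypothetical.

```python
from typing import Sequence


def should_stop(text_so_far: str, stop_sequences: Sequence[str]) -> bool:
    """Return True once any non-empty stop sequence appears in the generated text."""
    return any(stop and stop in text_so_far for stop in stop_sequences)


def truncate_at_stop(text: str, stop_sequences: Sequence[str]) -> str:
    """Cut `text` at the earliest stop sequence and strip trailing whitespace."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # an empty stop string would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


# Example with assumed stop strings: a new speaker label or a blank line.
stops = ["\nStudent:", "\n\n"]
reply = truncate_at_stop("Know thyself.\nStudent: But how?", stops)
```

With a Hugging Face generation pipeline, the same check would typically be wrapped in a `StoppingCriteria` subclass that decodes the tokens generated so far and applies `should_stop` to them.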

Future Work

Further refinements and training are needed to improve alignment with Socrates' character. This project serves as a foundation for the development of advanced conversational AI systems using reinforcement learning techniques.
