This repository contains the code for LoRA+, introduced in [LoRA+: Efficient Low Rank Adaptation of Large Models](https://arxiv.org/abs/2402.12354).
LoRA+ introduces one new required hyperparameter to your optimizer (and another optional hyperparameter).
- `loraplus_lr_ratio`: the ratio of learning rates $\eta_B / \eta_A$, where $\eta_A$ is the optimizer learning rate. Set this to a value greater than 1 for best performance. See the paper for more information.
- `loraplus_lr_embedding`: (optional) if LoRA modules are added to embedding layers, you can specify a different learning rate for them. Default value: `1e-6`.
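To make the ratio concrete, here is a minimal illustrative sketch of the parameter grouping that LoRA+ amounts to. This is not the repository's implementation (`create_loraplus_optimizer` below handles the grouping for you), and the `lora_A`/`lora_B` parameter names assume the usual PEFT naming convention:

```python
import torch

eta_A = 5e-5              # base learning rate, used for the LoRA A matrices
loraplus_lr_ratio = 20.0  # eta_B / eta_A

# "model" is a placeholder for a PEFT/LoRA model whose adapter weights
# are named "lora_A" / "lora_B" (the usual PEFT convention).
param_groups = [
    {"params": [p for n, p in model.named_parameters() if "lora_A" in n], "lr": eta_A},
    {"params": [p for n, p in model.named_parameters() if "lora_B" in n], "lr": eta_A * loraplus_lr_ratio},
]
optimizer = torch.optim.AdamW(param_groups)
```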
Integrating LoRA+ into a finetuning project that uses the huggingface `Trainer` is straightforward: just replace the `Trainer` in your project with the `LoraPlusTrainer` in `loraplus.py` and pass the LoRA+ arguments above to its `TrainingArguments`.
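A minimal sketch of what this swap might look like. The argument-class name (`LoraPlusTrainingArguments`) and the exact placement of the LoRA+ fields are assumptions here; check `loraplus.py` for the actual signatures:

```python
from loraplus import LoraPlusTrainer, LoraPlusTrainingArguments  # assumed names; see loraplus.py

# model, train_dataset, and tokenizer are placeholders for your LoRA model and data.
training_args = LoraPlusTrainingArguments(
    output_dir="outputs",
    learning_rate=5e-5,          # eta_A
    loraplus_lr_ratio=20.0,      # eta_B / eta_A
    loraplus_lr_embedding=1e-6,  # optional, only if LoRA is applied to embeddings
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

trainer = LoraPlusTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```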
For training with LoRA+ without the `Trainer`, you can use an optimizer created with the `create_loraplus_optimizer` function in `loraplus.py`. This function wraps an optimizer class and sets the learning rates of your model parameters appropriately for the optimizer:
```python
import torch
from loraplus import create_loraplus_optimizer

# LoRA model
model = ...

# Base optimizer class and its keyword arguments; 'lr' here is eta_A
optimizer_cls = torch.optim.AdamW
optimizer_kwargs = {'lr': 5e-5, 'eps': 1e-6, 'betas': (0.9, 0.999), 'weight_decay': 0.0}
loraplus_lr_ratio = 20.0  # eta_B / eta_A

optimizer = create_loraplus_optimizer(model, optimizer_cls, optimizer_kwargs, loraplus_lr_ratio)
```
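The returned optimizer is a standard PyTorch optimizer, so it can drive an ordinary training loop. A minimal sketch, assuming a HuggingFace-style model whose forward pass returns an object with a `.loss`:

```python
model.train()
for batch in train_dataloader:   # your DataLoader of tokenized batches
    outputs = model(**batch)     # assumes the model returns an object with a .loss attribute
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```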
In the `glue/` folder we have code for finetuning models on GLUE using LoRA+.
To run the code, first install the requirements:

```bash
pip install -r requirements.txt
```
Download the GLUE tasks, e.g.:

```bash
python download_glue.py --task_names mnli,qqp --datadir data/
```
Check the `scripts/` folder for finetuning examples with gpt2, roberta-base, and llama-7b.
This code is part of the LoRA+ project. For citations, please use:

```bibtex
@article{hayou2024loraplus,
  title={LoRA+: Efficient Low Rank Adaptation of Large Models},
  author={Soufiane Hayou and Nikhil Ghosh and Bin Yu},
  journal={arXiv preprint arXiv:2402.12354},
  year={2024}
}
```