This repository provides an overview of all components from the paper Phased Instruction Fine-Tuning for Large Language Models, ACL 2024 Findings.
```bibtex
@inproceedings{PhasedSFT,
  author    = {Wei Pang and Chuan Zhou and Xiao-Hua Zhou and Xiaojie Wang},
  title     = {Phased Instruction Fine-Tuning for Large Language Models},
  booktitle = {ACL Findings},
  year      = {2024},
  pages     = {},
}
```
```bash
bash run.sh
bash stopall.sh
```
### 1. generation dir: running inference on the 'oasst', 'anthropic', 'koala', 'vicuna', 'sinstruct', and 'wizardlm' test sets

```bash
bash evaluation.sh
```
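As a rough illustration of what the generation step produces, here is a minimal sketch of batch inference with Hugging Face transformers. The checkpoint path, prompt file name, and output format are assumptions for illustration, not the repository's actual interface.

```python
# Minimal sketch of inference over one evaluation set (hypothetical paths and fields).
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "checkpoints/stage3"          # assumed checkpoint location
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

with open("eval_sets/vicuna.jsonl") as f:  # assumed prompt file
    prompts = [json.loads(line)["instruction"] for line in f]

outputs = []
for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    # Strip the prompt tokens so only the generated answer remains.
    answer = tokenizer.decode(ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    outputs.append({"instruction": prompt, "response": answer})

with open("generation_outputs/vicuna.jsonl", "w") as f:
    for item in outputs:
        f.write(json.dumps(item) + "\n")
```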
### 2. evaluation dir: scoring with gpt-4-0613 and then calculating the Win-Rate metric

```bash
bash run_gpt4_scoring.sh
bash run_win_rate.sh
```
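The win-rate step aggregates GPT-4 judgments of model answers against reference answers. The sketch below shows one common way to compute such a metric (wins plus half of ties over the total); the field names and file layout are assumptions, not necessarily those used by run_win_rate.sh.

```python
# Hypothetical win-rate aggregation: win = 1, tie = 0.5, loss = 0.
import json

def win_rate(judgement_file):
    wins = ties = total = 0
    with open(judgement_file) as f:
        for line in f:
            verdict = json.loads(line)["verdict"]  # assumed field: "win" | "tie" | "loss"
            total += 1
            if verdict == "win":
                wins += 1
            elif verdict == "tie":
                ties += 1
    return (wins + 0.5 * ties) / total if total else 0.0

print(win_rate("evaluation/gpt4_judgements.jsonl"))  # assumed output of the scoring step
```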
### 3. scripts dir: training scripts
### 4. xllm dir: training code and dataloader
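The core idea the training code implements is sequential fine-tuning over staged sub-datasets of increasing difficulty. Below is a minimal sketch using the Hugging Face Trainer; the base model name, data paths, prompt formatting, and hyperparameters are assumptions for illustration, not the repository's actual configuration.

```python
# Minimal sketch of phased (stage-by-stage) instruction fine-tuning (assumed paths/settings).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"          # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

def tokenize(batch):
    # Concatenate instruction and output into one training text (simplified formatting).
    texts = [i + "\n" + o for i, o in zip(batch["instruction"], batch["output"])]
    return tokenizer(texts, truncation=True, max_length=1024)

# Train on the three sub-datasets in order of increasing difficulty,
# carrying the same model from one stage to the next.
for stage in (1, 2, 3):
    dataset = load_dataset("json", data_files=f"data/stage{stage}.jsonl", split="train")
    dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=f"checkpoints/stage{stage}",
                               num_train_epochs=1, per_device_train_batch_size=4),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
```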
The instruction difficulty within the Alpaca and Alpaca-cleaned data is quantitatively assessed by GPT-4, which assigns scores from 1 to 5, with higher scores denoting greater complexity.

Alpaca-scored: the Alpaca 52k dataset scored by gpt-4-0613 and then split into three stages of increasing difficulty.

Alpaca-clean-scored: the Alpaca-clean 52k dataset, also scored by gpt-4-0613.
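To make the staging concrete, here is a minimal sketch of how a GPT-4-scored instruction file could be split into three sub-datasets of increasing difficulty. The score thresholds, field name, and file paths are assumptions for illustration, not the exact splits shipped in Alpaca-scored.

```python
# Hypothetical split of a GPT-4-scored dataset (scores 1-5) into three phased stages.
import json

def split_into_stages(scored_file, boundaries=(2, 4)):
    """Stage 1: score <= 2, Stage 2: 2 < score <= 4, Stage 3: score > 4 (assumed thresholds)."""
    stages = {1: [], 2: [], 3: []}
    with open(scored_file) as f:
        for line in f:
            example = json.loads(line)
            score = example["difficulty"]      # assumed field holding the GPT-4 score
            if score <= boundaries[0]:
                stages[1].append(example)
            elif score <= boundaries[1]:
                stages[2].append(example)
            else:
                stages[3].append(example)
    return stages

stages = split_into_stages("data/alpaca_scored.jsonl")
for idx, examples in stages.items():
    with open(f"data/stage{idx}.jsonl", "w") as out:
        out.writelines(json.dumps(e) + "\n" for e in examples)
```

Phased fine-tuning then trains on stage 1, stage 2, and stage 3 in order, so the model sees progressively harder instructions; the thresholds above are placeholders, not the paper's exact boundaries.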
The figure above summarizes the paper: as uptraining progresses, the win rate of five LLMs trained on multi-stage sub-datasets of increasing difficulty grows steadily (solid lines), in stark contrast to the win-rate trend of the same five LLMs trained on multi-stage sub-datasets with randomly distributed difficulty (dotted lines).