
halu_control

Methods for controlling hallucinations of LLMs in summarization.

Benchmark settings

| Model | Strategy | Consistency Rate (%) | Answer Rate (%) | Average Length |
|---|---|---|---|---|
| Mistral-7B-Instruct-v0.1 | Greedy | 93.2 | 100.0 | 93.5 |
| Mistral-7B-Instruct-v0.1 | num_beam = 10 | 95.3 | 100.0 | 127.7 |
| Mistral-7B-Instruct-v0.1 | Greedy + DoLa | 93.7 | 100.0 | 93.6 |
| Mistral-7B-Instruct-v0.1 | Greedy + DPO (LoRA) | 95.8 | 100.0 | 97.0 |
| Mistral-7B-Instruct-v0.1 | Greedy + Fava | 93.7 | 100.0 | 93.3 |
| Mistral-7B-Instruct-v0.1 | DPO (LoRA) + num_beam = 10 | 96.9 | 100.0 | 123.7 |
| Mistral-7B-Instruct-v0.1 | Best_of_N + Temperature = 0.7 + n = 10 | 99.3 | 100.0 | 89.6 |

Note: the prompt differs slightly from the original HHEM leaderboard benchmark, so the numbers are not directly comparable.

How to reproduce the experiments

  1. Download the leaderboard dataset (https://huggingface.co/spaces/vectara/leaderboard/raw/main/src/datasets/leaderboard_dataset.csv)
  2. Generate the model responses into generated.csv (see the methods below; a minimal greedy-generation sketch follows this list)
  3. Run the evaluation on the response file:
python -c "from leaderboard import run_eval;run_eval('generated.csv')"
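
For step 2, the sketch below shows one way to produce generated.csv with plain greedy decoding. The column names and prompt wording are assumptions; match them to leaderboard_dataset.csv and to whatever leaderboard.run_eval expects.

```python
# Minimal sketch of step 2: greedy summarization with Mistral-7B-Instruct-v0.1.
# Column names ("source_doc", "summary") and the prompt wording are assumptions;
# adjust them to the actual dataset schema and to what leaderboard.run_eval expects.
import pandas as pd
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

df = pd.read_csv("leaderboard_dataset.csv")
summaries = []
for passage in df["source_doc"]:  # assumed column name
    messages = [{"role": "user",
                 "content": f"Summarize the following passage, using only the information it contains:\n\n{passage}"}]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                           return_tensors="pt").to(model.device)
    out = model.generate(inputs, max_new_tokens=256, do_sample=False)  # greedy decoding
    summaries.append(tokenizer.decode(out[0, inputs.shape[1]:], skip_special_tokens=True).strip())

df["summary"] = summaries  # assumed output column name
df.to_csv("generated.csv", index=False)
```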

Methods

Baselines

  1. Greedy/Beam Search
  2. Best-of-N sampling (a reranking sketch is shown below)
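
For best-of-N sampling, one straightforward implementation is to draw N candidate summaries at temperature 0.7 and keep the candidate that the HHEM consistency model ranks highest, as sketched below. The prompt wording and the CrossEncoder-style HHEM call are assumptions; check the vectara/hallucination_evaluation_model card for the exact scoring API.

```python
# Minimal sketch of best-of-N sampling reranked by the HHEM consistency model.
# The CrossEncoder-style HHEM usage is an assumption; see the model card for the
# exact scoring interface of vectara/hallucination_evaluation_model.
import torch
from sentence_transformers import CrossEncoder
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
hhem = CrossEncoder("vectara/hallucination_evaluation_model")

def best_of_n_summary(passage: str, n: int = 10) -> str:
    messages = [{"role": "user",
                 "content": f"Summarize the following passage, using only the information it contains:\n\n{passage}"}]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                           return_tensors="pt").to(model.device)
    out = model.generate(inputs, max_new_tokens=256, do_sample=True,
                         temperature=0.7, num_return_sequences=n)
    candidates = [tokenizer.decode(seq[inputs.shape[1]:], skip_special_tokens=True).strip()
                  for seq in out]
    # HHEM scores (source, summary) pairs; higher means more consistent with the source.
    scores = hhem.predict([(passage, c) for c in candidates])
    return candidates[int(scores.argmax())]
```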

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
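
A minimal way to run this strategy is through the DoLa decoding option in recent Hugging Face transformers releases, as sketched below; the prompt string is a placeholder, and the repo may instead rely on the authors' reference implementation.

```python
# Minimal DoLa sketch using the dola_layers option of transformers' generate()
# (available in recent transformers releases; the original DoLa reference
# implementation is an alternative). DoLa contrasts final-layer logits with
# earlier "premature" layers to down-weight poorly grounded tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "[INST] Summarize the following passage ... [/INST]"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=False,
                     dola_layers="high",       # contrast against the upper layers
                     repetition_penalty=1.2)   # commonly recommended alongside DoLa
print(tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```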

Fine-tuning Language Models for Factuality

  • Paper Link
  • Notebook: 3_dpo.ipynb
  • Training code: dpo_training.py (a condensed sketch follows this list)
  • Note: our setup differs from the original paper; we used CNN/DailyMail + XSum + VitaminC as the source datasets and the HHEM model as the reference factuality metric.
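
The actual training setup lives in dpo_training.py; the condensed sketch below only shows the overall shape of DPO with a LoRA adapter via trl and peft. The dataset file, column names, LoRA targets, and hyperparameters are assumptions, and DPOTrainer's argument names vary across trl versions, so treat it as illustrative.

```python
# Minimal sketch of preference tuning with DPO + LoRA via trl/peft. The dataset is
# assumed to hold "prompt"/"chosen"/"rejected" columns, with chosen/rejected pairs
# ranked by the HHEM model. Exact argument names vary across trl versions; see
# dpo_training.py for the configuration actually used in this repo.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

dataset = load_dataset("csv", data_files="dpo_pairs.csv", split="train")  # assumed file

peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                         task_type="CAUSAL_LM")
args = DPOConfig(output_dir="dpo-mistral-lora", per_device_train_batch_size=1,
                 gradient_accumulation_steps=8, learning_rate=5e-6,
                 num_train_epochs=1, beta=0.1)

trainer = DPOTrainer(model=model, args=args, train_dataset=dataset,
                     processing_class=tokenizer, peft_config=peft_config)
trainer.train()
```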

Fine-grained Hallucination Detection and Editing For Language Models
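
One way to apply this method is to post-edit each generated summary with the FAVA editor model and strip the spans it flags as hallucinated. In the sketch below, the model ID, prompt format, and edit-tag handling are assumptions drawn from the FAVA paper and model card rather than from this repo's code.

```python
# Minimal sketch of post-editing a summary with the FAVA editor model. The model ID,
# prompt format, and edit-tag scheme are assumptions; verify them against the FAVA
# paper and model card before use.
import re
from transformers import AutoModelForCausalLM, AutoTokenizer

editor_id = "fava-uw/fava-model"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(editor_id)
editor = AutoModelForCausalLM.from_pretrained(editor_id, device_map="auto")

def fava_edit(passage: str, summary: str) -> str:
    prompt = (f"Read the following references:\n{passage}\n"
              f"Please identify all the errors in the following text using the references "
              f"provided and suggest edits:\nText: {summary}\nEdited: ")  # assumed prompt
    inputs = tokenizer(prompt, return_tensors="pt").to(editor.device)
    out = editor.generate(**inputs, max_new_tokens=512, do_sample=False)
    edited = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    # Apply the edits: drop spans marked for deletion, keep insertions, and strip the
    # remaining error-type tags (tag names per the FAVA paper).
    edited = re.sub(r"<delete>.*?</delete>", "", edited, flags=re.S)
    edited = re.sub(r"</?(mark|entity|relation|contradictory|unverifiable|invented|subjective)>", "", edited)
    return edited.strip()
```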
