A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"


FActScore


This is the official release accompanying our EMNLP 2023 paper, "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation". FActScore is also available as a PyPI package.

Install

python3.7 -m virtualenv fs-venv
source fs-venv/bin/activate
pip install factscore
python -m spacy download en_core_web_sm

Download the data

python -m factscore.download_data

Or, download it manually from this Google Drive link. Create the cache directory .cache/factscore, and place the unzipped demos and enwiki-20230401.db in that directory.
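The manual setup above amounts to the following commands (the final move is commented out because it assumes you have already downloaded and unzipped the archive in the current directory):

```shell
# Create the cache directory that factscore expects (the default --cache_dir).
mkdir -p .cache/factscore

# After downloading the archive from the Google Drive link and unzipping it,
# move the contents into place (uncomment once the files exist locally):
# mv demos enwiki-20230401.db .cache/factscore/
```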

Running the script with oracle atomic facts

python -m factscore.factscorer --data_path {data_path} --model_name {estimator_name} --cache_dir {cache_dir} --openai_key {openai_key}
  • data_path can be something like data/src-light/bio_ChatGPT_v0.2.jsonl, which is in the format we have been using so far. TODO: simplify the format and allow it to take any topics/generations.
  • model_name: one of retrieval+llama, retrieval+llama+npm, retrieval+ChatGPT, retrieval+ChatGPT+npm.
  • cache_dir: .cache/factscore by default.
  • openai_key: a file containing your OpenAI API key; only needed when ChatGPT is used.
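One way to create the key file, assuming your key is already in the OPENAI_API_KEY environment variable (the variable name and the api.key filename here are just conventions, not requirements):

```shell
# Write the OpenAI API key into a file whose path is passed via --openai_key.
# Assumes the key is available in the OPENAI_API_KEY environment variable.
printf '%s\n' "$OPENAI_API_KEY" > api.key
```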

For example,

python -m factscore.factscorer \
    --data_path original_generation/v0.2/answers_mpt-7b_bio_test_addtional.jsonl \
    --model_name "retrieval+ChatGPT" \
    --cache_dir ".cache/factscore" \
    --openai_key "api.key"

It uses enwiki-20230401 by default, and will download the database from our Google Drive. It also uses Inst-LLAMA, downloaded from Google Drive. TODO: release a diff from LLAMA 7B only, and allow users to specify their own LM path if they want to use a different LM.

To use a custom knowledge source

You need a .jsonl file where each line is a dictionary containing title and text. text can either be a string or a list of strings (e.g., sections).
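A minimal sketch of building such a file; the titles, texts, and the my_knowledge_source.jsonl filename are made-up examples:

```python
import json

# Each line of the .jsonl file is one document with "title" and "text".
# "text" may be a single string or a list of strings (e.g., sections).
docs = [
    {"title": "Ada Lovelace", "text": "Ada Lovelace was an English mathematician."},
    {"title": "Alan Turing", "text": ["Early life ...", "Career ..."]},
]

with open("my_knowledge_source.jsonl", "w") as f:
    for doc in docs:
        f.write(json.dumps(doc) + "\n")
```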

from factscore.factscorer import FactScorer

fs = FactScorer()

# this will create a database using your file
# for English Wikipedia (18GB), it takes ~8 hours
# once the DB file is created, you can reuse it by specifying only `db_path`
fs.register_knowledge_source(name_of_your_knowledge_source,
                             data_path=path_to_jsonl_file,
                             db_path=path_to_output_db_file)

# now, when you compute a score, specify knowledge source to use
score = fs.get_score(topics, generations, knowledge_source=name_of_your_knowledge_source)
