CHAIR

CHAIR metric is a rule-based metric for evaluating object hallucination in caption generation.

This is a modified version of the original code; the original CHAIR implementation can be found at https://github.com/LisaAnne/Hallucination/blob/master/utils/chair.py

I did NOT change its calculation, in order to stay consistent with the original.

Notably, I added a new metric, Recall, which measures the percentage of ground-truth (gt) objects recalled by the captions over all gt objects. Other modifications are listed below.
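
For reference, the reported metrics reduce to simple ratios over object mentions. A minimal sketch of the definitions (function and variable names are illustrative, not the script's actual API):

def chair_metrics(records):
    # records: list of (mentioned, gt) pairs, where each element is a set
    # of object lemmas mentioned in the caption / annotated for the image.
    n_mentions = n_hallucinated = n_gt = n_recalled = n_bad_captions = 0
    for mentioned, gt in records:
        hallucinated = mentioned - gt           # mentioned but not in the image
        n_mentions += len(mentioned)
        n_hallucinated += len(hallucinated)
        n_bad_captions += bool(hallucinated)
        n_gt += len(gt)
        n_recalled += len(mentioned & gt)       # gt objects covered by the caption
    chair_i = n_hallucinated / max(n_mentions, 1)    # per-object-instance rate
    chair_s = n_bad_captions / max(len(records), 1)  # per-caption rate
    recall = n_recalled / max(n_gt, 1)               # gt-object coverage
    return chair_i, chair_s, recall

print(chair_metrics([({"cat", "dog"}, {"cat"})]))    # (0.5, 1.0, 1.0)

Judging by the Example Run output below, the script reports these ratios as percentages.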

Modifications

  1. Adapted the CHAIRi and CHAIRs calculation to Python 3; the script accepts both json and jsonl input files.
  2. Integrated synonyms.txt into the script to make it standalone.
  3. Removed the n-gram overlap metrics BLEU-n, CIDEr, and ROUGE.
  4. Added the new metric Recall, which measures overall coverage of the node words (i.e., lemmas of ground-truth objects); see the sketch above.
  5. Added a pickle cache mechanism to speed up repeated evaluations; see the sketch after this list.
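
The cache in item 5 follows the usual load-or-build pattern; a minimal sketch, assuming a hypothetical build_evaluator() that parses the COCO annotations:

import os
import pickle

def load_or_build_evaluator(cache_path, coco_path):
    # Reuse the serialized evaluator if present; parsing the COCO
    # annotations from scratch is the slow part of every run.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    evaluator = build_evaluator(coco_path)  # hypothetical builder
    with open(cache_path, "wb") as f:
        pickle.dump(evaluator, f)
    return evaluator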

Usage

Install Dependencies

pattern
nltk
tqdm
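
These can be installed with pip, e.g.:

pip install pattern nltk tqdm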

(Optional) Build Annotations

A CHAIR evaluator object initialized for coco has already been serialized into a pickle; you can use it by setting --cache, see Example Run below.

Alternatively, if you want to build the CHAIR evaluator from scratch, follow these steps.

First, download https://images.cocodataset.org/annotations/annotations_trainval2014.zip and put these files into the coco_annotations dir:

  • captions_train2014.json
  • captions_val2014.json
  • instances_train2014.json
  • instances_val2014.json

Then run python chair.py --cache <new_cache_path>. The script may complain about missing caption inputs, but the cache will still be built.
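
Putting the steps together, a possible shell sequence (assuming wget and unzip are available; -j flattens the zip's internal annotations/ directory so the JSON files land directly in coco_annotations):

mkdir -p coco_annotations
wget https://images.cocodataset.org/annotations/annotations_trainval2014.zip
unzip -j annotations_trainval2014.zip -d coco_annotations
python chair.py --cache chair.pkl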

Example Run

python chair.py \
    --cap_file example_inputs.jsonl \
    --image_id_key image_id \
    --caption_key caption \
    --cache chair.pkl \
    --save_path outputs.json

outputs:

CHAIRs    : 0.0
CHAIRi    : 0.0
Recall    : 85.7
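
For reference, each line of the jsonl input is a dict holding the coco image id and the generated caption under the keys given by --image_id_key and --caption_key; an illustrative record (values made up):

{"image_id": 404464, "caption": "a black and white photo of a train on the tracks"}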

Explanations of Args

parser.add_argument("--cap_file", type=str, default='',
                    help="path towards json or jsonl saving image ids and their captions in list of dict.")
parser.add_argument("--image_id_key", type=str, default="image_id",
                    help="in each dict of cap_file, which key stores image id of coco.")
parser.add_argument("--caption_key", type=str, default="caption",
                    help="in each dict of cap_file, which key stores caption of the image.")

parser.add_argument("--cache", type=str, default="chair.pkl",
                    help="pre inited CHAIR evaluator object, for fast loading.")
parser.add_argument("--coco_path", type=str, default='coco_annotations',
                    help="only use for regenerating CHAIR evaluator object, will be ignored if uses cached evaluator.")

parser.add_argument("--save_path", type=str, default="",
                    help="saving CHAIR evaluate and results to json, useful for debugging the caption model.")

Disclaimer

Since the original implementation is written in Python 2, is hard to read, and requires intermediate results to run, I have not tested it directly.

That said, I have tried my best to keep the calculation consistent, so this implementation should reproduce the CHAIR scores of the original code.
