
[ACL 2024] FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model


Yebin46/FLEUR


FLEUR

FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model

Getting Started

FLEUR uses the LLaVA model to evaluate image captions (you may substitute another vision-language model if desired). Follow the setup instructions in the LLaVA GitHub README before running the scripts below. No additional training is required.

Evaluation on Flickr8k-Expert dataset

  • Running code for FLEUR:
CUDA_VISIBLE_DEVICES=0,1 python fleur.py
  • Running code for RefFLEUR:
CUDA_VISIBLE_DEVICES=0,1 python reffleur.py

Or, to get the explanations together with the scores:

CUDA_VISIBLE_DEVICES=0,1 python fleur_exp.py

The evaluation results are saved as .txt files in the results folder.

Compute Kendall's Tau Correlation

Set the file names of the annotation file and the evaluation result file in compute_correlation.py, then run:

python compute_correlation.py
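
For readers unfamiliar with the statistic, the following is a minimal pure-Python sketch of Kendall's tau (the tau-b variant, which corrects for ties) between human judgments and metric scores. The function name `kendall_tau_b` and the sample values are hypothetical and are not taken from compute_correlation.py; the repository's script may use a different variant or a library implementation.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b between two equal-length sequences of numbers.

    Illustrative O(n^2) implementation; not the repository's code.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")

    concordant = 0
    discordant = 0
    # Compare every pair of observations once.
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in xs or ys; it counts for neither.

    total_pairs = n * (n - 1) // 2
    # Number of pairs tied within each sequence.
    ties_x = sum(t * (t - 1) // 2 for t in Counter(xs).values())
    ties_y = sum(t * (t - 1) // 2 for t in Counter(ys).values())

    denom = sqrt((total_pairs - ties_x) * (total_pairs - ties_y))
    if denom == 0:
        return float("nan")  # one sequence is constant
    return (concordant - discordant) / denom


# Hypothetical values: human ratings vs. metric scores for four captions.
human = [1, 2, 3, 4]
metric = [0.1, 0.4, 0.3, 0.9]
tau = kendall_tau_b(human, metric)
print(f"Kendall tau-b: {tau:.4f}")
```

A value near 1 means the metric ranks captions in nearly the same order as the human annotators; a value near -1 means the order is reversed.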
