Is it possible to use BLEU with multiple references? #1125

Open
juliafalcao opened this issue Dec 14, 2023 · 2 comments

Comments

@juliafalcao

I'm creating a new task and would like to evaluate my generated output against N different references with BLEU. The code appears to pick up only the first available reference, and I'm not sure how to set doc_to_target in the task YAML so that it includes multiple references.

@lintangsutawika
Contributor

I'm assuming that if you want to use the BLEU metric, you would want to use the generate_until task type. In that case, you could also use Hugging Face's (HF) implementation of BLEU.
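For anyone unsure what "BLEU with multiple references" computes, here is a minimal pure-Python sketch. It is not the HF implementation and not the harness's own BLEU code. The function names, the whitespace tokenization, and the lack of smoothing are all simplifying assumptions made for illustration. The idea is that each n-gram in the prediction is clipped by its highest count in any single reference, and the brevity penalty uses the reference length closest to the prediction length.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def multi_ref_bleu(prediction, references, max_order=4):
    """Sentence-level BLEU of one prediction against several references.

    Assumptions: whitespace tokenization, no smoothing (illustrative only).
    """
    pred = prediction.split()
    refs = [r.split() for r in references]

    log_precisions = []
    for n in range(1, max_order + 1):
        pred_counts = ngrams(pred, n)
        # Clip each n-gram by its maximum count in any single reference.
        max_ref_counts = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        matched = sum(min(c, max_ref_counts[g]) for g, c in pred_counts.items())
        total = sum(pred_counts.values())
        if matched == 0 or total == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(matched / total))

    # Brevity penalty: use the reference length closest to the prediction
    # length, preferring the shorter reference on ties.
    ref_len = min((len(r) for r in refs), key=lambda l: (abs(l - len(pred)), l))
    bp = 1.0 if len(pred) > ref_len else math.exp(1 - ref_len / len(pred))
    return bp * math.exp(sum(log_precisions) / max_order)


prediction = "the cat sat on the mat"
one_ref = multi_ref_bleu(prediction, ["a cat sat on the mat"])
two_refs = multi_ref_bleu(
    prediction, ["a cat sat on the mat", "the cat sat on the mat"]
)
print(one_ref, two_refs)
```

With only the first reference the score stays below 1, while adding a second reference that matches the prediction exactly raises it to 1.0. That difference is what gets lost when only the first available reference is picked up.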

doc_to_target supports more than one answer, so you could set up the dataset so that a gold feature stores a list of references for each sample.
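As a rough illustration of that data shape, here is a small Python sketch. The samples, the "source" field, and the build_bleu_inputs helper are hypothetical, and the doc_to_target function is only a plain-Python stand-in for what the task YAML would express. It is not the harness's actual loading code.

```python
# Hypothetical samples: the "gold" field name follows the suggestion above;
# "source" and the texts themselves are made up for illustration.
dataset = [
    {"source": "Le chat dort.", "gold": ["The cat sleeps.", "The cat is sleeping."]},
    {"source": "Il pleut.", "gold": ["It is raining.", "It rains."]},
]


def doc_to_target(doc):
    """Return every reference for a sample, not just the first one."""
    return list(doc["gold"])


def build_bleu_inputs(docs, predictions):
    """Pair each prediction with its full list of references."""
    if len(docs) != len(predictions):
        raise ValueError("need exactly one prediction per document")
    references = [doc_to_target(doc) for doc in docs]
    return predictions, references


predictions, references = build_bleu_inputs(
    dataset, ["The cat sleeps.", "It is raining."]
)
print(references)
```

The result is one list of references per prediction, which is the list-of-lists shape that multi-reference BLEU implementations generally expect.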

@haileyschoelkopf
Contributor

TriviaQA is one example of a dataset that uses multiple references! https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/triviaqa/default.yaml Please let us know if you have trouble mapping this onto BLEU.
