CoQA's implementation only predicts the last answer of each text #1231

Open
glerzing opened this issue Jan 1, 2024 · 1 comment
Labels
bug Something isn't working. good first issue Good for newcomers

Comments

glerzing commented Jan 1, 2024

For CoQA, the implementation in coqa/utils.py predicts only the last answer of each text, i.e. the answer for the last turn_id, with all the previous questions and answers placed in the context window. The authors of CoQA, however, seem to treat every turn_id as a prediction target: on their website, the sample prediction file contains an answer for every turn_id, and the official evaluation script averages the results over every turn_id.

Here is an excerpt from the paper (https://arxiv.org/pdf/1808.07042.pdf):

[image: excerpt from the paper]

I haven't found how other popular LLM evaluation frameworks implement it, but I'm pretty sure that predicting only the answer to the last question is not what the authors of CoQA intended.
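
To make the difference concrete, here is a minimal sketch of expanding one conversation into one prediction target per turn. It is not the code from coqa/utils.py: the function name `build_turn_prompts`, the `Q:`/`A:` prompt layout, and the toy story are all assumptions made for illustration, and the history is assumed to be filled with the reference answers of the earlier turns.

```python
from typing import Dict, List


def build_turn_prompts(
    story: str, questions: List[str], answers: List[str]
) -> List[Dict[str, str]]:
    """Return one (prompt, target) pair per turn of a CoQA-style conversation.

    The prompt for turn i holds the story, the reference Q/A pairs of
    turns 0..i-1 as history, and question i with an empty answer slot.
    The prompt layout is hypothetical, not the harness's actual format.
    """
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")

    examples = []
    for i, question in enumerate(questions):
        # History: every earlier turn, with its reference answer.
        history = "".join(
            f"Q: {q}\n\nA: {a}\n\n" for q, a in zip(questions[:i], answers[:i])
        )
        prompt = f"{story}\n\n{history}Q: {question}\n\nA:"
        examples.append({"prompt": prompt, "target": answers[i]})
    return examples


# Toy conversation (made up, not taken from the dataset).
story = "Anna adopted a cat named Milo in 2020."
questions = ["Who adopted a cat?", "What is its name?"]
answers = ["Anna", "Milo"]

all_turns = build_turn_prompts(story, questions, answers)
last_turn_only = all_turns[-1:]  # the behaviour described in this issue

print(len(all_turns), len(last_turn_only))
```

With this expansion, a conversation of N turns yields N scored predictions instead of one, which is what the sample prediction file and the per-turn averaging in the official script suggest.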

StellaAthena added the bug and good first issue labels Jan 1, 2024
StellaAthena (Member) commented

I think it probably makes sense to support both versions of this task, but we should make it clear that the one described in the OP is the official one.
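
One way supporting both versions could look is a single scoring function with a switch, sketched below. Everything here is hypothetical: the `all_turns` flag and the function names are made up for illustration, and the toy exact-match metric is only a stand-in for whatever metric the task actually reports.

```python
from statistics import mean
from typing import List


def exact_match(prediction: str, reference: str) -> float:
    """Toy metric: 1.0 if the strings match after trimming and lowercasing."""
    return float(prediction.strip().lower() == reference.strip().lower())


def score_conversation(
    predictions: List[str], references: List[str], all_turns: bool = True
) -> float:
    """Score one conversation.

    all_turns=True averages the metric over every turn;
    all_turns=False scores only the last turn.
    """
    if not references or len(predictions) != len(references):
        raise ValueError("need one prediction per reference, and at least one turn")

    pairs = list(zip(predictions, references))
    if not all_turns:
        pairs = pairs[-1:]  # keep only the final turn
    return mean(exact_match(p, r) for p, r in pairs)


# Made-up predictions: first turn right, last turn wrong.
predictions = ["Anna", "Felix"]
references = ["Anna", "Milo"]

print(score_conversation(predictions, references, all_turns=True))   # 0.5
print(score_conversation(predictions, references, all_turns=False))  # 0.0
```

The two settings can disagree on the same predictions, so if both variants are exposed, they should probably be reported under distinct task names, with the all-turns one marked as the official one.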

Projects
Status: Ready