For CoQA, in coqa/utils.py, only the last answer of each passage is predicted (i.e. the answer for the last turn_id, with all the previous questions and answers in the context window). The CoQA authors, however, appear to consider every turn_id: their sample prediction file contains an answer for each turn_id, and the official evaluation script averages the score over all turn_ids (see also the paper: https://arxiv.org/pdf/1808.07042.pdf).

I haven't found how other popular LLM evaluation frameworks implement this, but I'm fairly sure that predicting only the answer to the last question is not what the CoQA authors intended.
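To make the difference concrete, here is a minimal sketch of what evaluating every turn looks like: one prediction example per turn_id, each with the question/answer history of the preceding turns in its context. This is not the framework's actual code; the `doc` field names (`story`, `questions`, `answers`, each question/answer carrying `input_text` and `turn_id`) are assumptions based on the CoQA JSON format.

```python
def build_turn_examples(doc):
    """Yield one (context, question, turn_id) example per turn,
    not just one for the final turn.

    Each turn's context holds the passage plus the full Q/A history
    of the preceding turns, mirroring the conversational setup.
    """
    examples = []
    history = ""
    for question, answer in zip(doc["questions"], doc["answers"]):
        context = doc["story"] + "\n\n" + history + "Q: " + question["input_text"] + "\nA:"
        examples.append((context, question["input_text"], question["turn_id"]))
        # Extend the history with the gold answer so the next turn sees it.
        history += "Q: " + question["input_text"] + "\nA: " + answer["input_text"] + "\n"
    return examples


# Toy document in (assumed) CoQA-like shape.
doc = {
    "story": "Once upon a time ...",
    "questions": [
        {"turn_id": 1, "input_text": "Who is the story about?"},
        {"turn_id": 2, "input_text": "Where did she live?"},
    ],
    "answers": [
        {"turn_id": 1, "input_text": "a little girl"},
        {"turn_id": 2, "input_text": "in a small village"},
    ],
}
examples = build_turn_examples(doc)
print(len(examples))  # one example per turn_id, here 2
```

Under the current implementation only the final element of `examples` would be predicted; under the official protocol all of them are.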
I think it probably makes sense to support both versions of this task, but we should make it clear that the one described in the OP is the official one.
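For reference, the scoring difference between the two versions can be sketched as follows. This mimics (but is not) the official evaluation script: the official metric is an average of per-turn F1 over every turn_id, whereas the current behavior scores only one turn per passage. The dictionary keys below are illustrative.

```python
def average_f1(per_turn_f1):
    """Average per-turn F1 over all (story_id, turn_id) pairs,
    as the official CoQA evaluation does."""
    return sum(per_turn_f1.values()) / len(per_turn_f1)


# Hypothetical per-turn F1 scores keyed by (story_id, turn_id).
scores = {
    ("story1", 1): 1.0,
    ("story1", 2): 0.5,
    ("story2", 1): 0.0,
}
print(round(average_f1(scores), 2))  # 0.5
```

Scoring only the last turn would instead drop `("story1", 1)` from the average, which is why the two versions can report quite different numbers.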