Implement the DROP evaluation #19
From the GPT-3 paper.

The evaluation code should be modeled after the interface in lm_eval/base.py and the example of the BoolQ task in lm_eval/tasks/superglue.py.
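For readers following along, here is a minimal sketch of the kind of subclass the issue asks for, assuming the Task interface exposes `has_*_docs` / `doc_to_text` / `doc_to_target` hooks along the lines of the BoolQ example. The field names (`passage`, `question`, `answers`) and the prompt format are illustrative, not the final implementation:

```python
from lm_eval.base import Task


class DROP(Task):
    """Skeleton for the DROP reading-comprehension task (sketch only)."""

    def has_training_docs(self):
        return True

    def has_validation_docs(self):
        return True

    def has_test_docs(self):
        return False

    def training_docs(self):
        # Would yield dicts with (hypothetical) "passage", "question",
        # and "answers" fields parsed from the DROP release.
        raise NotImplementedError

    def validation_docs(self):
        raise NotImplementedError

    def doc_to_text(self, doc):
        # Prompt layout; see the "Passage:" discussion in the comments below.
        return f"Passage: {doc['passage']}\nQuestion: {doc['question']}\nAnswer:"

    def doc_to_target(self, doc):
        # Target is the first gold answer string.
        return " " + doc["answers"][0]
```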
Comments

Note: HuggingFace includes this in its datasets package.
Added in ec4d361; I ended up not using HuggingFace's implementation since they only include answer spans as labels, leaving out a ton of labels.
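For context, this is how the HuggingFace packaging mentioned above would be loaded; the field names in the comment come from the dataset card and should be treated as illustrative:

```python
from datasets import load_dataset

# DROP as packaged by HuggingFace; as noted above, its labels only
# cover answer spans, which is why it was not used here.
drop = load_dataset("drop")
print(drop["validation"][0])  # fields include 'passage', 'question', 'answers_spans'
```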
Also, should we prepend 'Passage: ' to each passage? It seems OA did not do this; not sure if it was intentional.
My vote is “no” because I see no reason to do so.
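To make the formatting question concrete, the two candidate prompt layouts would look like this (a sketch; the doc fields are hypothetical):

```python
def prompt_with_prefix(doc):
    # Prepend an explicit "Passage: " marker.
    return f"Passage: {doc['passage']}\nQuestion: {doc['question']}\nAnswer:"


def prompt_without_prefix(doc):
    # Bare passage, as the GPT-3 paper appears to do.
    return f"{doc['passage']}\nQuestion: {doc['question']}\nAnswer:"
```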
@anishthite I don't see a line for this dataset in lm_eval/tasks/__init__.py. Am I missing something, or is this not quite finished?
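For reference, the missing line would be a registry entry along these lines, assuming lm_eval/tasks/__init__.py keeps a plain name-to-class mapping (the "drop" key and module name are illustrative):

```python
# lm_eval/tasks/__init__.py (sketch)
from . import drop

TASK_REGISTRY = {
    # ... existing tasks ...
    "drop": drop.DROP,
}
```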
@jon-tow How is the work on this coming?
Add E2E NLG Cleaned, update required Transformers version