Issues: EleutherAI/lm-evaluation-harness

Issues list

Implement the DROP evaluation [feature request] [good first issue]
#19 by StellaAthena was closed Mar 7, 2021
1 of 2 tasks
Implement the HellaSwag evaluation [feature request] [good first issue]
#7 by StellaAthena was closed Feb 8, 2021
2 tasks done
Implement the StoryCloze evaluation [feature request] [good first issue]
#8 by StellaAthena was closed Apr 1, 2022
1 of 2 tasks
Implement the Natural Questions evaluation [feature request] [good first issue]
#9 by StellaAthena was closed Aug 21, 2023
1 of 2 tasks
Implement the WebQuestions evaluation [feature request] [good first issue]
#10 by StellaAthena was closed Feb 8, 2021
1 of 2 tasks
Implement the TriviaQA evaluation [feature request] [good first issue]
#11 by StellaAthena was closed Jan 30, 2021
2 tasks done
Implement the WSC273 Winograd Schemas Challenge evaluation [feature request] [good first issue]
#12 by StellaAthena was closed Feb 3, 2021
2 tasks done
Implement the adversarially-mined Winogrande evaluation [feature request] [good first issue]
#13 by StellaAthena was closed Feb 3, 2021
2 tasks done
Implement the PhysicalQA evaluation [feature request] [good first issue]
#14 by StellaAthena was closed Feb 3, 2021
1 of 2 tasks
Implement the ARC Challenge evaluation [feature request] [good first issue]
#15 by StellaAthena was closed Feb 5, 2021
2 tasks done
Implement the OpenBookQA evaluation [feature request] [good first issue]
#16 by StellaAthena was closed Feb 9, 2021
2 tasks done
Implement the CoQA evaluation [feature request] [good first issue]
#17 by StellaAthena was closed Feb 14, 2021
1 of 2 tasks
Implement the Penn Tree Bank evaluation [feature request] [good first issue]
#5 by StellaAthena was closed Mar 25, 2023
Implement the QuAC evaluation [feature request] [good first issue]
#18 by StellaAthena was closed Nov 14, 2023
1 of 2 tasks
Implement the SQuAD evaluation [feature request] [good first issue]
#20 by StellaAthena was closed Mar 28, 2021
1 of 2 tasks
Implement the RACE evaluation [feature request] [good first issue]
#21 by StellaAthena was closed Jan 30, 2021
2 tasks done
Implement the Natural Language Inference (NLI) evaluation [feature request] [good first issue]
#23 by StellaAthena was closed Feb 12, 2021
1 of 2 tasks
Implement the Adversarial Natural Language Inference (ANLI) evaluation [feature request]
#24 by StellaAthena was closed Jan 30, 2021
1 of 2 tasks
Implement arithmetic evaluations [feature request] [good first issue]
#25 by StellaAthena was closed Jan 28, 2021
2 tasks done
Implement the symbolic manipulations evaluation [feature request] [good first issue]
#26 by StellaAthena was closed Feb 26, 2021
2 tasks done
Implement the SAT evaluation [feature request] [good first issue]
#27 by StellaAthena was closed Jan 8, 2021
2 tasks done
Implement the English Grammar Correction evaluation [feature request] [good first issue]
#28 by StellaAthena was closed Nov 21, 2022
2 tasks
Implement the News Article Generation evaluation [feature request]
#29 by StellaAthena was closed Nov 21, 2022
2 tasks
Implement the Novel Word evaluation [feature request] [good first issue]
#30 by StellaAthena was closed Nov 21, 2022
2 tasks
mmlu evaluation fail
#2005 by jxiw was closed Jun 21, 2024