Issues: EleutherAI/lm-evaluation-harness

Issues list

New Evaluation: Legal feature request A feature that isn't implemented yet. good first issue Good for newcomers
#76 by StellaAthena was closed Nov 21, 2022
2 tasks
Implement the SAT evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#27 by StellaAthena was closed Jan 8, 2021
2 tasks done
Implement the English Grammar Correction evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#28 by StellaAthena was closed Nov 21, 2022
2 tasks
Implement the News Article Generation evaluation feature request A feature that isn't implemented yet.
#29 by StellaAthena was closed Nov 21, 2022
2 tasks
Implement the Novel Word evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#30 by StellaAthena was closed Nov 21, 2022
2 tasks
Support richer example-packing functionality. feature request A feature that isn't implemented yet.
#31 by zphang was closed Jan 4, 2021
Support writing out predictions feature request A feature that isn't implemented yet.
#32 by zphang was closed Jan 4, 2021
Double check all of the zero/few-shot formats documentation Improvements or additions to documentation.
#34 by leogao2 was closed Jan 4, 2021
Implement the Penn Tree Bank evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#5 by StellaAthena was closed Mar 25, 2023
RACE: nlp -> datasets bug Something isn't working.
#44 by cfoster0 was closed Oct 22, 2020
Implement Semantic Search evaluation feature request A feature that isn't implemented yet.
#58 by cauefcr was closed Nov 21, 2022
Make the eval_harness talk to the server feature request A feature that isn't implemented yet.
#62 by StellaAthena was closed Jan 4, 2021
Implement the WSC273 Winograd Schemas Challenge evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#12 by StellaAthena was closed Feb 3, 2021
2 tasks done
New Evaluation: Math feature request A feature that isn't implemented yet. good first issue Good for newcomers
#77 by StellaAthena was closed Feb 25, 2022
2 tasks
New Evaluation: Biology feature request A feature that isn't implemented yet. good first issue Good for newcomers
#78 by StellaAthena was closed Nov 21, 2022
2 tasks
Possible Bug?: argmax in sat.py comparison bug Something isn't working.
#83 by nicholaskross was closed Jan 28, 2021
Implement all GLUE evaluations
#92 by leogao2 was closed Jan 28, 2021
Implement WikiText for GPT-2 replication feature request A feature that isn't implemented yet. good first issue Good for newcomers
#40 by anishthite was closed Jun 12, 2021
1 of 2 tasks
Implement the LAMBADA evaluation feature request A feature that isn't implemented yet.
#6 by StellaAthena was closed Jan 29, 2021
Implement the HellaSwag evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#7 by StellaAthena was closed Feb 8, 2021
2 tasks done
Implement the StoryCloze evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#8 by StellaAthena was closed Apr 1, 2022
1 of 2 tasks
Implement the Natural Questions evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#9 by StellaAthena was closed Aug 21, 2023
1 of 2 tasks
Implement the WebQuestions evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#10 by StellaAthena was closed Feb 8, 2021
1 of 2 tasks
Implement the TriviaQA evaluation feature request A feature that isn't implemented yet. good first issue Good for newcomers
#11 by StellaAthena was closed Jan 30, 2021
2 tasks done