Issues: EleutherAI/lm-evaluation-harness
Issues list
New Evaluation: Legal
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#76
by StellaAthena
was closed Nov 21, 2022
2 tasks
Implement the SAT evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#27
by StellaAthena
was closed Jan 8, 2021
2 tasks done
Implement the English Grammar Correction evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#28
by StellaAthena
was closed Nov 21, 2022
2 tasks
Implement the News Article Generation evaluation
feature request
A feature that isn't implemented yet.
#29
by StellaAthena
was closed Nov 21, 2022
2 tasks
Implement the Novel Word evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#30
by StellaAthena
was closed Nov 21, 2022
2 tasks
Support richer example-packing functionality.
feature request
A feature that isn't implemented yet.
#31
by zphang
was closed Jan 4, 2021
Support writing out predictions
feature request
A feature that isn't implemented yet.
#32
by zphang
was closed Jan 4, 2021
Double check all of the zero/few-shot formats
documentation
Improvements or additions to documentation.
#34
by leogao2
was closed Jan 4, 2021
Implement the Penn Tree Bank evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#5
by StellaAthena
was closed Mar 25, 2023
Implement Semantic Search evaluation
feature request
A feature that isn't implemented yet.
#58
by cauefcr
was closed Nov 21, 2022
Add flag to allow the evaluations to be carried out on a subset of the eval tasks
feature request
A feature that isn't implemented yet.
#60
by StellaAthena
was closed Nov 23, 2020
Make the eval_harness talk to the server
feature request
A feature that isn't implemented yet.
#62
by StellaAthena
was closed Jan 4, 2021
Implement the WSC273 Winograd Schemas Challenge evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#12
by StellaAthena
was closed Feb 3, 2021
2 tasks done
New Evaluation: Math
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#77
by StellaAthena
was closed Feb 25, 2022
2 tasks
New Evaluation: Biology
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#78
by StellaAthena
was closed Nov 21, 2022
2 tasks
Possible Bug?: argmax in sat.py comparison
bug
Something isn't working.
#83
by nicholaskross
was closed Jan 28, 2021
Implement WikiText for GPT-2 replication
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#40
by anishthite
was closed Jun 12, 2021
1 of 2 tasks
Implement the LAMBADA evaluation
feature request
A feature that isn't implemented yet.
#6
by StellaAthena
was closed Jan 29, 2021
Implement the HellaSwag evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#7
by StellaAthena
was closed Feb 8, 2021
2 tasks done
Implement the StoryCloze evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#8
by StellaAthena
was closed Apr 1, 2022
1 of 2 tasks
Implement the Natural Questions evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#9
by StellaAthena
was closed Aug 21, 2023
1 of 2 tasks
Implement the WebQuestions evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#10
by StellaAthena
was closed Feb 8, 2021
1 of 2 tasks
Implement the TriviaQA evaluation
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#11
by StellaAthena
was closed Jan 30, 2021
2 tasks done