Issues: EleutherAI/lm-evaluation-harness
Issues list
Add Anthropic Model-written Eval datasets to harness?
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#375
by haileyschoelkopf
was closed Nov 8, 2023
2 tasks
[Refactor] Incorporate version field for tasks into metadata
feature request
A feature that isn't implemented yet.
[Refactor] Allow for some tasks to force zero-shot
feature request
A feature that isn't implemented yet.
Supplied process_docs() function is not applied to fewshot docs
bug
Something isn't working.
#1179
by haileyschoelkopf
was closed Jan 12, 2024
[New Task] Implement GPQA dataset
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1010
by haileyschoelkopf
was closed Mar 5, 2024
Run Gemma LM in Huggingface (simple patch)
feature request
A feature that isn't implemented yet.
#1455
by haileyschoelkopf
was closed Feb 26, 2024
2 tasks
Add --predict_only mode (run without scoring outputs)
help wanted
Contributors and extra help welcome.
feature request
A feature that isn't implemented yet.
#1152
by haileyschoelkopf
was closed Jan 31, 2024
Display n-shot better for groups and for hardcoded-fewshot tasks
feature request
A feature that isn't implemented yet.
#1360
by haileyschoelkopf
was closed Feb 1, 2024
Upstream Mamba integration
feature request
A feature that isn't implemented yet.
#1085
by haileyschoelkopf
was closed Dec 22, 2023
Print "higher_is_better" in results table
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1153
by haileyschoelkopf
was closed Jun 3, 2024
[New Task] Paloma Eval Suite
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1176
by haileyschoelkopf
was closed Jun 19, 2024