Issues: EleutherAI/lm-evaluation-harness
Issues list
Run Gemma LM in Huggingface (simple patch)
feature request
A feature that isn't implemented yet.
#1455
by haileyschoelkopf
was closed Feb 26, 2024
2 tasks
Display n-shot better for groups and for hardcoded-fewshot tasks
feature request
A feature that isn't implemented yet.
#1360
by haileyschoelkopf
was closed Feb 1, 2024
[New Task] Upstream remaining Okapi multilingual tasks
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1244
by haileyschoelkopf
was closed Feb 21, 2024
Add Logits to OpenAI ChatCompletions model
declined
A proposed dataset or feature request that will not be implemented.
feature request
A feature that isn't implemented yet.
help wanted
Contributors and extra help welcome.
#1196
by haileyschoelkopf
was closed May 23, 2024
Supplied process_docs() function is not applied to fewshot docs
bug
Something isn't working.
#1179
by haileyschoelkopf
was closed Jan 12, 2024
[New Task] Paloma Eval Suite
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1176
by haileyschoelkopf
was closed Jun 19, 2024
Unify "metric" and "aggregation" abstractions
feature request
A feature that isn't implemented yet.
#1158
by haileyschoelkopf
was closed Mar 15, 2024
Print "higher_is_better" in results table
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1153
by haileyschoelkopf
was closed Jun 3, 2024
Add --predict_only mode (run without scoring outputs)
feature request
A feature that isn't implemented yet.
help wanted
Contributors and extra help welcome.
#1152
by haileyschoelkopf
was closed Jan 31, 2024
Support wrapping prompts with a given Chat Template
feature request
A feature that isn't implemented yet.
help wanted
Contributors and extra help welcome.
opinions wanted
For discussing open questions.
Upstream Mamba integration
feature request
A feature that isn't implemented yet.
#1085
by haileyschoelkopf
was closed Dec 22, 2023
[New Task] SIQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1027
by haileyschoelkopf
was closed Nov 28, 2023
[New Task Request] IFEval / Instruction-Following Eval
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1012
by haileyschoelkopf
was closed Feb 9, 2024
[New Task] Implement GPQA dataset
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1010
by haileyschoelkopf
was closed Mar 5, 2024
[Refactor] Incorporate version field for tasks into metadata
feature request
A feature that isn't implemented yet.
[Refactor] Allow for some tasks to force zero-shot
feature request
A feature that isn't implemented yet.
[Refactor] Revamp Testing / CI pipeline
feature request
A feature that isn't implemented yet.
help wanted
Contributors and extra help welcome.
#656
by haileyschoelkopf
was closed Mar 4, 2024
2 of 6 tasks
Add Anthropic Model-written Eval datasets to harness?
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#375
by haileyschoelkopf
was closed Nov 8, 2023
2 tasks