Issues: EleutherAI/lm-evaluation-harness
Support for top-k metrics with num_return_sequences>1
feature request
A feature that isn't implemented yet.
#1117
opened Dec 13, 2023 by
yuyemin
[New Task] COLLIE
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1013
opened Nov 21, 2023 by
haileyschoelkopf
The tokenizer add_special_tokens parameter for t5 model lambada task
#1017
opened Nov 22, 2023 by
daisyden
[New Task] CommonsenseQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1026
opened Nov 27, 2023 by
haileyschoelkopf
A new DROP benchmark is needed
opinions wanted
For discussing open questions.
#1050
opened Nov 30, 2023 by
StellaAthena
assert len(continuation_enc) error in _loglikelihood_tokens for certain (but not all) tasks?
#1053
opened Dec 2, 2023 by
lhl
FileNotFoundError: Couldn't find a module script at exact_match.py. Module 'exact_match' doesn't exist on the Hugging Face Hub either.
bug
Something isn't working.
#1071
opened Dec 6, 2023 by
xinghuang2050
Implement the SuperGLUE evaluation
feature request
A feature that isn't implemented yet.
#22
opened Sep 16, 2020 by
StellaAthena
1 of 2 tasks
Add ZeroScrolls Benchmark
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1083
opened Dec 8, 2023 by
haileyschoelkopf
Verify Stopsequences Don't Impact Scores
validation
For validation of task implementations.
#1086
opened Dec 9, 2023 by
haileyschoelkopf
Async support for OpenAI ChatCompletions
feature request
A feature that isn't implemented yet.
#1095
opened Dec 11, 2023 by
haileyschoelkopf
Refactor main evaluate() loop into more readable sub-functions
feature request
A feature that isn't implemented yet.
documentation
Improvements or additions to documentation.
#1100
opened Dec 11, 2023 by
haileyschoelkopf
Upstream Llemma Math Task Suite
feature request
A feature that isn't implemented yet.
#1151
opened Dec 18, 2023 by
haileyschoelkopf
[Discussion/Feedback] VLM + Multimodal benchmarking
opinions wanted
For discussing open questions.
#1155
opened Dec 18, 2023 by
haileyschoelkopf
[Discussion] Add Major Code Benchmarks
opinions wanted
For discussing open questions.
#1157
opened Dec 18, 2023 by
haileyschoelkopf
6 tasks
Add task variants replicating Llama 1 / 2 evaluation numbers
feature request
A feature that isn't implemented yet.
#1078
opened Dec 7, 2023 by
haileyschoelkopf
Implement XQuAD
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#191
opened Jun 10, 2021 by
sdtblck
Implement MLQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#192
opened Jun 10, 2021 by
sdtblck
Implement TyDiQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#193
opened Jun 10, 2021 by
sdtblck
MNLI task giving (very) different results than the HuggingFace task accuracy metric
bug
Something isn't working.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#320
opened May 8, 2022 by
JunShern
"Please select a token to use as pad_token" error for alpaca-lora-7b model
#434
opened Apr 24, 2023 by
oshev
Add TheoremQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#518
opened May 23, 2023 by
StellaAthena