Issues: EleutherAI/lm-evaluation-harness
Issues list
Implement MLQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#192
opened Jun 10, 2021 by
sdtblck
Implement TyDiQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#193
opened Jun 10, 2021 by
sdtblck
MNLI task giving (very) different results than the HuggingFace task accuracy metric
bug
Something isn't working.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#320
opened May 8, 2022 by
JunShern
"Please select a token to use as pad_token" error for alpaca-lora-7b model
#434
opened Apr 24, 2023 by
oshev
Add TheoremQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#518
opened May 23, 2023 by
StellaAthena
Pile tasks on big-refactor use dataset_names from old dataset loader that don't exist on HF
bug
Something isn't working.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#731
opened Aug 3, 2023 by
yeoedward
Quac Dataset
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
#827
opened Sep 4, 2023 by
RanchiZhao
Implement the SuperGLUE evaluation
feature request
A feature that isn't implemented yet.
#22
opened Sep 16, 2020 by
StellaAthena
1 of 2 tasks
TGI support - API evaluation of HF models
feature request
A feature that isn't implemented yet.
help wanted
Contributors and extra help welcome.
#869
opened Sep 19, 2023 by
ManuelFay
"RuntimeError: CUDA out of memory" on lm-eval 0.3.0 through GPT-NeoX evaluate past a certain number of nodes
bug
Something isn't working.
duplicate
This issue or pull request already exists.
help wanted
Contributors and extra help welcome.
#884
opened Sep 23, 2023 by
AIproj
toxigen task measures toxicity classification rather than whether generations are toxic?
#974
opened Nov 8, 2023 by
laphang
Stability Upstream translated task
feature request
A feature that isn't implemented yet.
#1006
opened Nov 20, 2023 by
StellaAthena
The tokenizer add_special_tokens parameter for t5 model lambada task
#1017
opened Nov 22, 2023 by
daisyden
[New Task] CommonsenseQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1026
opened Nov 27, 2023 by
haileyschoelkopf
Should num_fewshot be type list?
feature request
A feature that isn't implemented yet.
#837
opened Sep 6, 2023 by
Wehzie