EleutherAI / lm-evaluation-harness Public

Issues: EleutherAI/lm-evaluation-harness

33 Open 23 Closed

Issues list

Check compatibility of local-completions with VLLM (returns logits) for multiple_choice tasks bug

Something isn't working.

#1949 opened Jun 11, 2024 by haileyschoelkopf updated Jun 11, 2024

Add MMLU-Pro Dataset feature request

A feature that isn't implemented yet.

good first issue

Good for newcomers

help wanted

Contributors and extra help welcome.

#1947 opened Jun 11, 2024 by haileyschoelkopf updated Jun 11, 2024

Add Regression Testing feature request

A feature that isn't implemented yet.

good first issue

Good for newcomers

help wanted

Contributors and extra help welcome.

#1883 opened May 24, 2024 by haileyschoelkopf updated May 31, 2024

[Discussion] Add Major Code Benchmarks opinions wanted

For discussing open questions.

#1157 opened Dec 18, 2023 by haileyschoelkopf updated May 28, 2024

6 tasks

Add a docs FAQ section documentation

Improvements or additions to documentation.

#1676 opened Apr 5, 2024 by haileyschoelkopf updated May 27, 2024

Add New Lambada Translations good first issue

Good for newcomers

#1501 opened Feb 29, 2024 by haileyschoelkopf updated May 27, 2024

Allow Task objects to defer dataset download feature request

A feature that isn't implemented yet.

good first issue

Good for newcomers

help wanted

Contributors and extra help welcome.

#1558 opened Mar 11, 2024 by haileyschoelkopf updated May 22, 2024

Allow --include_path to import an externally-defined LM subclass feature request

A feature that isn't implemented yet.

good first issue

Good for newcomers

help wanted

Contributors and extra help welcome.

#1457 opened Feb 22, 2024 by haileyschoelkopf updated May 15, 2024

[Discussion/Feedback] VLM + Multimodal benchmarking opinions wanted

For discussing open questions.

#1155 opened Dec 18, 2023 by haileyschoelkopf updated May 13, 2024

Add More Tests feature request

A feature that isn't implemented yet.

#1827 opened May 12, 2024 by haileyschoelkopf updated May 12, 2024

New Task Request: LegalBench feature request

A feature that isn't implemented yet.

good first issue

Good for newcomers

help wanted

Contributors and extra help welcome.

#1754 opened Apr 26, 2024 by haileyschoelkopf updated Apr 26, 2024

[New Task] CommonsenseQA feature request

A feature that isn't implemented yet.

good first issue

Good for newcomers

help wanted

Contributors and extra help welcome.

#1026 opened Nov 27, 2023 by haileyschoelkopf updated Apr 17, 2024

Better Document Data-Parallel interface / clean it up feature request

A feature that isn't implemented yet.

#1684 opened Apr 7, 2024 by haileyschoelkopf updated Apr 7, 2024

Cleanup Dependencies Further feature request

A feature that isn't implemented yet.

#1683 opened Apr 7, 2024 by haileyschoelkopf updated Apr 7, 2024

Add better test coverage for models feature request

A feature that isn't implemented yet.

good first issue

Good for newcomers

help wanted

Contributors and extra help welcome.

#1613 opened Mar 20, 2024 by haileyschoelkopf updated Apr 7, 2024

Add docstring for HFLM's many keyword args documentation

Improvements or additions to documentation.

feature request

A feature that isn't implemented yet.

good first issue

Good for newcomers

help wanted

Contributors and extra help welcome.

#1682 opened Apr 7, 2024 by haileyschoelkopf updated Apr 7, 2024

Add nemo LM class to table of supported models / libraries documentation

Improvements or additions to documentation.

#1681 opened Apr 7, 2024 by haileyschoelkopf updated Apr 7, 2024

Add alternate (configurable) launcher / orchestration + sweep functionality

#1622 opened Mar 22, 2024 by haileyschoelkopf updated Mar 22, 2024

Make managing task variants / subversions easier feature request

A feature that isn't implemented yet.

#1602 opened Mar 18, 2024 by haileyschoelkopf updated Mar 18, 2024

Add task variants replicating Llama 1 / 2 evaluation numbers feature request

A feature that isn't implemented yet.

#1078 opened Dec 7, 2023 by haileyschoelkopf updated Mar 16, 2024

Make Adding New MCQA Metrics Easier feature request

A feature that isn't implemented yet.

#1585 opened Mar 15, 2024 by haileyschoelkopf updated Mar 15, 2024

Expose Configuration Options for Perplexity calculations feature request

A feature that isn't implemented yet.

#1565 opened Mar 12, 2024 by haileyschoelkopf updated Mar 12, 2024

Refactor main evaluate() loop into more readable sub-functions documentation

Improvements or additions to documentation.

feature request

A feature that isn't implemented yet.

#1100 opened Dec 11, 2023 by haileyschoelkopf updated Feb 6, 2024

Speed up + streamline prompt template rendering runtime feature request

A feature that isn't implemented yet.

help wanted

Contributors and extra help welcome.

#1286 opened Jan 15, 2024 by haileyschoelkopf updated Jan 29, 2024

Organize / Cleanup Logging + Levels documentation

Improvements or additions to documentation.

feature request

A feature that isn't implemented yet.

#1192 opened Dec 21, 2023 by haileyschoelkopf updated Jan 5, 2024
