Add EQ-Bench #1459
Labels
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
Comments
We'd welcome a contribution for EQ-Bench!
haileyschoelkopf
added the help wanted, feature request, and good first issue labels on Feb 23, 2024
I can take a look at doing this if no one else already is.
I will be happy to test it :)
haileyschoelkopf
pushed a commit that referenced this issue on Mar 6, 2024
wx-zhang
pushed a commit to wx-zhang/lm-evaluation-harness that referenced this issue on Mar 13, 2024
* Start adding eq-bench
* Start adding to yaml and utils
* Get metric working
* Add README
* Handle cases where answer is not parseable
* Deal with unparseable answers and add percent_parseable metric
* Update README
nightingal3
pushed a commit to mycoalchen/lm-evaluation-harness that referenced this issue on May 2, 2024
djstrong
pushed a commit to speakleash/lm-evaluation-harness that referenced this issue on Aug 2, 2024
EQ-Bench (https://github.com/EQ-bench/EQ-Bench) is becoming increasingly popular. I think it should be possible to implement it here. Do you agree?