Issues: openai/evals
Cannot pass the check and got this error: KeyError: 'sample'
bug
Something isn't working
#1012
opened May 23, 2023 by
14H034160212
Sample evaluations completing after timeout cause duplicate results
bug
#1333
opened Aug 10, 2023 by
robatwilliams
Having trouble building Evals locally? Try this.
bug
#1340
opened Aug 25, 2023 by
silverfoxf7
How to eval output with ideal_answer directly without having to define the completion_fn?
#1342
opened Aug 29, 2023 by
liuyaox
In the task "balance_chemical_equation", many instances have incorrect labels.
bug
#1386
opened Oct 19, 2023 by
dongZheX
Evaluation on computer vision benchmarks
Idea for Eval
These issues keep track of requests for different kinds of eval PRs
#235
opened Mar 16, 2023 by
finitearth
Idea for Evals: GPT matches lyrics with song name
Idea for Eval
#390
opened Mar 21, 2023 by
Ein-Tim
Idea for Evals: Complex, multi-turn instruction-following Evals
Idea for Eval
#632
opened Apr 11, 2023 by
andrew-openai
Evaluate GPT-4 on classical NLP tasks
Idea for Eval
#246
opened Mar 16, 2023 by
LifeIsStrange
Idea for Evals: Count how many numbers are greater than or less than X
Idea for Eval
#785
opened Apr 23, 2023 by
voynow
Feature suggestion - Save and Load Conversation History
Idea for Eval
#907
opened May 3, 2023 by
wawryszukd