openai / evals Public

Issues: openai/evals

84 Open 117 Closed

Issues list

Safety Eval Idea: allergen information of different food products in the Israeli market.

#874 opened Apr 30, 2023 by ido777

Mandarin homophones eval 🇨🇳

#689 opened Apr 15, 2023 by ofou

Taxonomy to use for model evaluation?

#706 opened Apr 17, 2023 by qrdlgit

Where to find the experiment comparison: Using the data of training reward model for fine-tuning without reinforcement learning.

#711 opened Apr 18, 2023 by guotong1988

Update documentation regarding custom evals

#715 opened Apr 18, 2023 by PreciousWarrior

fix error

#765 opened Apr 23, 2023 by Huymoixode

Idea for Evals: Sorting numbers with repeats and negatives

Label: Idea for Eval (these issues keep track of requests for different kinds of eval PRs)

#782 opened Apr 23, 2023 by voynow

Idea for Evals: Count how many numbers are greater than or less than X

Label: Idea for Eval

#785 opened Apr 23, 2023 by voynow

docs out of date

Label: bug (something isn't working)

#929 opened May 7, 2023 by rustam-e

Please add an option to change language or understand other languages

#795 opened Apr 24, 2023 by 0XxMuzanxX0

Idea for eval - Bible knowledge as tested in the yearly International Bible youth Contest

#840 opened Apr 27, 2023 by ido777

Idea for Evals: improve abstract logic abilities

#848 opened Apr 27, 2023 by karinageneraly

Are unmerged PRs the result of irrelevance to the model?

#873 opened Apr 30, 2023 by albukirky1

Suggestion: add git hook for pre-commit

#384 opened Mar 21, 2023 by machinekoder

Inaccuracy of AI-generated responses

Label: bug

#906 opened May 3, 2023 by vmn2014

Feature suggestion - Save and Load Conversation History

Label: Idea for Eval

#907 opened May 3, 2023 by wawryszukd

pip install evals throws AssertionError

Label: bug

#918 opened May 4, 2023 by CholoTook

Eval idea: Security code review for unicode attacks on code

Label: Idea for Eval

#787 opened Apr 24, 2023 by qrdlgit

Make GPT4 aware of the evals format

#143 opened Mar 15, 2023 by bhack

Add BigBench Tasks for evaluation

Label: Idea for Eval

#153 opened Mar 15, 2023 by Muhtasham

Evaluation on computer vision benchmarks

Label: Idea for Eval

#235 opened Mar 16, 2023 by finitearth

Evaluate GPT-4 on classical NLP tasks

Label: Idea for Eval

#246 opened Mar 16, 2023 by LifeIsStrange

Update Autoflake to Ruff for faster pre-commit

#347 opened Mar 19, 2023 by imjuanleonard

Windows path and unicode decoding

#379 opened Mar 21, 2023 by ulasdilek

Create an evaluation that measures a model's ability to remember specifics about texts in its dataset?

Label: Idea for Eval

#383 opened Mar 21, 2023 by mrconter1
