-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Idea for Evals: Count how many numbers are greater than or less than X #785
Labels
Idea for Eval
These issues keep track of requests for different kinds of eval PRs
Comments
12 tasks
Created a PR for this issue: #856 @andrew-openai FYI :) |
Additionally, created another PR #878. This second one is similar but slightly different than the one mentioned above. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Note: I can develop this feature - creating the issue to get some feedback before development
I don't currently have GPT4 api access, although I do have chatGPT plus. Using the GPT4 engine I have tested this idea with the following examples. I've included screenshots below as evidence of this behavior.
Example 1
input:
How many times does a number appear that is great than 0? Respond with your answer only.
3, -3, 3, 4, -4, -2, 1, -4, 5, 0, 3, 0, -4, 4, 1, 0, 1, -4, -1, 5, 0, -3, 1, 5, 3, 4, -2, 0, 5, 1, 1, 3, 2, -4, 3, 0, 5, 4, -2, 1, -3, 0, -2, -3, -5, 0, -2, 0, 1, -1
ideal:
17
response:
18
Example 1 retry in a new window
input:
How many times does a number appear that is great than 0? Respond with your answer only.
3, -3, 3, 4, -4, -2, 1, -4, 5, 0, 3, 0, -4, 4, 1, 0, 1, -4, -1, 5, 0, -3, 1, 5, 3, 4, -2, 0, 5, 1, 1, 3, 2, -4, 3, 0, 5, 4, -2, 1, -3, 0, -2, -3, -5, 0, -2, 0, 1, -1
ideal:
17
response:
19
Example 2
input:
How many times does a number appear that is great than 0? Respond with your answer only.
0, 1, -4, 3, -3, 0, -1, 1, -3, 4
ideal:
4
response:
5
Let me know what you all think. This would be my first contribution to open source - very exciting!
The text was updated successfully, but these errors were encountered: