Inverse Scaling Tasks? #1442

RylanSchaeffer · 2024-02-18T19:33:12Z

Apologies if this has been asked before, but I couldn't find the answer in lm_eval/tasks or in any existing issues. Are there plans to add the Inverse Scaling Prize tasks (https://github.com/inverse-scaling/prize) to lm-evaluation-harness?


haileyschoelkopf · 2024-02-19T18:41:30Z

Hasn't been asked before!

Supporting these tasks as originally implemented would be very nice! We ourselves probably won't have the bandwidth for it soon, but if anyone wishes to contribute them we'd be happy to assist and review.

h-albert-lee · 2024-02-21T13:01:53Z

That looks interesting. Do you mind if I try implementing it?

haileyschoelkopf · 2024-02-21T14:10:37Z

Yes, that'd be fantastic if you're interested!

h-albert-lee · 2024-02-22T00:14:42Z

Thank you for assigning this to me. I'll get to work soon!

h-albert-lee · 2024-02-27T12:30:51Z

To avoid any possible issues, I'm currently asking in the Inverse Scaling Slack whether it's okay to implement these tasks. I will start implementing them as soon as I get approval.

h-albert-lee · 2024-03-09T02:25:36Z

@RylanSchaeffer @haileyschoelkopf The initial implementation is done. All that's left is to test that it reproduces the results in the paper. I'll open a pull request once I've verified the results.

RylanSchaeffer · 2024-03-09T02:28:04Z

Awesome :) Cheers, Rylan Schaeffer


h-albert-lee · 2024-03-13T03:58:37Z

@haileyschoelkopf
Hi! I compared the scores reported in the paper with the scores I obtained through lm-eval-harness for the OPT models from 125M to 6.7B, and there is a slight difference. Does lm-eval-harness tolerate a certain amount of score difference?

h-albert-lee · 2024-03-13T08:55:13Z

Alternatively, I could use the evaluation code from the Inverse Scaling Prize as a custom metric. I haven't heard back from the Inverse Scaling team, but I did find some related work on the authors' GitHub.

haileyschoelkopf · 2024-03-13T12:47:52Z

@h-albert-lee That’s great progress!

Would you be able to open a PR with your implementation and resulting scores so we can discuss there? It’s hard to say without being able to look at the implementation differences/concrete numbers.

h-albert-lee · 2024-03-13T13:13:59Z

@haileyschoelkopf Thanks a lot! I'll run pre-commit and open a pull request with my experimental results soon.

haileyschoelkopf added the help wanted, feature request, and good first issue labels Feb 19, 2024

haileyschoelkopf assigned h-albert-lee Feb 21, 2024

h-albert-lee linked a pull request Mar 16, 2024 that will close this issue

#1442 inverse scaling tasks implementation #1589

Open

