Validation Perplexities #38

Closed
shoaibahmed opened this issue Dec 21, 2022 · 5 comments

Comments

@shoaibahmed

Thanks for sharing this amazing work. This will hopefully help in developing a better understanding of how LLMs work.

I had one question. Are the validation perplexities for each of the models available (ideally with every model snapshot) so that we can compare models on equal footing?

@haileyschoelkopf
Collaborator

Hi! We're actively working on gathering all evals and getting them posted in this repo, including LAMBADA perplexity. If you're wondering about validation perplexity on the Pile, unfortunately, to save compute, we evaluated on the validation set only very infrequently during training.

We plan to have LAMBADA perplexity up for all models on this repo for 15 evenly spaced checkpoints (steps 3000, 13000,..., 133000, 143000) asap! Would you want or need more granular results for any experiments?
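
A minimal sketch of pulling one of those intermediate checkpoints, assuming they are published on the Hugging Face Hub with per-step revisions (the model name and revision string below are illustrative):

```python
# Load an intermediate checkpoint from the Hugging Face Hub.
# Model name and revision are illustrative; substitute whichever of the
# 15 evenly spaced checkpoints (steps 3000, 13000, ..., 143000) you need.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-160m"
revision = "step143000"

tokenizer = AutoTokenizer.from_pretrained(model_name, revision=revision)
model = AutoModelForCausalLM.from_pretrained(model_name, revision=revision)
```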

@shoaibahmed
Copy link
Author

Hi! Thank you for your kind response.

> We plan to have LAMBADA perplexity up for all models on this repo for 15 evenly spaced checkpoints (steps 3000, 13000,..., 133000, 143000) asap! Would you want or need more granular results for any experiments?

That's great! Thank you for the awesome work. I was using exactly those checkpoints, so having the LAMBADA perplexity (which should correlate well with validation perplexity on the Pile) will help.
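
For context, a perplexity like this is just the exponential of the mean per-token negative log-likelihood, which is why the two numbers would be expected to move together. A minimal sketch of computing it by hand, assuming the checkpoints are on the Hugging Face Hub (model name, revision, and the evaluation text below are placeholders):

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model name / revision, as in the sketch above.
model_name = "EleutherAI/pythia-160m"
revision = "step143000"
tokenizer = AutoTokenizer.from_pretrained(model_name, revision=revision)
model = AutoModelForCausalLM.from_pretrained(model_name, revision=revision)
model.eval()

# Placeholder for whatever held-out validation text is being measured.
text = "Some held-out validation text goes here."
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy
    # over the predicted tokens.
    out = model(**enc, labels=enc["input_ids"])

# Perplexity = exp(mean per-token negative log-likelihood).
print(f"perplexity: {math.exp(out.loss.item()):.2f}")
```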

@haileyschoelkopf
Collaborator

Fantastic, ideally we should have those up and all corrected within a week!

@haileyschoelkopf
Collaborator

Hi @shoaibahmed! I believe all models' evaluations should now be up to date.

Let me know if any evals look suspect or anything's missing! Hopefully all is well, though, barring some cleanup of filenames. (Also, all LAMBADA results refer to the lambada_openai task in the Eleuther lm-evaluation-harness.)

Also let me know if having more granular evals/PPL would be helpful to your research for any reason :)
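
For anyone wanting to reproduce these numbers, a rough sketch of scoring lambada_openai with the harness (this assumes a recent lm-evaluation-harness install; exact argument names and result keys may differ between versions, and the model name/revision are again illustrative):

```python
import lm_eval

# Evaluate a single checkpoint on lambada_openai. Recent versions of the
# harness expose simple_evaluate directly from the top-level package.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m,revision=step143000",
    tasks=["lambada_openai"],
)

# Prints the task's metrics, e.g. perplexity and accuracy.
print(results["results"]["lambada_openai"])
```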

@shoaibahmed
Author

That's really awesome. Thank you for your prompt action on this. I will go ahead and close this issue now.

I will open a new issue if anything else comes up that might be worth discussing. Thanks once again for your time and effort.
