Would it be possible to share training loss curves on the original Pythia models? #145

Closed
itsnamgyu opened this issue Jan 16, 2024 · 4 comments

Comments

@itsnamgyu

No description provided.

@StellaAthena
Member

We are working on collecting them. Unfortunately, we didn't have WandB configured in a way that makes that easy, and we are struggling to clean it up after the fact.

If you're okay with only having the ones for small models, @oskarvanderwal is retraining some of them with different random seeds and (I think) has better logging set up.

@itsnamgyu
Author

Thanks. @oskarvanderwal, would it be possible to share the pre-training loss curves of the smaller Pythia models?

@oskarvanderwal

oskarvanderwal commented Jan 22, 2024

Hi @itsnamgyu, we are actually collecting all the loss curves for the smaller Pythia models (14m, 31m, 70m, 160m, 410m) for different seeds, and we'll share them on the Pythia GitHub once finished. Note: these are for the non-deduped training corpus.

In the meantime, you can find some of these curves here (not the original ones from the paper): https://wandb.ai/eleutherai/pythia-extra-seeds/reports/Some-loss-curves-for-smaller-Pythia-models--Vmlldzo2NTkxNDIw

Be aware that if we had to stop and continue the training of a particular model (e.g., because of run priority), WandB logs these as separate runs!

The original loss curves are in the WandB project at https://wandb.ai/eleutherai/pythia, but these logs are much harder to navigate. Again, we are collecting these for the smaller models as well and will share them in a CSV file on this GitHub repo.
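
Because a stopped-and-resumed training job shows up in WandB as several separate runs, anyone reading these logs has to stitch the pieces back together to get one curve per model. The following is only a minimal sketch of that idea, not the maintainers' tooling: the function name `stitch_segments` and the `(step, loss)` input layout are hypothetical, and it assumes each run has already been exported as a list of `(step, loss)` pairs and that the runs are passed in chronological order. Where a resumed run repeats steps already logged by an earlier one, the later run's value is kept.

```python
from typing import Dict, Iterable, List, Tuple

# One exported run: (training step, loss) pairs. This layout is an assumption
# made for illustration; it is not a format defined by WandB or by Pythia.
Segment = Iterable[Tuple[int, float]]


def stitch_segments(segments: List[Segment]) -> List[Tuple[int, float]]:
    """Merge separately logged runs of one training job into a single curve.

    Segments must be given in chronological order. If two segments log the
    same step (for example, a run resumed from an earlier checkpoint), the
    value from the later segment overwrites the earlier one.
    """
    merged: Dict[int, float] = {}
    for segment in segments:
        for step, loss in segment:
            merged[step] = loss  # later segments win on overlapping steps
    # Return the points sorted by training step.
    return sorted(merged.items())


# Made-up numbers, purely to show the call shape.
first_run = [(0, 10.0), (1000, 6.0), (2000, 5.0)]
resumed_run = [(2000, 4.9), (3000, 4.5)]
curve = stitch_segments([first_run, resumed_run])
```

Letting the later run overwrite overlapping steps is a design choice for this sketch; keeping the earlier values, or keeping both and flagging the overlap, may be more appropriate depending on how the job was resumed.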

@itsnamgyu
Author

Thanks! This will be a huge help.
