The loss of Pythia training #97
Comments
https://wandb.ai/eleutherai/pythia?workspace=user-schoelkopf

Hi! We have a public wandb board (linked above) for our training runs. If you filter this by runs without "crashed" status and runs that lasted longer than 1 hour, then runs named "v2-MODELSIZE" should be what you want. (I can help point out specific runs if needed; it is on my todo list to clean this up.)
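For reference, a minimal sketch of that filtering with the wandb public API, assuming the project path is `eleutherai/pythia` and that run names appear under `displayName`; the 1-hour check is done client-side via the `_runtime` summary key (seconds), since that key being logged is an assumption:

```python
import wandb

api = wandb.Api()

# Server-side filter: non-crashed runs whose display name starts with "v2-".
runs = api.runs(
    "eleutherai/pythia",
    filters={
        "state": {"$ne": "crashed"},
        "displayName": {"$regex": "^v2-"},
    },
)

for run in runs:
    runtime_s = run.summary.get("_runtime", 0)  # wall-clock seconds, if logged
    if runtime_s > 3600:  # keep only runs longer than 1 hour
        print(run.name, run.state, f"{runtime_s / 3600:.1f}h")
```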
Thanks
Hi, I can only find runs for the 160M model in the above wandb link, and loading the many pages is very slow. Could you provide a cleaned version of the runs for v2 6.9B and 12B (without dedup)? I'm doing experiments on some new transformer architectures and want to compare my training loss results with Pythia's.
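If a specific run ID is known, the loss curve can be exported directly rather than browsed page by page. A minimal sketch, where `RUN_ID` is a placeholder and the metric key `train/loss` is an assumption about how the Pythia runs log their loss:

```python
import wandb

api = wandb.Api()
run = api.run("eleutherai/pythia/RUN_ID")  # RUN_ID is hypothetical

# Fetch the logged loss history as a pandas DataFrame and save it for comparison.
history = run.history(keys=["train/loss"], samples=10000)
history.to_csv("pythia_loss.csv", index=False)
```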
Hi, is there any information about the loss of the Pythia pre-training process, like LLaMA's?
![image](https://user-images.githubusercontent.com/42565075/235028606-33ee9834-190c-482b-bc16-9d82f89c16cb.png)