
the loss of pythia training #97

Closed
Wangpeiyi9979 opened this issue Apr 28, 2023 · 3 comments
Comments

@Wangpeiyi9979

Hi, is there any information about the loss of the Pythia pre-training process, like there is for LLaMA?

@haileyschoelkopf
Collaborator

https://wandb.ai/eleutherai/pythia?workspace=user-schoelkopf

Hi! We have a public wandb board (linked above) for our training runs. If you filter it to runs without a "crashed" status that lasted longer than 1 hour, the runs named "v2-MODELSIZE" should be what you want. (I can help point out specific runs if needed; it is on my to-do list to clean this up.)
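If it helps, the same filtering can also be done programmatically with the wandb public API instead of paging through the web UI. This is only a minimal sketch under a few assumptions: it assumes the project path is `eleutherai/pythia` as in the link above, that the "v2-" naming convention applies, and that the loss metric key is `train/lm_loss` (check `run.summary` on one run for the exact key the Pythia runs log).

```python
import wandb

# May require `wandb login` (or an API key) even for public projects.
api = wandb.Api()

# Public Pythia training project referenced above.
runs = api.runs("eleutherai/pythia")

for run in runs:
    # Apply the filtering advice from the comment above:
    # skip crashed runs and runs shorter than one hour.
    if run.state == "crashed":
        continue
    # "_runtime" (seconds) is a summary field wandb records automatically.
    if run.summary.get("_runtime", 0) < 3600:
        continue
    # Keep only the "v2-MODELSIZE" runs.
    if not run.name.startswith("v2-"):
        continue

    # Metric key is an assumption -- inspect run.summary for the real name.
    history = run.history(keys=["train/lm_loss"], samples=2000)
    print(run.name, len(history))
```

`run.history()` returns a pandas DataFrame, so the loss curves can be plotted or compared directly against another model's training log.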

@Wangpeiyi9979
Author

Thanks

@xiaoda99

Hi, I can only find runs for the 160M model in the above wandb link, and loading the many pages is very slow. Could you provide a cleaned version of the runs for v2 6.9B and 12B (without dedup)? I'm doing experiments on some new transformer architectures and want to compare my training loss results with Pythia's.
