
the loss of pythia training #97

Closed
Wangpeiyi9979 opened this issue Apr 28, 2023 · 3 comments
Comments

@Wangpeiyi9979

Hi, is there any information about the loss of the Pythia pre-training process, like there is for LLaMA?

@haileyschoelkopf
Collaborator

https://wandb.ai/eleutherai/pythia?workspace=user-schoelkopf

Hi! We have a public wandb board (linked above) for our training runs. If you filter it to runs without a "crashed" status that lasted longer than 1 hour, the runs named "v2-MODELSIZE" should be what you want. (I can help point out specific runs if needed; it is on my to-do list to clean this up.)
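If it helps, the same filtering can also be done programmatically with the wandb public API instead of paging through the web UI. This is only a minimal sketch under a few assumptions: it assumes the project path is `eleutherai/pythia` as in the link above, that the "v2-" naming convention applies, and that the loss metric key is `train/lm_loss` (check `run.summary` on one run for the exact key the Pythia runs log).

```python
import wandb

# May require `wandb login` (or an API key) even for public projects.
api = wandb.Api()

# Public Pythia training project referenced above.
runs = api.runs("eleutherai/pythia")

for run in runs:
    # Apply the filtering advice from the comment above:
    # skip crashed runs and runs shorter than one hour.
    if run.state == "crashed":
        continue
    # "_runtime" (seconds) is a summary field wandb records automatically.
    if run.summary.get("_runtime", 0) < 3600:
        continue
    # Keep only the "v2-MODELSIZE" runs.
    if not run.name.startswith("v2-"):
        continue

    # Metric key is an assumption -- inspect run.summary for the real name.
    history = run.history(keys=["train/lm_loss"], samples=2000)
    print(run.name, len(history))
```

`run.history()` returns a pandas DataFrame, so the loss curves can be plotted or compared directly against another model's training log.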

@Wangpeiyi9979
Author

Thanks

@xiaoda99

Hi, I can only find runs for the 160M model in the above wandb link, and loading the many pages is very slow. Could you provide a cleaned version of the runs for v2 6.9B and 12B (without dedup)? I'm doing experiments on some new transformer architectures and want to compare my training loss results with Pythia's.
