How to train this model/ #57

Junye-Chen · 2023-10-20T10:38:54Z

Thank you for your outstanding work.
I tried to train the model with the default hyperparameters, except batchsize=3, on 3× RTX 4090, but the loss values became NaN at an early stage (around 4000 iterations). I would appreciate some advice on training. Thanks!
