
assert self.queue_size % batch_size == 0 # for simplicity #101

Closed
scarydemon2 opened this issue Sep 23, 2022 · 2 comments

Comments

@scarydemon2

When I run blip_pretrain, the model often fails after a few hundred steps with the error
`assert self.queue_size % batch_size == 0 # for simplicity`.
I would like to know what conditions can trigger this error.
I am training with DistributedDataParallel, using 4 processes on 4 GPUs.
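For context, the failing assert lives in the MoCo-style feature queue used during pretraining. The sketch below is a simplified pure-Python model of that ring-buffer update (the names `dequeue_and_enqueue`, `state`, and `feats` are illustrative; the real code operates on torch tensors all-gathered across GPUs), showing why a batch that does not evenly divide the queue length trips the assert:

```python
def dequeue_and_enqueue(queue, state, feats):
    """Overwrite the oldest len(feats) slots of `queue` with `feats`.

    queue : list acting as a fixed-size ring buffer of features
    state : dict holding the write pointer, e.g. {"ptr": 0}
    feats : the gathered batch; under DDP its length is
            per_gpu_batch_size * world_size
    """
    queue_size = len(queue)
    batch_size = len(feats)
    # The line from the traceback: the pointer arithmetic below only
    # stays aligned if every batch evenly divides the queue length.
    assert queue_size % batch_size == 0  # for simplicity
    ptr = state["ptr"]
    queue[ptr:ptr + batch_size] = feats  # replace the oldest entries
    state["ptr"] = (ptr + batch_size) % queue_size  # wrap around
```

In practice the assert usually fires on the last batch of an epoch: when the dataset size is not a multiple of the global batch size, that final batch is smaller than the rest. Passing `drop_last=True` to the `DataLoader` (so incomplete batches are discarded) is a common way to keep every batch the same size.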

@scarydemon2 (Author)

I found it.

@EuterpeK

How did you solve it?
