Cannot use Patch Queue together with multi-GPU training (via pytorch-lightning) #890
Comments
More than a bug, this should be considered a request for enhancement. It requires reviewing the torchio code and making it picklable, so that it can be sent to the subprocesses created by pytorch-lightning. Have you tried using another multi-GPU strategy with pytorch-lightning, like "ddp" or "deepspeed"? I was able to start a training in both cases. The only drawback is that the dataloaders/queues are duplicated in each of the processes created this way (one per GPU).
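For reference, a minimal sketch of what switching strategies could look like; the `strategy` argument follows recent PyTorch Lightning releases (older versions used `accelerator="ddp"` instead), and `model` is assumed to be a `LightningModule` defined elsewhere:

```python
import pytorch_lightning as pl

# "ddp" launches one process per GPU; each process builds its own
# tio.Queue, which is why the queues are duplicated as described above.
trainer = pl.Trainer(
    accelerator="gpu",
    devices=2,
    strategy="ddp",  # or "deepspeed", which requires the deepspeed package
)
trainer.fit(model)  # `model` is a LightningModule defined elsewhere
```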
I'm having the same issue when using:

```python
train_patches_queue = tio.Queue(...)
val_patches_queue = tio.Queue(...)
```

and in my DataLoader:

```python
batch_size = 2
train_loader = torch.utils.data.DataLoader(train_patches_queue, batch_size=batch_size, num_workers=8)
```

I'm using a single GPU with 6 GB.
The tutorial works fine with Lightning. If anyone is having this issue, can you please share a minimal reproducible example?
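For context, a minimal example following the Queue pattern from the TorchIO documentation could look like the sketch below (the subject paths, patch size, and queue settings are placeholders). Note that the queue handles parallelism itself: the documentation requires `num_workers=0` in the DataLoader that consumes the queue, so the `num_workers=8` in the comment above may be part of the problem.

```python
import torch
import torchio as tio

subjects = [
    tio.Subject(image=tio.ScalarImage('subject1.nii.gz')),  # placeholder paths
    tio.Subject(image=tio.ScalarImage('subject2.nii.gz')),
]
subjects_dataset = tio.SubjectsDataset(subjects)

sampler = tio.data.UniformSampler(patch_size=64)
patches_queue = tio.Queue(
    subjects_dataset,
    max_length=300,
    samples_per_volume=10,
    sampler=sampler,
    num_workers=4,  # workers that load subjects and extract patches
)
patches_loader = torch.utils.data.DataLoader(
    patches_queue,
    batch_size=2,
    num_workers=0,  # must be 0: the queue already uses its own workers
)

for batch in patches_loader:
    inputs = batch['image'][tio.DATA]  # shape: (B, C, W, H, D)
```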
Bug summary
I am trying to use PyTorch Lightning to handle model training (it makes multi-GPU training and other training tricks easy to use) while using TorchIO as the data loader, but I always get errors.
Code for reproduction
Actual outcome
This is just pseudocode.
Error messages
Expected outcome
I hope to use the TorchIO data loader in a multi-GPU training script.
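The original reproduction code was not captured here. As an illustration of the kind of setup being described, the queue can be built inside a LightningDataModule's `setup()` so that each DDP process constructs its own copy rather than pickling one across processes; the `PatchesDataModule` name and all parameter values below are illustrative, not from the original report.

```python
import pytorch_lightning as pl
import torch
import torchio as tio

class PatchesDataModule(pl.LightningDataModule):  # illustrative name
    def __init__(self, subjects, patch_size=64):
        super().__init__()
        self.subjects = subjects
        self.patch_size = patch_size

    def setup(self, stage=None):
        # Building the queue here (rather than in __init__) means each
        # DDP process constructs its own queue, which may sidestep the
        # pickling problem when subprocesses are spawned.
        dataset = tio.SubjectsDataset(self.subjects)
        sampler = tio.UniformSampler(patch_size=self.patch_size)
        self.queue = tio.Queue(
            dataset,
            max_length=300,
            samples_per_volume=10,
            sampler=sampler,
            num_workers=4,
        )

    def train_dataloader(self):
        # num_workers must stay 0 here: the queue has its own workers
        return torch.utils.data.DataLoader(self.queue, batch_size=2, num_workers=0)
```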
System info