error when trying to train raindrop classification on multiple gpu #147

Closed
max6457 opened this issue Jun 22, 2023 · 9 comments · Fixed by #149 or #150
Labels
bug Something isn't working

Comments

@max6457

max6457 commented Jun 22, 2023

1. System Info + Information

system info: torch 2.0.1, pypots 0.1.1 - gpu: 8x RTX 4090
problem: When training the Raindrop model as usual, I wanted to make use of all my GPUs. I did everything as in the documentation, but after changing the device variable to a list I got the following error.
Thanks a lot!

2. Reproduction

raindrop = Raindrop(
    n_steps = X.shape[1],
    n_features = X.shape[2],
    ...
    num_workers = 8,
    ...
    device = ['cuda:0', 'cuda:1'],
    ...
)

[Image: Unbenannt (Untitled)]

4. Expected behavior

no error
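The report turns on the `device` argument accepting either a single string (`'cuda'`) or a list (`['cuda:0', 'cuda:1']`). As a rough illustration of that idea only, here is a minimal sketch of how such an argument might be normalized into a list before use. The helper name `normalize_devices`, the `None` to CPU fallback, and the CUDA-only rule for multiple devices are all assumptions made for this example; this is not PyPOTS's actual code and it does not reproduce the bug.

```python
from typing import List, Optional, Sequence, Union


def normalize_devices(device: Optional[Union[str, Sequence[str]]]) -> List[str]:
    """Return the requested devices as a list of device strings.

    Hypothetical helper for illustration; not part of PyPOTS.
    """
    if device is None:
        # Assumption: fall back to CPU when nothing is specified.
        return ["cpu"]
    if isinstance(device, str):
        # A single device such as "cuda" or "cuda:0".
        devices = [device]
    elif isinstance(device, (list, tuple)):
        devices = list(device)
    else:
        raise TypeError(f"device must be a str or a list of str, got {type(device).__name__}")

    if not devices:
        raise ValueError("device list must not be empty")
    for d in devices:
        if not isinstance(d, str):
            raise TypeError(f"each device must be a str, got {type(d).__name__}")

    # Assumption: training on several devices only makes sense for CUDA devices.
    if len(devices) > 1 and not all(d.startswith("cuda") for d in devices):
        raise ValueError("multiple devices must all be CUDA devices")
    return devices


print(normalize_devices("cuda"))
print(normalize_devices(["cuda:0", "cuda:1"]))
```

With a helper like this, the single-GPU and multi-GPU configurations go through the same code path, which makes it easier to see where the two cases start to diverge.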

max6457 added the bug label Jun 22, 2023
@WenjieDu
Owner

Hi there 👋,

Thank you so much for your attention to PyPOTS! If you find PyPOTS helpful to your work, please star⭐️ this repository. Your star is your recognition, which can help more people notice PyPOTS and grow the PyPOTS community. It matters and is definitely a kind of contribution to the community.

I have received your message and will respond ASAP. Thank you for your patience! 😃

Best,
Wenjie

@WenjieDu
Owner

Hey Max, thank you for reporting this issue! Let me confirm one thing with you first: Raindrop runs smoothly with a single GPU on your machine but fails on multiple GPUs, right?

@max6457
Author

max6457 commented Jun 25, 2023

> Hey Max, thank you for reporting this issue! Let me confirm one thing with you first: Raindrop runs smoothly with a single GPU on your machine but fails on multiple GPUs, right?

Yes, that's right. On one GPU it runs without errors (when I just set device='cuda'). Thanks for your answer and your work :)

@WenjieDu
Owner

I just pushed a commit to the fix_raindrop branch to fix this bug, and I've tested it on my local machine. Could you please try it as well and give me your feedback? First install the code from that branch with the command pip install https://github.com/WenjieDu/PyPOTS/archive/fix_raindrop.zip and then run your test.
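After installing from a branch archive, it can help to confirm which pypots distribution the environment actually picks up before re-running the test. The sketch below uses only the standard library; the helper name `installed_version` is made up for this example. Note that the reported version string alone may not distinguish a branch build from a regular release.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when pypots is not installed here.
print("pypots version:", installed_version("pypots"))
```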

@max6457
Author

max6457 commented Jun 30, 2023 via email

@WenjieDu
Owner

Many thanks. After confirming it works well for you, I'll merge PR #149 into the main branch for the next release.

@WenjieDu
Owner

WenjieDu commented Jul 4, 2023

Hi Max, did you have a chance to give it a shot?

@max6457
Author

max6457 commented Jul 4, 2023 via email

@WenjieDu
Owner

WenjieDu commented Jul 4, 2023

Great! Thanks for your reply. Will merge this PR.
