Scalene doesn't work properly with torchrun / torch.distributed.run #823
Comments
You can use |
You might also try specifying |
Actually, I had already tried this as well, and so did the person who asked the question on Stack Overflow. It still gave an error |
I got an error while running Scalene with torch.distributed.run.
I am currently following this doc.
This command runs perfectly, but when I replace python -m with scalene, it raises an error. I think the main issue is that my train_mz.py takes other arguments from the command line, and Scalene is probably passing them as args to the torch.distributed.run.main() function,
although this is just speculation.
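To make the speculation above concrete, here is a minimal sketch of how a wrapper tool can end up consuming or misrouting flags that were meant for the wrapped script. This is not Scalene's actual code. The functions naive_split and split_on_separator, the --outfile option, and the --- separator are all hypothetical names chosen for illustration. Only train_mz.py and torch.distributed.run come from the report, and --nproc_per_node is an existing torchrun flag used as a sample value.

```python
import argparse


def _wrapper_parser():
    # Hypothetical wrapper with a single option of its own.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--outfile", default=None)
    return parser


def naive_split(argv):
    """Parse wrapper options anywhere in argv and forward the rest.

    Any flag whose name collides with a wrapper option is swallowed,
    even when it was written after the target script's name.
    """
    own, rest = _wrapper_parser().parse_known_args(argv)
    return own, rest


def split_on_separator(argv, sep="---"):
    """Only treat arguments before `sep` as candidates for the wrapper.

    Everything after `sep` is forwarded to the target untouched.
    """
    if sep in argv:
        i = argv.index(sep)
        wrapper_args, target_args = argv[:i], argv[i + 1:]
    else:
        wrapper_args, target_args = argv, []
    own, leftover = _wrapper_parser().parse_known_args(wrapper_args)
    return own, leftover + target_args


# Arguments as they might follow the wrapper's name on the command line.
argv = ["-m", "torch.distributed.run", "--nproc_per_node=2",
        "train_mz.py", "--outfile", "model.pt"]

naive_own, naive_rest = naive_split(argv)

argv_sep = ["-m", "torch.distributed.run", "--nproc_per_node=2",
            "train_mz.py", "---", "--outfile", "model.pt"]
sep_own, sep_rest = split_on_separator(argv_sep)

print("naive wrapper kept:", naive_own.outfile, "| forwarded:", naive_rest)
print("separator wrapper kept:", sep_own.outfile, "| forwarded:", sep_rest)
```

In the naive version the training script never sees its own --outfile flag, which is the kind of mix-up I suspect is happening here. Whether Scalene actually behaves this way with torch.distributed.run is still unconfirmed.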
There is also a very similar Stack Overflow question along exactly the same lines.
It would be really nice if someone could help me out here. Thanks.