DataLoader docs update to describe how workers are managed, including Windows. #18091

Closed
wants to merge 3 commits into from

Conversation

mike9ant
Contributor

It's been hard to understand how workers are launched and what code runs in the worker vs. the main process, especially on Windows, which leads to many of our samples failing. This explains when workers run and how to make code work on Windows as well.

@mike9ant mike9ant requested review from soumith and ssnl March 15, 2019 23:26
Collaborator

@ssnl ssnl left a comment

Thanks for the note. Although a bunch of this is about how to work with Python multiprocessing, I think it is still great to have it. :)

torch/utils/data/dataloader.py (outdated)
child workers to access the dataset and argument functions directly through the
cloned address space, on Windows another Python interpreter is launched which runs
your main script, followed by the internal worker function. Windows worker functions
receive dataset, collate_fn and other arguments through Pickle serialization.
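
The pickling requirement in the excerpt above can be checked without PyTorch. The following is only a minimal sketch using the standard library: `SquaresDataset`, `collate_as_list`, and `survives_pickle` are hypothetical names invented for illustration and are not part of the PR or of `torch.utils.data`. It assumes that whatever is passed as the dataset or `collate_fn` has to survive a pickle round trip when workers are started in a fresh interpreter.

```python
import pickle


class SquaresDataset:
    """Hypothetical map-style dataset, defined at module top level so pickle can find it."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        return idx * idx


def collate_as_list(batch):
    """Hypothetical collate function, defined at top level so it pickles by reference."""
    return list(batch)


def survives_pickle(obj):
    """Return True if obj can be pickled and unpickled in this process."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # On Windows the main script is run again in each worker, so code that
    # builds and iterates a loader with workers belongs under this guard.
    print("dataset picklable:", survives_pickle(SquaresDataset(4)))
    print("top-level collate picklable:", survives_pickle(collate_as_list))
    print("lambda collate picklable:", survives_pickle(lambda batch: batch))
```

Run as a script, the top-level class and function should pass the check while the lambda should not, which is one common reason a loader that works with forked workers fails once workers are started through a new interpreter.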
Collaborator

Every argument should be formatted as :attr:`xxx`

Contributor Author

Which arguments are you referring to - function names? What will that do? I think it looks good enough at this point :)

Collaborator

ssnl commented Apr 25, 2019

@soumith I'm fine with merging this. I'm redoing the dataloader doc in #19228 and will incorporate this there.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@soumith merged this pull request in 698103c.

zhangguanheng66 pushed a commit to zhangguanheng66/pytorch that referenced this pull request May 6, 2019
… Windows. (pytorch#18091)

Summary:
It's been hard to understand how workers are launched and what code runs in the worker vs. the main process, especially on Windows, which leads to many of our samples failing. This explains when workers run and how to make code work on Windows as well.
Pull Request resolved: pytorch#18091

Differential Revision: D15083766

Pulled By: soumith

fbshipit-source-id: 8a7e60defc8a72ec63874f657d7d5267d951dccf