DataLoader docs update to describe how workers are managed, including Windows. #18091

mike9ant · 2019-03-15T23:25:45Z

It's been hard to understand how workers are launched and what code runs in the worker vs. the main process, especially on Windows, which leads to many of our samples failing. This explains when workers run and how to make code work on Windows as well.

torch/utils/data/dataloader.py

ssnl

Thanks for the note. Although a lot of this is about how to work with Python multiprocessing, I think it is still great to have it. :)

torch/utils/data/dataloader.py

ssnl · 2019-03-16T03:13:35Z

torch/utils/data/dataloader.py

+ child workers to access the dataset and argument functions directly through the
+ cloned address space, on Windows another Python interpreter is launched which runs
+ your main script, followed by the internal worker function. Windows worker functions
+ receive dataset, collate_fn and other arguments through Pickle serialization.


Every argument should be formatted as :attr:`xxx`

Which arguments are you referring to - function names? What will that do? I think it looks good enough at this point :)
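A minimal, standard-library-only sketch of the Windows behaviour the quoted docstring lines describe. It does not use `DataLoader` itself. Instead, it simulates the hand-off with a `pickle` round trip. The names `SquaresDataset`, `sum_collate`, `survives_pickle` and `main` are hypothetical and exist only for illustration. The assumption is that anything passed to a worker must survive pickling, and that the script's entry point must sit behind a `__main__` guard because the worker interpreter runs the main script again.

```python
import pickle


class SquaresDataset:
    """Hypothetical map-style dataset, defined at module top level so it can be pickled."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        return idx * idx


def sum_collate(batch):
    """Hypothetical collate function; top-level functions are pickled by qualified name."""
    return sum(batch)


def survives_pickle(obj):
    """Return True if obj can make the pickle round trip a spawned worker relies on."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def main():
    dataset = SquaresDataset(4)
    # Simulate the hand-off: serialize in the parent, deserialize as a worker would.
    worker_dataset, worker_collate = pickle.loads(pickle.dumps((dataset, sum_collate)))
    samples = [worker_dataset[i] for i in range(len(worker_dataset))]
    print(worker_collate(samples))
    # A lambda has no importable name, so it cannot be sent to a spawned worker.
    print(survives_pickle(lambda batch: batch))


if __name__ == "__main__":
    # Without this guard, a freshly launched interpreter that runs the script
    # again would also re-run main() instead of only defining the names above.
    main()
```

The same layout should carry over to a real script: keep the dataset class and `collate_fn` at module top level, and build the loader and iterate over it only inside the guarded entry point.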

torch/utils/data/dataloader.py

ssnl · 2019-04-25T18:09:57Z

@soumith I'm fine with merging this. I'm redoing the dataloader doc in #19228 and will incorporate this there.

facebook-github-bot

@soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2019-04-27T01:10:22Z

@soumith merged this pull request in 698103c.

… Windows. (pytorch#18091)
Summary: It's been hard to understand how workers are launched and what code runs in the worker vs. the main process, especially on Windows, which leads to many of our samples failing. This explains when workers run and how to make code work on Windows as well.
Pull Request resolved: pytorch#18091
Differential Revision: D15083766
Pulled By: soumith
fbshipit-source-id: 8a7e60defc8a72ec63874f657d7d5267d951dccf

mike9ant requested review from soumith and ssnl March 15, 2019 23:26

vadimkantorov reviewed Mar 16, 2019

ssnl reviewed Mar 16, 2019

mike9ant added 3 commits April 24, 2019 14:58

Updated DataLoader docs to describe worker behaviour and portable set…

78b452f

…up for Windows.

DataLoader worker related docs update, fix indentation.

0ef88fd

Update DataLoader worker notes based on feedback and Sphynx testing

cf73181

mike9ant force-pushed the dataloader_comment_update branch from eb34e4c to cf73181 Compare April 24, 2019 22:01

soumith approved these changes Apr 25, 2019

facebook-github-bot reviewed Apr 25, 2019

facebook-github-bot closed this in 698103c Apr 26, 2019

facebook-github-bot added the merged label Apr 27, 2019

ezyang added the open source label Jun 24, 2019
