To use DistributedSampler or not? #1541

abhishekkrthakur · 2020-01-17T19:58:56Z

The pytorch-xla documentation (https://pytorch.org/xla/) doesn't mention using a distributed sampler.
However, the example at https://github.com/pytorch/xla/blob/master/test/test_train_mp_mnist.py says we should be using distributed samplers.

xm.RateTracker() isn't mentioned in the documentation either.

Are both correct?

Also, is there a way to use iterable datasets with distributed samplers?

jysohn23 · 2020-01-17T20:12:23Z

You want to use distributed samplers when using the multiprocessing API (or TPU Pod training), since the processes don't share memory and each one has to load its own shard of the data. So yes, that example is correct.
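To make the sharding concrete, here is a minimal pure-Python sketch of the index split a distributed sampler performs when shuffling is off. The helper name `shard_indices` and the sizes are hypothetical and chosen only for illustration. In real training code, `torch.utils.data.distributed.DistributedSampler` plays this role, with `num_replicas` and `rank` supplied per process.

```python
import math

def shard_indices(dataset_len, num_replicas, rank):
    """Return the dataset indices one replica would read (no shuffling).

    Hypothetical helper for illustration; not part of any library.
    """
    # Pad the index list so every replica gets the same number of samples.
    num_samples = math.ceil(dataset_len / num_replicas)
    total_size = num_samples * num_replicas
    indices = list(range(dataset_len))
    indices += indices[: total_size - dataset_len]
    # Each replica takes every num_replicas-th index, starting at its rank.
    return indices[rank:total_size:num_replicas]

# Ten samples split across four processes.
shards = [shard_indices(10, num_replicas=4, rank=r) for r in range(4)]
for r, shard in enumerate(shards):
    print(f"rank {r}: {shard}")
```

Note that the padding step needs `dataset_len`, so the whole scheme depends on knowing the dataset size up front.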

Also yes, xm.RateTracker() is used in our examples.

I'm not sure iterable datasets would work with a distributed sampler: to shard the dataset, which is what the distributed sampler does, it needs to know the length of the entire dataset in advance. Check out this discussion: pytorch/pytorch#28743.
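One possible workaround, sketched here as an assumption and not as something the linked discussion endorses, is to shard the stream itself: each process skips ahead by its rank and then keeps every `num_replicas`-th item. This needs no length. The names `shard_stream` and `records` are hypothetical.

```python
from itertools import islice

def shard_stream(stream, num_replicas, rank):
    """Lazily yield every num_replicas-th item of stream, starting at rank.

    Hypothetical helper for illustration; works on any iterable,
    including ones whose length is unknown.
    """
    return islice(stream, rank, None, num_replicas)

def records():
    # Stand-in for a data source of unknown length.
    for i in range(10):
        yield f"record-{i}"

# What the process with rank 1 out of 4 would see.
rank1 = list(shard_stream(records(), num_replicas=4, rank=1))
rank3 = list(shard_stream(records(), num_replicas=4, rank=3))
print(rank1)
print(rank3)
```

The caveat is that replicas can end up with different numbers of items (here ranks 0 and 1 get three, ranks 2 and 3 get two), which matters when every process must take the same number of steps.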

stale · 2020-02-16T20:16:02Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

jysohn23 self-assigned this Jan 17, 2020

stale bot added the stale label Feb 16, 2020

abhishekkrthakur closed this as completed Feb 17, 2020
