custom collate function cannot get _use_shared_memory #17909

Closed
cy69855522 opened this issue Mar 12, 2019 · 4 comments
Assignees
Labels
module: dataloader Related to torch.utils.data.DataLoader and Sampler; triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@cy69855522

cy69855522 commented Mar 12, 2019

Thanks a lot for such a wonderful package!
But when I wrote a custom collate_fn for the DataLoader, I found I could not get the value of the variable `_use_shared_memory` from `torch.utils.data._utils.collate`:
ModuleNotFoundError: No module named 'torch.utils.data._utils'
Could you give me some suggestions? I don't want to change PyTorch's original code locally.
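A minimal reproduction of what the reporter describes might look like this (the `my_collate` body is hypothetical; only the failing import is taken from the report):

```python
import torch

try:
    # Private flag that default_collate consulted to decide whether to
    # build batches in shared memory when running in a worker process.
    from torch.utils.data._utils.collate import _use_shared_memory
except ModuleNotFoundError as err:
    # On the reporter's PyTorch build this import fails with:
    #   ModuleNotFoundError: No module named 'torch.utils.data._utils'
    print(err)
    _use_shared_memory = False  # fallback, losing the shared-memory path

def my_collate(batch):
    # A custom collate_fn that wants to mirror default_collate would
    # branch on _use_shared_memory here.
    return torch.stack([torch.as_tensor(sample) for sample in batch])
```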

@soumith
Member

soumith commented Mar 13, 2019

cc: @ssnl

@ssnl
Collaborator

ssnl commented Mar 13, 2019

Let’s expose that! I’ll add this field to the worker info in the upcoming PR.

@ssnl ssnl self-assigned this Mar 13, 2019
@cy69855522
Author

> Let’s expose that! I’ll add this field to the worker info in the upcoming PR.

Thanks a lot!

@ssnl
Collaborator

ssnl commented Apr 29, 2019

You can use `torch.utils.data.get_worker_info() is not None` to get the same information (whether you are in a worker process) after #19228.
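For a custom collate_fn, that check can replace the old private flag. Here is a minimal sketch following the pattern default_collate itself uses; the shared-memory allocation lines are an assumption based on that pattern, not a required API:

```python
import torch
from torch.utils.data import DataLoader, get_worker_info

def my_collate(batch):
    elems = [torch.as_tensor(sample) for sample in batch]
    out = None
    if get_worker_info() is not None:
        # We are inside a DataLoader worker process; this is the condition
        # the private _use_shared_memory flag used to expose. Like
        # default_collate, preallocate the result in shared memory so the
        # batch can be sent back to the main process without an extra copy.
        numel = sum(e.numel() for e in elems)
        storage = elems[0].storage()._new_shared(numel)
        out = elems[0].new(storage).resize_(len(elems), *elems[0].shape)
    return torch.stack(elems, 0, out=out)

# Hypothetical usage with a toy dataset:
loader = DataLoader(list(range(8)), batch_size=4, num_workers=2,
                    collate_fn=my_collate)
```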

@gchanan gchanan added the labels module: dataloader (Related to torch.utils.data.DataLoader and Sampler) and triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) May 1, 2019
pull bot pushed a commit to Pandinosaurus/pytorch that referenced this issue Jun 21, 2019
Summary:
This is a modified version of pytorch#14705, since the commit structure for that PR is quite messy.

1. Add `IterableDataset`.
2. So we have two data loader modes: `Iterable` and `Map`.

    1. `Iterable` if the `dataset` is an instance of `IterableDataset`
    2. `Map` otherwise

3. Add better support for non-batch loading (i.e., `batch_size=None` and `batch_sampler=None`). This is useful in doing things like bulk loading.
4. Refactor `DataLoaderIter` into two classes, `_SingleProcessDataLoaderIter` and `_MultiProcessingDataLoaderIter`. Rename some methods to be more generic, e.g., `get_batch` -> `get_data`.
5. Add `torch.utils.data.get_worker_info`, which returns worker information in a worker process (e.g., worker id, dataset object copy, etc.) and can be used in `IterableDataset.__iter__` and `worker_init_fn` to do per-worker configuration (see the sketch after this commit message).
6. Add `ChainDataset`, which is the analog of `ConcatDataset` for `IterableDataset`.
7. Import torch.utils.data in `torch/__init__.py`.
8. Add data loader examples and documentation.
9. Use `get_worker_info` to detect whether we are in a worker process in `default_collate`.

Closes pytorch#17909, pytorch#18096, pytorch#19946, and some of pytorch#13023
Pull Request resolved: pytorch#19228

Reviewed By: bddppq

Differential Revision: D15058152

fbshipit-source-id: 9e081a901a071d7e4502b88054a34b450ab5ddde
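Item 5 (together with the non-batch loading from item 3) can be illustrated with a short sketch; `RangeIterable` is a hypothetical dataset, not part of the PR:

```python
from torch.utils.data import IterableDataset, DataLoader, get_worker_info

class RangeIterable(IterableDataset):
    """Hypothetical IterableDataset that shards a range across workers."""

    def __init__(self, start, end):
        self.start, self.end = start, end

    def __iter__(self):
        info = get_worker_info()
        if info is None:
            # Single-process loading (num_workers=0): yield everything.
            return iter(range(self.start, self.end))
        # In worker info.id of info.num_workers: yield a disjoint
        # strided slice so no sample is duplicated across workers.
        return iter(range(self.start + info.id, self.end, info.num_workers))

# batch_size=None enables the non-batch loading from item 3: samples come
# out of the loader one at a time, without collation.
loader = DataLoader(RangeIterable(0, 10), num_workers=2, batch_size=None)
```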
iotamudelta pushed a commit to ROCm/pytorch that referenced this issue Jun 21, 2019 (same commit summary as above)