CombinedLoader does not work in DDP when using max_size_cycle option #10373

Closed
ant0nsc opened this issue Nov 5, 2021 · 4 comments · Fixed by #10374
Assignees: tchaton
Labels: bug (Something isn't working), help wanted (Open to be worked on), priority: 1 (Medium priority task)

Comments

@ant0nsc
Contributor

ant0nsc commented Nov 5, 2021

🐛 Bug

When using a CombinedLoader with the max_size_cycle option and DDP, every GPU receives all of the validation data instead of its own shard.
This bug is related to #7013. However, the fix in PR #7102 only affects the default min_size option of the CombinedLoader.
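
For readers unfamiliar with the two modes, here is a minimal pure-Python sketch of how they differ. The `combine` helper is hypothetical and is not the Lightning implementation; it assumes each loader is a sized, non-empty iterable (plain lists stand in for dataloaders).

```python
from itertools import cycle


def combine(loaders, mode="min_size"):
    """Hypothetical stand-in for the two CombinedLoader modes named above.

    `loaders` maps a name to a sized, non-empty iterable (plain lists here).
    """
    sizes = [len(loader) for loader in loaders.values()]
    if mode == "min_size":
        num_batches = min(sizes)   # stop with the shortest loader
    elif mode == "max_size_cycle":
        num_batches = max(sizes)   # run to the longest, cycling the rest
    else:
        raise ValueError(f"unknown mode: {mode}")

    # cycle() restarts a shorter loader once it is exhausted.
    iterators = {name: cycle(loader) for name, loader in loaders.items()}
    for _ in range(num_batches):
        yield {name: next(it) for name, it in iterators.items()}


loaders = {"a": [0, 1, 2, 3], "b": [10, 11]}
min_size_batches = list(combine(loaders, "min_size"))
max_cycle_batches = list(combine(loaders, "max_size_cycle"))
```

With these inputs, `min_size` yields 2 combined batches, while `max_size_cycle` yields 4 and reuses the entries of `"b"` once they run out.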

@tchaton ?

To Reproduce

Repro

Expected behavior

For the above repro, the validation data has length 8. I would expect each of the 2 GPUs to get only 4 batches, but in fact each one gets all 8.
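
The expected split can be sketched in a few lines of plain Python. This is an illustration only: `shard_indices` is a hypothetical helper, a batch size of 1 is assumed (so 8 samples correspond to 8 batches), and the strided split mirrors what `torch.utils.data.DistributedSampler` does with shuffling disabled, leaving out its padding logic.

```python
def shard_indices(num_samples, world_size, rank):
    """Strided split: rank r sees samples r, r + world_size, r + 2 * world_size, ..."""
    return list(range(num_samples))[rank::world_size]


NUM_SAMPLES = 8   # validation set length from the report
WORLD_SIZE = 2    # two GPUs

per_rank = {
    rank: shard_indices(NUM_SAMPLES, WORLD_SIZE, rank)
    for rank in range(WORLD_SIZE)
}
for rank, indices in per_rank.items():
    print(f"rank {rank}: {len(indices)} samples -> {indices}")
```

Each rank ends up with 4 disjoint samples. The behaviour reported here corresponds to every rank iterating over all 8.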

Environment

  • CUDA:
    - GPU:
      - Tesla K80
      - Tesla K80
    - available: True
    - version: 10.2
  • Packages:
    - numpy: 1.21.2
    - pyTorch_debug: False
    - pyTorch_version: 1.8.0
    - pytorch-lightning: 1.5.0
    - tqdm: 4.62.3
  • System:
    - OS: Linux
    - architecture:
      - 64bit
    - processor: x86_64
    - python: 3.7.3
    - version: 18.04.1-Ubuntu SMP Wed Jul 28 23:14:18 UTC 2021

Additional context

@ant0nsc added the bug and help wanted labels Nov 5, 2021
@tchaton added the priority: 1 label Nov 5, 2021
@tchaton self-assigned this Nov 5, 2021
@tchaton
Contributor

tchaton commented Nov 5, 2021

Dear @ant0nsc,

Thanks for raising this issue. This combination was in fact never supported. I am currently looking into it.

Best,
T.C

@ant0nsc
Contributor Author

ant0nsc commented Nov 5, 2021

Thanks @tchaton! Is there a way we can work around the issue in the meantime?

@tchaton
Contributor

tchaton commented Nov 5, 2021

Dear @ant0nsc,

I have a fix in PR #10374. Would you mind trying it out?

Best,
T.C

@ant0nsc
Contributor Author

ant0nsc commented Nov 5, 2021

Wow, that was swift, thanks @tchaton! I tried it on my small repro, and it works just fine now.
