Allow get attributes on DDP similar to FSDP #128620
Conversation
@awgu Does this change look okay to you?
@fegin I think that this change is generally okay, but I wonder what the motivation is. I am guessing it is mainly to make DDP more similar to FSDP? (Otherwise, probably everyone is already doing `ddp_model.module.<attr>` directly.) This approach of attribute forwarding can have some non-intuitive behavior in edge cases. For example, recently I hit an issue where we try to access a property whose implementation itself raises an `AttributeError`: the forwarding then reports that the wrapped module lacks the attribute instead of surfacing the error from the property body.
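A minimal sketch of that pitfall (the class names are hypothetical stand-ins, not the actual DDP types): when a property's body raises `AttributeError`, Python falls back to `__getattr__`, and the forwarding masks the original error.

```python
# Hypothetical wrapper illustrating the masking pitfall; not the real DDP code.
class Inner:
    pass  # the wrapped module; deliberately has no attribute `p`

class Wrapper:
    def __init__(self, module):
        self.module = module

    @property
    def p(self):
        # A bug inside the property body surfaces as AttributeError...
        raise AttributeError("real bug: self._cache was never initialized")

    def __getattr__(self, name):
        # ...so Python falls back here, and the forwarding hides the real error.
        return getattr(self.module, name)

try:
    Wrapper(Inner()).p
except AttributeError as e:
    print(e)  # "'Inner' object has no attribute 'p'" -- the property's bug is masked
```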
Yeah @awgu, there can be non-intuitive cases. Also, in another PR, I want to add `DDP.clip_grad_norm_` similar to `FSDP.clip_grad_norm_`.
I would caution against this. DDP should just use the stock `torch.nn.utils.clip_grad_norm_`, since its gradients are full (already all-reduced) rather than sharded like FSDP's.
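For reference, a sketch of the standard pattern being suggested; it is shown single-process here, and in a real run the model would be wrapped in DDP under an initialized process group:

```python
# Sketch: clipping DDP gradients with the stock utility. By the time
# backward() returns, DDP has already all-reduced gradients, so each rank
# holds full grads and the regular clip works; shown single-process here.
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stands in for DDP(model) in a real distributed run
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

loss = model(torch.randn(4, 8)).sum()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```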
Makes sense @awgu, thanks for the comments.
I am not sure what the failing test here is, @awgu.
Hi, pinging again on this PR.
@awgu is this still relevant?
This is okay with me.
"""Forward missing attributes to the wrapped module.""" | ||
try: | ||
return super().__getattr__(name) # defer to nn.Module's logic | ||
except AttributeError: |
Note for future: if, when trying to get a property attribute `p`, `p`'s implementation raises an `AttributeError`, and `self.module` does not have `p` either, then the error message here will be non-intuitive: it would say that `self.module` does not have the attribute `p`, whereas the useful error would be that `p`'s implementation raised an `AttributeError` when accessing `ddp_module.p`.
pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 4, 5, linux.4xlarge.nvidia.gpu) failure is unrelated (it is a foreach ops test, whereas this PR is about DDP).
@pytorchbot merge -i
Merge failed. Reason: This PR needs a `release notes:` label. If your changes are user facing and intended to be a part of release notes, please use a label starting with `release notes:`. If not, please add the `topic: not user facing` label. To add a label, you can comment to pytorchbot, for example `@pytorchbot label "topic: not user facing"`. For more information, see the release-notes labeling guidance in the PyTorch wiki. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -i
Merge started. Your change will be merged while ignoring the following 1 check: pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 4, 5, linux.4xlarge.nvidia.gpu). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
It seems that this PR introduced the following error: https://github.com/pytorch/pytorch/actions/runs/9720692485/job/26833212318
@pytorchbot revert -m "Reverting in order to see if the trunk error on inductor is fixed" -c nosignal
@pytorchbot successfully started a revert job. Check the current status here.
@mayank31398 your PR has been successfully reverted.
This reverts commit 065c386. Reverted #128620 on behalf of https://github.com/jeanschmidt due to Reverting in order to see if the trunk error on inductor is fixed ([comment](#128620 (comment)))
This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.
yikes!! @jeanschmidt
FSDP implements this attribute-forwarding logic (`__getattr__` falling back to the wrapped module), but it is missing from DDP.
This PR adds an equivalent override to DDP.
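A sketch of what this enables (the `generate` method is a hypothetical user-defined example, and a real run needs `torch.distributed.init_process_group` first):

```python
# Hypothetical usage after this PR; `generate` is an illustrative
# user-defined method, not a DDP or FSDP API.
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

class MyModel(nn.Module):
    def generate(self) -> str:
        return "hello"

ddp_model = DDP(MyModel())  # assumes torch.distributed.init_process_group() ran
ddp_model.generate()        # before this PR: AttributeError; needed ddp_model.module.generate()
```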
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k