Add hooks for execution on intel gaudi devices - 1 #128584

Closed

ankurneog
Contributor

Motivation

This is a follow-up to PR #126970 to support Gaudi devices for PyTorch unit test (UT) execution.

Changes

We are adding additional hooks to:

  1. Add dtype exceptions for Gaudi/HPU
  2. Extend the onlyNativeDevices decorator so that it can cover additional devices
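
As a rough illustration of the second item, the sketch below shows one way a "native devices only" test decorator could accept extra device types. It is not the code from this PR: `NATIVE_DEVICES`, `only_native_devices_and`, and `FakeDeviceTest` are hypothetical names, and the list of native devices is an assumption made for the example.

```python
import functools
import unittest

# Hypothetical stand-in for the device types a test suite treats as "native".
NATIVE_DEVICES = ("cpu", "cuda", "meta")


def only_native_devices_and(extra_devices=()):
    """Skip a test unless self.device_type is native or explicitly allowed."""
    allowed = set(NATIVE_DEVICES) | set(extra_devices)

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(self, *args, **kwargs):
            if self.device_type not in allowed:
                raise unittest.SkipTest(
                    f"{fn.__name__} does not run on {self.device_type}"
                )
            return fn(self, *args, **kwargs)

        return wrapper

    return decorator


class FakeDeviceTest:
    """Minimal test-like object that only carries a device_type attribute."""

    def __init__(self, device_type):
        self.device_type = device_type

    @only_native_devices_and(["hpu"])
    def test_add(self):
        return f"ran on {self.device_type}"
```

With this shape, a test stays restricted to the native set by default, and an out-of-tree backend such as HPU opts in per test by passing its device type.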

@ankurneog ankurneog requested a review from mruberry as a code owner June 13, 2024 04:36

pytorch-bot bot commented Jun 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/128584

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit c397cd1 with merge base 91a8376 (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@ankurneog ankurneog marked this pull request as draft June 13, 2024 04:42
@ankurneog ankurneog marked this pull request as ready for review June 13, 2024 04:45
@ankurneog ankurneog marked this pull request as draft June 13, 2024 04:58
@ankurneog ankurneog marked this pull request as ready for review June 13, 2024 05:00
@ankurneog
Contributor Author

@albanD : Kindly help with the review and merge

@ankurneog
Contributor Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased ankurneog_hpu_updates_phase2 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout ankurneog_hpu_updates_phase2 && git pull --rebase)

@ankurneog ankurneog force-pushed the ankurneog_hpu_updates_phase2 branch from b9b7132 to 673a20a Compare June 14, 2024 10:18
@soulitzer soulitzer requested a review from albanD June 14, 2024 13:20
@soulitzer soulitzer added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Jun 14, 2024
@ankurneog
Contributor Author

@albanD : could you kindly help with the review and merge. Thank you.

@ankurneog
Contributor Author

@albanD : gentle reminder, could you please help with this PR, Thanks.

@ankurneog ankurneog force-pushed the ankurneog_hpu_updates_phase2 branch from 4ce9bd0 to c397cd1 Compare July 3, 2024 03:17
@ankurneog
Contributor Author

@albanD : can you please help with the review

Collaborator

@albanD albanD left a comment

This is most likely to be brittle given that you don't have CI signal for it.

tbh if your CI cannot live in PT, I think a good long term plan is for the hpu CI spec to live where the CI runs. There you will be able to pin PyTorch and move forward carefully while always having a known good version.
But I guess that now that you have this scaffolding, you can already update these attributes on the OpInfos directly from your own repo before running the tests?
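
A minimal sketch of what updating those attributes from an external repo might look like is given below. It deliberately avoids the real PyTorch classes: `OpEntry`, `dtypes_if_hpu`, `op_db`, `HPU_OVERRIDES`, and `apply_hpu_overrides` are hypothetical names used only for illustration, and dtypes are represented as plain strings.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class OpEntry:
    """Hypothetical stand-in for an OpInfo-like record (not the PyTorch class)."""

    name: str
    dtypes: FrozenSet[str]
    dtypes_if_hpu: Optional[FrozenSet[str]] = None

    def supported_dtypes(self, device_type: str) -> FrozenSet[str]:
        # Fall back to the generic dtype set when no HPU override is present.
        if device_type == "hpu" and self.dtypes_if_hpu is not None:
            return self.dtypes_if_hpu
        return self.dtypes


# Assumed shared op database, as it might be shipped by the upstream test suite.
op_db = [
    OpEntry("add", frozenset({"float32", "float64", "int32"})),
    OpEntry("erf", frozenset({"float32", "float64"})),
]

# Assumed backend capabilities, maintained in the backend's own repository.
HPU_OVERRIDES = {"add": frozenset({"float32", "int32"})}


def apply_hpu_overrides(db, overrides):
    """Patch per-op HPU dtype sets in place before the tests are collected."""
    for op in db:
        if op.name in overrides:
            op.dtypes_if_hpu = overrides[op.name]


apply_hpu_overrides(op_db, HPU_OVERRIDES)
```

Keeping the override table next to the CI that runs it means the backend can pin a known good PyTorch version and adjust capabilities without an upstream change.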

@ankurneog
Contributor Author

@albanD : thanks for the approval. Yes, that's the goal: after this change is in, we will update the OpInfo data with Gaudi op capabilities, so that we ensure it stays clean.

@ankurneog
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 19, 2024
@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@ankurneog
Contributor Author

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Jul 19, 2024
@ankurneog
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job waited more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information, see the pytorch-bot wiki.

@ankurneog
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

DiweiSun pushed a commit to DiweiSun/pytorch that referenced this pull request Jul 22, 2024
## Motivation
This is follow up to PR:pytorch#126970  to support Gaudi devices for Pytorch UT execution.

## Changes
We are adding additional hooks to:
1. Add dtype exceptions for Gaudi/HPU
2. Extend onlyNativeDevices decorator  functionality to add additional devices

Pull Request resolved: pytorch#128584
Approved by: https://github.com/albanD
@ankurneog ankurneog deleted the ankurneog_hpu_updates_phase2 branch July 22, 2024 17:23
xuhancn pushed a commit to xuhancn/pytorch that referenced this pull request Jul 25, 2024
Labels
ciflow/trunk Trigger trunk jobs on your pull request Merged open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
5 participants