[ROCm] Optionally use hipblaslt #120551

Open
wants to merge 1 commit into
base: main


@trixirt trixirt commented Feb 24, 2024

The hipblaslt package is not available on Fedora.
Instead of requiring the package, make it optional.
If it is found, define the preprocessor variable HIPBLASLT.
Convert the checks for ROCM_VERSION >= 50700 to HIPBLASLT checks.
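
To illustrate the change described above, here is a minimal sketch of a feature-based guard replacing a version-based one. It assumes the build system passes `-DHIPBLASLT` only when the hipblaslt package was found; `kHipblasLtAvailable`, `select_gemm_backend`, and `check_fallback` are hypothetical names used for illustration and are not taken from the PyTorch sources.

```cpp
#include <cassert>
#include <string>

// Before (version-based guard, as described in the PR text):
//   #if ROCM_VERSION >= 50700
// After (feature-based guard):
//   #if defined(HIPBLASLT)
//
// Assumption: the build defines HIPBLASLT only when the hipblaslt
// package was detected at configure time.
#if defined(HIPBLASLT)
// The real hipblaslt header would be included here.
constexpr bool kHipblasLtAvailable = true;
#else
constexpr bool kHipblasLtAvailable = false;
#endif

// Hypothetical helper: pick a GEMM backend name, falling back to
// plain hipblas whenever hipblaslt was not compiled in.
std::string select_gemm_backend(bool prefer_lt) {
  if (prefer_lt && kHipblasLtAvailable) {
    return "hipblaslt";
  }
  return "hipblas";
}

// Sanity check: the fallback path must exist in every build.
void check_fallback() {
  assert(select_gemm_backend(false) == "hipblas");
}
```

With a guard of this shape, a distribution that ships ROCm 5.7 or later without hipblaslt still compiles, because the hipblaslt code path is tied to the package being present rather than to the ROCm version number.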

Fixes #119081

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang


pytorch-bot bot commented Feb 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120551

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 3 New Failures, 2 Unrelated Failures

As of commit 8edc7b9 with merge base bab4b5a (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@janeyx99 janeyx99 added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Feb 28, 2024
@jeffdaily
Collaborator

@xw285cornell would appreciate your review of this. I'm assuming this PR will break your internal build?

The hipblaslt package is not available on Fedora.
Instead of requiring the package, make it optional.
If it is found, define the preprocessor variable HIPBLASLT
Convert the checks for ROCM_VERSION >= 50700 to HIPBLASLT checks

Signed-off-by: Tom Rix <[email protected]>
@trixirt
Author

trixirt commented Mar 2, 2024

Update for a couple more hipblaslt usages that were added in main this week.

@FelixSchwarz

This PR looks pretty straightforward, and using a variable instead of "magic" version numbers seems much cleaner. It would be nice if this PR didn't linger much longer.

@jeffdaily
Collaborator

@trixirt I am in favor of this PR. My apologies for adding yet more exposure to hipblaslt APIs that you need to work around again. Please resolve conflicts likely due to #122106.

Collaborator

@jithunnair-amd jithunnair-amd left a comment

Suggest USE_HIPBLASLT instead of HIPBLASLT as name for define

@IMbackK
Contributor

IMbackK commented Jun 10, 2024

Considering that the underlying issue of this PR, #119081, is now correctly recognized as a bug, I think it would be good not to leave this lingering much longer.

@xw285cornell
Contributor

Sorry, just saw this. Thanks @jeffdaily for pinging. This will break our internal codebase, but it should be an easy fix. I'm not objecting to the idea; if you can ping me before this PR lands, I can put a fix into our internal system easily.

@jithunnair-amd jithunnair-amd added the ciflow/trunk label (Trigger trunk jobs on your pull request) Jul 1, 2024
@jithunnair-amd jithunnair-amd changed the title from "Optionally use hipblaslt" to "[ROCm] Optionally use hipblaslt" Jul 1, 2024
@pytorch-bot pytorch-bot bot added the module: rocm label (AMD GPU support for Pytorch) Jul 1, 2024
@jithunnair-amd jithunnair-amd added the rocm priority label (high priority ROCm PRs from performance or other aspects) Jul 1, 2024
@jithunnair-amd
Collaborator

@trixirt Please resolve conflicts and I'll request an upstream maintainer to approve.

@jithunnair-amd
Collaborator

Suggest USE_HIPBLASLT instead of HIPBLASLT as name for define

@trixirt Please do consider this renaming to be aligned with current naming

@trixirt
Author

trixirt commented Jul 1, 2024

I am working on this. It's a bit involved, and on a parallel track I am trying to get hipblaslt to build on Fedora.

Contributor

@malfet malfet left a comment

LGTM

@trixirt
Author

trixirt commented Jul 4, 2024

I have submitted the hipBLASLt package for Fedora here:
hipBLASLt package review

A preliminary refactoring of the above change is here: refactored for 2.4

Fedora/My preference is to use the package once it is available.

Development

Successfully merging this pull request may close these issues.

ROCm loses some supported GPUs by requiring hipblaslt