Add support for reducing across the middle dimension for 3D matrices using the sum Triton kernel #2297

Closed

jananisriram
Contributor

Summary: Support reducing 3-dimensional matrices across the middle dimension (dim == 1), so that an input of dimensions (M, N, K) produces a result of dimensions (M, K). This kernel assumes that BLOCK_SIZE_M == 1, as Triton is currently unable to perform reductions on a middle dimension, and that the entire reduction dimension of the tensor fits in a thread block (BLOCK_SIZE_N >= N).

Reviewed By: davidberard98

Differential Revision: D58307854
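
To make the indexing in the summary concrete, here is a minimal plain-Python sketch that emulates the blocking scheme on a flat, row-major buffer. It is not the kernel from this PR: the function name `sum_middle_dim`, the contiguous layout, and the loop-based "program" emulation are assumptions for illustration. Only `BLOCK_SIZE_M == 1` and `BLOCK_SIZE_N >= N` come from the summary.

```python
from itertools import product


def sum_middle_dim(flat, M, N, K, BLOCK_SIZE_N, BLOCK_SIZE_K):
    """Emulate a blocked sum over dim 1 of a contiguous (M, N, K) buffer."""
    # Assumption from the PR summary: the whole reduction dimension
    # must fit in a single block.
    assert BLOCK_SIZE_N >= N, "reduction dimension must fit in one block"

    out = [0.0] * (M * K)  # result of shape (M, K), stored row-major
    num_k_blocks = (K + BLOCK_SIZE_K - 1) // BLOCK_SIZE_K  # ceiling division

    # Each (m, k_block) pair stands in for one program instance.
    # BLOCK_SIZE_M == 1, so every program handles exactly one index m.
    for m, k_block in product(range(M), range(num_k_blocks)):
        k_start = k_block * BLOCK_SIZE_K
        # Upper bound plays the role of the mask k < K.
        for k in range(k_start, min(k_start + BLOCK_SIZE_K, K)):
            acc = 0.0
            for n in range(BLOCK_SIZE_N):
                if n < N:  # mask out padding when BLOCK_SIZE_N > N
                    # Row-major strides for (M, N, K) are (N * K, K, 1).
                    acc += flat[m * N * K + n * K + k]
            out[m * K + k] = acc
    return out


# Usage: a (2, 3, 4) tensor holding 0..23 in row-major order.
M, N, K = 2, 3, 4
flat = [float(i) for i in range(M * N * K)]
result = sum_middle_dim(flat, M, N, K, BLOCK_SIZE_N=4, BLOCK_SIZE_K=2)
```

Because the block covers all of `N`, each program can finish its reduction in a single pass without accumulating partial sums across blocks, which is presumably why the summary requires `BLOCK_SIZE_N >= N`.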

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D58307854

jananisriram added a commit to jananisriram/benchmark that referenced this pull request Jun 12, 2024
…using the sum Triton kernel (pytorch#2297)

Summary:
Pull Request resolved: pytorch#2297

Support reducing 3-dimensional matrices across the middle dimension (`dim == 1`), so that an input of dimensions `(M, N, K)` produces a result of dimensions `(M, K)`. This kernel assumes that `BLOCK_SIZE_M == 1`, as Triton is currently unable to perform reductions on a middle dimension, and that the entire reduction dimension of the tensor fits in a thread block (`BLOCK_SIZE_N >= N`).

Reviewed By: davidberard98

Differential Revision: D58307854

@facebook-github-bot
Contributor

This pull request has been merged in 55c975e.
