Add jagged_sum operator for unpadded nested tensors to TritonBench #2299
Summary:
Add a `jagged_sum` reduction operator for unpadded nested tensors, based on the PyTorch `sum` operator, to TritonBench. This diff implements a basic benchmark for reducing along the ragged dimension for 3-dimensional nested tensors. For a 3-dimensional tensor of shape `(B, *, M)`, where `*` is the ragged dimension, this benchmark uses PyTorch's `sum` operator to reduce `B` `(*, M)` 2-dimensional tensors to a `(B, M)` output tensor.

Measure performance of the basic benchmark with `gbps` and `latency` metrics and display nested tensor parameters `B` and `M`.

Reviewed By: YuqingJ
Differential Revision: D58396957
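
For reviewers who want the reduction semantics spelled out, here is a minimal, dependency-free sketch of what summing along the ragged dimension means. It is not the code in this diff and does not call PyTorch. The function name `jagged_sum_reference` and the packed `values`/`offsets` layout are assumptions made for illustration only.

```python
from typing import List, Sequence


def jagged_sum_reference(
    values: Sequence[Sequence[float]],
    offsets: Sequence[int],
) -> List[List[float]]:
    """Sum each variable-length (*, M) slice along its ragged dimension.

    Hypothetical layout (an assumption, not taken from the diff):
      values  -- rows of all B sequences packed back to back, each of length M
      offsets -- B + 1 boundaries; sequence b owns rows offsets[b]:offsets[b + 1]
    Returns a dense B x M list of lists.
    """
    M = len(values[0]) if values else 0
    out: List[List[float]] = []
    for b in range(len(offsets) - 1):
        acc = [0.0] * M  # an empty sequence reduces to a row of zeros
        for row in values[offsets[b]:offsets[b + 1]]:
            for m in range(M):
                acc[m] += row[m]
        out.append(acc)
    return out


# B = 3 sequences with ragged lengths 2, 1, 3 and M = 2
values = [
    [1.0, 2.0], [3.0, 4.0],                  # sequence 0
    [5.0, 6.0],                              # sequence 1
    [7.0, 8.0], [9.0, 10.0], [11.0, 12.0],   # sequence 2
]
offsets = [0, 2, 3, 6]
result = jagged_sum_reference(values, offsets)
print(result)  # [[4.0, 6.0], [5.0, 6.0], [27.0, 30.0]]
```

The output has shape `(B, M)` regardless of how the ragged lengths vary, which is the property the benchmark relies on.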