Fp8 support for item() with cuda, index_select, and fill_ with cpu #128780

ajbrent · 2024-06-15T22:48:15Z

Fixes #128370.
Fixes #128257.

Added fp8 support for item() on CUDA, index_select, and fill_ on CPU.

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

pytorch-bot · 2024-06-15T22:48:18Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/128780

✅ You can merge normally! (4 Unrelated Failures)

As of commit 82f63cf with merge base e3a39d4:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

- periodic / linux-focal-cuda12.4-py3.10-gcc9 / test (default, 3, 5, linux.4xlarge.nvidia.gpu) (detected as infra flaky with no log or failing log classifier)
- periodic / linux-focal-cuda12.4-py3.10-gcc9 / test (deploy, 1, 1, linux.4xlarge.nvidia.gpu) (similar failure): 'Test'
- periodic / win-vs2019-cuda11.8-py3 / test (default, 1, 4, windows.g5.4xlarge.nvidia.gpu) (similar failure): RuntimeError: inductor/test_pattern_matcher 1/1 failed!
- periodic / win-vs2019-cuda11.8-py3 / test (default, 4, 4, windows.g5.4xlarge.nvidia.gpu) (similar failure): test_decomp 25/25 failed!

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo · 2024-06-17T16:01:19Z

aten/src/ATen/native/cpu/FillKernel.cpp

@@ -60,7 +60,7 @@ void fill_kernel(TensorIterator& iter, const Scalar& value_scalar) {
 [=]() -> scalar_t { return value; },
 [=]() { return Vectorized<scalar_t>(value); });
 }),
- AT_EXPAND(AT_ALL_TYPES_AND_COMPLEX), kBool, AT_EXPAND(AT_BAREBONES_UNSIGNED_TYPES)
+ AT_EXPAND(AT_ALL_TYPES_AND_COMPLEX), kBool, AT_EXPAND(AT_FLOAT8_TYPES), AT_EXPAND(AT_BAREBONES_UNSIGNED_TYPES)


Is this change still needed after https://github.com/pytorch/pytorch/pull/128744/files ?

vkuzo · 2024-06-17T16:03:31Z

test/test_torch.py

@@ -3553,7 +3553,9 @@ def test_index_fill(self, device, dtype):
 # FIXME: move to test indexing
 # The test fails for zero-dimensional tensors on XLA
 @onlyNativeDeviceTypes
- @dtypes(*all_types_and_complex_and(torch.half, torch.bool, torch.bfloat16))
+ @dtypes(*all_types_and_complex_and(torch.half, torch.bool, torch.bfloat16,
+ torch.float8_e4m3fn, torch.float8_e4m3fnuz,


Optional nit: maybe save the float8 types in a list and unpack it everywhere? That would make it easier for future op-support PRs to remember to handle all of them.
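A minimal sketch of the pattern suggested in this nit, assuming nothing about how the PR finally implemented it. To stay runnable without a PyTorch install, dtypes are plain strings here; `FLOAT8_DTYPES`, `BASE_DTYPES`, `all_types_and`, and this `dtypes` decorator are hypothetical stand-ins, not the real helpers from PyTorch's test utilities. Only the two float8 names visible in the diff above are listed.

```python
# Hypothetical stand-ins for torch dtypes, written as strings so the
# sketch runs without PyTorch. In test_torch.py these would be
# torch.float8_* objects.
FLOAT8_DTYPES = ("float8_e4m3fn", "float8_e4m3fnuz")

# Hypothetical base set; the real helper covers many more dtypes.
BASE_DTYPES = ("float32", "float64", "int32", "int64")


def all_types_and(*extra):
    """Hypothetical helper: the base dtypes followed by any extras."""
    return BASE_DTYPES + tuple(extra)


def dtypes(*types):
    """Hypothetical decorator that records which dtypes a test covers."""
    def decorator(fn):
        fn.dtypes = types
        return fn
    return decorator


# The float8 group is defined once and unpacked at every use site, so a
# dtype added to FLOAT8_DTYPES reaches every decorated test.
@dtypes(*all_types_and("half", "bool", "bfloat16", *FLOAT8_DTYPES))
def test_index_select():
    pass


@dtypes(*all_types_and("half", *FLOAT8_DTYPES))
def test_fill():
    pass
```

With a single shared tuple, extending float8 coverage to another op means editing one definition rather than every decorator call.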

vkuzo · 2024-06-17T16:04:08Z

Looks great, thanks! I just had one inline question on whether the fill code changes are still needed.

ajbrent and others added 6 commits June 15, 2024 12:27

Adding fp8 support for item (cuda), index_select, equal (cpu), fill (cpu), and fill_out.

eaa673f

Adding fp8 test support for item, index_select, and fill.

3552bd9

Linting fix.

a18ce20

Merge branch 'pytorch:main' into float8-support

481ae96

Renaming non-numpy types.

ed33ae1

Merge branch 'float8-support' of https://github.com/ajbrent/pytorch into float8-support

ca6f17c

ajbrent requested a review from eqy as a code owner June 15, 2024 22:48

pytorch-bot bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Jun 15, 2024

pytorchbot added the open source label Jun 15, 2024

mikaylagawarecki requested review from vkuzo and drisspg June 17, 2024 15:43

mikaylagawarecki added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jun 17, 2024

vkuzo reviewed Jun 17, 2024

eqy approved these changes Jun 17, 2024

eqy added ciflow/trunk Trigger trunk jobs on your pull request ciflow/periodic Trigger jobs ran periodically on master (periodic.yml) on the PR labels Jun 17, 2024

ajbrent and others added 3 commits June 17, 2024 23:41

Packaging float8 types.

8d213aa

Removing unnecessary float8 type inclusion.

540e436

Merge branch 'pytorch:main' into float8-support

82f63cf
