RuntimeError using nested tensor in Apple M1 device MPS #127743

Open
QG-phy opened this issue Jun 3, 2024 · 3 comments
Assignees
Labels
module: mps Related to Apple Metal Performance Shaders framework module: nestedtensor NestedTensor tag see issue #25032 triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments


QG-phy commented Jun 3, 2024

🐛 Describe the bug

Traceback (most recent call last):
  File "/Users/aisiqg/Software/venv/pydptb/bin/dptb", line 8, in <module>
    sys.exit(main())
  File "/Users/aisiqg/Software/venv/pydptb/lib/python3.9/site-packages/dptb/entrypoints/main.py", line 408, in main
    train(**dict_args)
  File "/Users/aisiqg/Software/venv/pydptb/lib/python3.9/site-packages/dptb/entrypoints/train.py", line 223, in train
    trainer.run(trainer.train_options["num_epoch"])
  File "/Users/aisiqg/Software/venv/pydptb/lib/python3.9/site-packages/dptb/nnops/base_trainer.py", line 51, in run
    self.epoch()
  File "/Users/aisiqg/Software/venv/pydptb/lib/python3.9/site-packages/dptb/nnops/trainer.py", line 192, in epoch
    self.iteration(ibatch, next(iter(self.reference_loader)))
  File "/Users/aisiqg/Software/venv/pydptb/lib/python3.9/site-packages/dptb/nnops/trainer.py", line 84, in iteration
    batch = batch.to(self.device)
  File "/Users/aisiqg/Software/venv/pydptb/lib/python3.9/site-packages/dptb/utils/torch_geometric/data.py", line 322, in to
    self.apply(lambda x: x.to(device, **kwargs), *keys)
  File "/Users/aisiqg/Software/venv/pydptb/lib/python3.9/site-packages/dptb/utils/torch_geometric/data.py", line 306, in apply
    self[key] = torch.nested.as_nested_tensor(item)
  File "/Users/aisiqg/Software/venv/pydptb/lib/python3.9/site-packages/torch/nested/__init__.py", line 58, in as_nested_tensor
    return torch._nested_tensor_from_tensor_list(tensor_list, dtype, None, device, None)
RuntimeError: storage_device.is_cpu() || storage_device.is_cuda() || storage_device.is_privateuseone() INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/NestedTensorImpl.cpp":185, please report a bug to PyTorch. NestedTensorImpl storage must be either CUDA, CPU or privateuseone but got mps:0

Versions

Collecting environment information...
PyTorch version: 2.1.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.4.1 (x86_64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: version 3.22.2
Libc version: N/A

Python version: 3.9.7 (default, Sep 16 2021, 08:50:36) [Clang 10.0.0 ] (64-bit runtime)
Python platform: macOS-10.16-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M1 Pro

Versions of relevant libraries:
[pip3] flake8==6.1.0
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.24.4
[pip3] numpydoc==1.1.0
[pip3] torch==2.1.0
[pip3] torchaudio==2.1.0
[pip3] torchsort==0.1.7
[pip3] torchvision==0.16.0
[conda] blas 1.0 mkl
[conda] mkl 2021.4.0 hecd8cb5_637
[conda] mkl-service 2.4.0 py39h9ed2024_0
[conda] mkl_fft 1.3.1 py39h4ab4a9b_0
[conda] mkl_random 1.2.2 py39hb2f4e1b_0
[conda] numpy 1.24.4 pypi_0 pypi
[conda] numpydoc 1.1.0 pyhd3eb1b0_1
[conda] torch 2.1.0 pypi_0 pypi
[conda] torchaudio 2.1.0 pypi_0 pypi
[conda] torchsort 0.1.7 pypi_0 pypi
[conda] torchvision 0.16.0 pypi_0 pypi

cc @cpuhrsch @jbschlosser @bhosmer @drisspg @soulitzer @kulinseth @albanD @malfet @DenisVieriu97 @jhavukainen

cpuhrsch added the triaged and module: mps labels on Jun 3, 2024
albanD added the module: nestedtensor label on Jun 3, 2024
Contributor

soulitzer commented Jun 3, 2024

@QG-phy could you check whether layout=torch.jagged works for you?

jhavukainen self-assigned this on Jun 12, 2024
skotapati assigned skotapati and unassigned jhavukainen on Jun 18, 2024
Collaborator

skotapati commented Jun 18, 2024

@QG-phy thank you for filing this issue. This is related to an op that is not yet supported by the PyTorch MPS backend. Could you try rerunning your script with the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 set?
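One way to apply this suggestion from inside Python, as a sketch rather than a verified fix: the assumption here is that the flag is read when torch is loaded, so it is set before torch is imported. Prefixing the command in the shell with PYTORCH_ENABLE_MPS_FALLBACK=1 has the same effect. The helper name describe_fallback_setting is hypothetical and only reports the current value.

```python
import os

# Assumption: the flag is read when torch is loaded, so set it first.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# import torch  # import torch only after the variable has been set


def describe_fallback_setting() -> str:
    """Return the current value of the fallback flag, or '<unset>'."""
    return os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK", "<unset>")


print("PYTORCH_ENABLE_MPS_FALLBACK =", describe_fallback_setting())
```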


OYE93 commented Jul 6, 2024

Hey @skotapati, thanks for the suggestion.
I tried setting PYTORCH_ENABLE_MPS_FALLBACK=1 to resolve the same issue, but it doesn't seem to work; I still see the same error:

RuntimeError: storage_device.is_cpu() || storage_device.is_cuda() || storage_device.is_xpu() || storage_device.is_privateuseone() INTERNAL ASSERT FAILED at "/Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/NestedTensorImpl.cpp":184, please report a bug to PyTorch. NestedTensorImpl storage must be either CUDA, CPU, XPU or privateuseone but got mps:0

Could you please advise on any other solutions for this nested-tensor-specific issue? Thanks!
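One possible workaround, offered as an untested sketch and not as something the maintainers proposed in this thread: avoid nested tensors on MPS altogether by padding the variable-length tensors into a single dense tensor and carrying the true lengths alongside it. The function name pad_instead_of_nest and the example data are hypothetical; pad_sequence is the standard utility from torch.nn.utils.rnn. Downstream code would then need to mask out the padded rows using the lengths.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def pad_instead_of_nest(parts, device):
    """Pad variable-length tensors into one dense tensor and return it with the lengths."""
    # Record the original length of each sequence along dim 0.
    lengths = torch.tensor([p.shape[0] for p in parts])
    # Dense tensor of shape (batch, max_len, ...), padded with zeros.
    padded = pad_sequence(parts, batch_first=True, padding_value=0.0)
    return padded.to(device), lengths.to(device)


# Use MPS when present, otherwise stay on CPU so the sketch still runs.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Hypothetical example data: sequences of length 2 and 3 with 4 features each.
parts = [torch.ones(2, 4), torch.ones(3, 4)]
padded, lengths = pad_instead_of_nest(parts, device)
print(tuple(padded.shape), lengths.tolist())
```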
