
fix is_scalar_on_cpu bug #866

Merged

Conversation

@zhaoguochun1995 (Collaborator) commented Jul 2, 2024

Avoid treating a tensor that lives on the device as a CPU scalar tensor: tensor.item() calls syncStream internally, which is costly. If a tensor on the CPU has only one element and its shape is not empty, such as [1], [1,1], [1,1,...], it should be treated as a Scalar and the corresponding scalar kernel should be called. There are also places in PyTorch core where a scalar is wrapped into a tensor but not marked as wrapped.
see https://github.com/pytorch/pytorch/blob/8f70bf7a943799b5cd870952d39f36361de4b87f/torch/csrc/lazy/core/tensor.cpp#L386
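
The actual diff is not shown in this thread, but the check described above could look roughly like the following C++ sketch. The name is_scalar_on_cpu is taken from the PR title and the follow-up commit message; the body and signature here are assumptions, not the merged code.

```cpp
// Sketch only (assumed, not the merged implementation): a tensor qualifies
// as a "scalar on CPU" only when it actually lives on the CPU and holds a
// single element (shape [], [1], [1,1], ...). This keeps device tensors out
// of the scalar path, so tensor.item() -- which synchronizes the stream and
// is costly -- is never called on them. Checking numel() == 1 also covers
// scalars that core wrapped into CPU tensors without setting the
// wrapped-number flag.
#include <ATen/ATen.h>

inline bool is_scalar_on_cpu(const at::Tensor& t) {
  return t.defined() && t.is_cpu() && t.numel() == 1;
}
```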

@zhaoguochun1995 force-pushed the zgc/dipu_fix_is_scalar_on_cpu_bug branch from 59e7112 to 216e244 on July 3, 2024 07:48
@lljbash (Collaborator) left a comment

The numel = 1 criterion isn't aligned with PyTorch, is it?

@zhaoguochun1995 (Collaborator, Author) commented Jul 4, 2024

The numel = 1 criterion isn't aligned with PyTorch, is it?

If a tensor on the CPU has only one element and its shape is not empty, such as [1], [1,1], [1,1,...], it should be treated as a Scalar and the corresponding kernel function should be called.

The main goal here is to call the scalar version of the kernel function. A tensor with shape=[1] on device='cpu' should be handled as a scalar in this case (e.g., via diopiAddScalar) rather than as a device tensor.

And it is not misaligned: for such a tensor, numel in PyTorch also returns 1, not 0.
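
As a rough illustration of the dispatch described above (the helper and the ATen-level calls here are illustrative stand-ins; the real code dispatches to DIOPI kernels such as diopiAddScalar and diopiAdd):

```cpp
// Illustrative sketch (assumed, not the PR's code): route a one-element CPU
// tensor through the scalar kernel, and keep genuine device tensors on the
// tensor-tensor path.
#include <ATen/ATen.h>

at::Tensor add_like(const at::Tensor& self, const at::Tensor& other) {
  if (other.is_cpu() && other.numel() == 1) {
    // item() on a CPU tensor does not synchronize any device stream.
    return at::add(self, other.item());  // scalar path, e.g. diopiAddScalar
  }
  return at::add(self, other);           // tensor path, e.g. diopiAdd
}
```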

@caikun-pjlab merged commit 89a35ad into DeepLink-org:main on Jul 9, 2024
30 checks passed
Wrench-Git pushed a commit to DeepLink-org/deeplink.framework.dev that referenced this pull request Jul 10, 2024
* fix is_scalar_on_cpu bug

* Use is_scalar_on_cpu inline function instead of direct judgment