Look up the tensor's device member inside the Tensor.is_pinned() implementation instead of accepting an outside input #128988
Labels
module: cuda
Related to torch.cuda, and CUDA support in general
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🚀 The feature, motivation and pitch
Background: We recently found a bug where our cpu_test_tensor.is_pinned() call unexpectedly consumed considerable memory. The original cpu_test_tensor was on device="cpu". It turned out, however, that is_pinned() accepts an optional device input parameter whose internal implementation defaults to device="cuda", so a CUDA context was likely created, consuming additional memory. We are fixing the immediate problem in PR 128896 by passing device=cpu_tensor.device to the is_pinned() call.
Proposal: Drop the optional device input to is_pinned() and instead have it automatically look up the device member field of the self Tensor object. This would make for a more robust API in which the user can no longer be silently defaulted to an incorrect device.
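The proposed behavior can be sketched with a hypothetical stand-in class (this FakeTensor is purely illustrative, not the real ATen implementation; the pinned-memory bookkeeping is assumed):

```python
class FakeTensor:
    """Illustrative stand-in for torch.Tensor (hypothetical, not PyTorch code)."""

    def __init__(self, device, pinned=False):
        self.device = device
        self._pinned = pinned  # assumed internal flag for page-locked host memory

    def is_pinned(self):
        # Proposal: consult self.device rather than a caller-supplied device
        # argument, so no default (e.g. "cuda") can trigger unwanted
        # context initialization for a CPU tensor.
        if self.device != "cpu":
            # Only CPU (host) memory can be page-locked/pinned.
            return False
        return self._pinned


cpu_tensor = FakeTensor(device="cpu")
print(cpu_tensor.is_pinned())  # → False
```

Note the call site takes no device argument at all: the query is answered entirely from the tensor's own state, which is the robustness property the proposal is after.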
Alternatives
No response
Additional context
See D58687049 for a repro of the original memory problem.
cc @ptrblck @msaroufim