Is it possible to add a registerable api at the beginning of torch.save #117840
Comments
@jbschlosser is PrivateUse1 the right label for such issues?
@ezyang could you please take a look at this? Thanks!
Can you explain to me how torch.save works today? IIUC, wouldn't we convert the tensor to cpu first before writing it out?
During torch.save we go through pytorch/torch/serialization.py, lines 828 to 832 (at ee1dbb2), and after that, lines 858 to 862 (the embedded snippets were not captured here). However, it means [the remainder of this comment was not captured].
Naively, why doesn't the storage conversion to cpu preserve the metadata?
Sorry, I may not have been clear at first. Our storage data may be larger than it is on the CPU [figure not captured]. So we want to add registerable APIs and allow our backend to convert the data before it is saved.
Why not preserve the padding when you convert to cpu?
Because we want our saved tensors to be loadable on other devices, like cpu.
This doesn't seem like a big deal to me? If you have padding, but the tensor metadata (e.g., storage offset, size) is set appropriately, a CPU tensor would handle it correctly. And if you actually convert to cpu, e.g. with .cpu(), we wouldn't preserve the padding in this case.
At present, we can only use monkey patches to solve this mismatch problem, along the lines of the sketch below.
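A minimal sketch of such a monkey patch (the original snippet in this comment was not captured; it assumes the default privateuseone device-type name for the privateuse1 backend, and torch.utils._pytree is a private utility used here only for brevity):

```python
import torch
import torch.utils._pytree as pytree

_original_save = torch.save

def _patched_save(obj, f, *args, **kwargs):
    def to_plain(x):
        # Drop the backend-specific layout and move the data to CPU so
        # the checkpoint can be loaded on any device.
        if isinstance(x, torch.Tensor) and x.device.type == "privateuseone":
            return x.contiguous().cpu()
        return x

    return _original_save(pytree.tree_map(to_plain, obj), f, *args, **kwargs)

torch.save = _patched_save
```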
We hope there can be an API of this sort so that we can do some preprocessing on the data. @ezyang
@ezyang, what do you think? Can you give me some suggestions?
In CPU land, I can have a CPU tensor that points to a storage buffer with padding before/after the tensor data. In this case, I have a non-zero storage offset. When I save this tensor, by default I include the padding. You can observe this in the following example:
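(The embedded snippet did not survive extraction; the following is a reconstruction of the kind of example being described, with illustrative values.)

```python
import io
import torch

buf = torch.arange(10, dtype=torch.float32)   # a 10-element storage
t = buf[2:8]                                  # a view with padding on both sides
print(t.storage_offset())                     # 2: non-zero storage offset

f = io.BytesIO()
torch.save(t, f)
f.seek(0)
loaded = torch.load(f)

print(loaded)                                 # the same 6 values
# The whole 10-element storage was serialized, padding included:
print(loaded.untyped_storage().nbytes())      # 40 == 10 * 4 bytes
```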
Nothing you have told me thus far has therefore convinced me that what you want to do isn't expressible without a hook.
In fact, our padding is more complex. We may modify the structure of the data in a variety of ways depending on the scenario, not just by adding padding at the end. As shown in the figure above [not captured], the layouts are complex and it is hard to get at the actual data. Therefore, we want to add an entry point so that we can restore the data back when using torch.load.
The first padding scenario can be accurately expressed using strides that make you skip two blue widths. The second padding scenario can also be expressed by using strides to express the transposition, and also expanding the stride to skip past the blue padding. I can give concrete code examples if you still don't get it.
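For instance, a sketch of the strided-view approach (the figure isn't available, so the shapes and the amount of padding below are assumptions, not the original layout):

```python
import torch

storage = torch.zeros(4, 6)                        # 4 rows of 6 floats
storage[:, :4] = torch.arange(16.).reshape(4, 4)   # last 2 columns per row are padding

# View the 4x4 payload inside the padded 4x6 buffer: a row stride of 6
# skips the two padding columns at the end of every row.
payload = storage.as_strided(size=(4, 4), stride=(6, 1))
print(payload)

# A transposed layout is just another stride choice on the same buffer.
transposed = storage.as_strided(size=(4, 4), stride=(1, 6))
print(torch.equal(transposed, payload.t()))        # True
```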
I know these can be expressed in such ways, but we do not intend to increase the learning cost for our clients. We hope to keep these specific operations implicit. Regardless of which type of padding we use, others shouldn't need to add skips or strides explicitly.
@ezyang, so what's your opinion now?
I don't think having accurate strides makes client code more complicated, IMO
@ezyang, you're familiar with torch so it's not complicated to you, but maybe it is for our clients. We just want to take on the code-adaptation work ourselves rather than pushing it to clients; they only need to specify the device as privateuse1 and everything else works as usual. It will not affect existing torch logic, but merely provide an interface to preprocess the data before it is saved.
In general, users are expected not to deal explicitly with strides. If I add two values with unusual striding, I will typically preserve this striding (but make the result contiguous). If you are doing work to automatically insert padding when you predict it is necessary, this would be the same as just making functions output non-standard strides differently than pytorch on cpu/cuda.

Look, if you tell me, "Edward, I understand this is hypothetically better, but we already wrote lots of code managing different storage padding at the storage level, and we'd have to refactor all of it, which we don't have time to do", I'd be like, OK, fine, this seems like a decent tradeoff to make if you've painted yourself into this situation. But then, what's so bad about an extra monkey patch on top?

If we're putting something into PyTorch core, we're going to do it right, because we're going to have to maintain it forever afterwards.
🚀 The feature, motivation and pitch
Similar to #99808, we may have some special data in our tensors. We hope our tensors can be loaded by other devices like cpu, so we need to change our tensors before torch.save. Right now we can only use monkey patches to do that, because the only registerable step in torch.save comes after the data (storage) has been saved. I wonder if there could be a registerable API for the privateuse1 backend at the beginning of torch.save, so that we can convert our data from its special structure to a general form before the actual data is saved.

Alternatives
Add a registerable API at the beginning of torch.save, for example prepare_for_save.
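A sketch of what this could look like. The name prepare_for_save comes from the issue text; the registration function below is hypothetical and does not exist in torch today:

```python
import torch

def prepare_for_save(obj):
    # Convert the backend-specific storage layout to a general form
    # that any device (e.g. cpu) can load.
    if isinstance(obj, torch.Tensor) and obj.device.type == "privateuseone":
        return obj.contiguous().cpu()
    return obj

# Hypothetical registration point, which torch.save would invoke on
# every object before any storage bytes are written:
# torch.serialization.register_prepare_for_save(prepare_for_save)
```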
Additional context
No response
cc @mruberry @mikaylagawarecki