
Use PyTorch as logits transpose for ONNX support #141

Merged: 1 commit into openai:main on Sep 26, 2022
Conversation

@mgoin (Contributor) commented on Sep 26, 2022

Because the NumPy-style .T was used for the final transpose of the logits output, torch.onnx.export would fail:

/usr/local/lib/python3.7/dist-packages/torch/onnx/utils.py in _run_symbolic_function(g, block, n, inputs, env, operator_export_type)
   1420         else:
   1421             raise symbolic_registry.UnsupportedOperatorError(
-> 1422                 domain, op_name, opset_version
   1423             )
   1424     except RuntimeError:
UnsupportedOperatorError: Exporting the operator ::numpy_T to ONNX opset version 13 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.

If this line at whisper/model.py:L192 is changed from
logits = (x @ self.token_embedding.weight.to(x.dtype).T).float()
to
logits = (x @ torch.transpose(self.token_embedding.weight.to(x.dtype), 0, 1)).float()
then the export succeeds!

Code to test export:

import whisper
import torch

# Load the smallest model; its hidden size (d_model) is 384
tiny_model = whisper.load_model("tiny")

# Encoder input: an 80-bin log-mel spectrogram with 3000 frames (30 s of audio)
torch.onnx.export(tiny_model.encoder, torch.randn(1, 80, 3000).to("cuda"), "tiny/whisper-encoder.onnx")

# Decoder inputs: token IDs plus a dummy tensor standing in for the encoder output
torch.onnx.export(tiny_model.decoder, (torch.tensor([[50258]]).to("cuda"), torch.randn(1, 384, 384).to("cuda")), "tiny/whisper-decoder_main.onnx")
torch.onnx.export(tiny_model.decoder, (torch.tensor([[50258, 50259, 50359]]).to("cuda"), torch.randn(1, 384, 384).to("cuda")), "tiny/whisper-decoder_language.onnx")
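
For a quick sanity check of the exported encoder, here is a minimal sketch (not part of the PR; it assumes onnxruntime is installed and reuses the file path from the export above):

import numpy as np
import onnxruntime as ort

# Feed a random mel spectrogram through the exported encoder
sess = ort.InferenceSession("tiny/whisper-encoder.onnx")
mel = np.random.randn(1, 80, 3000).astype(np.float32)
(audio_features,) = sess.run(None, {sess.get_inputs()[0].name: mel})
print(audio_features.shape)  # expect (1, 1500, 384) for the tiny model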

@Y-T-G commented on Sep 26, 2022

[[50258, 50259, 50359]]

@mgoin May I know where you obtained these shapes from?

@jongwook (Collaborator)

Thanks for checking! I haven't tried ONNX but the change seems benign.

@jongwook merged commit 9c8183a into openai:main on Sep 26, 2022
@mgoin (Contributor, Author) commented on Sep 26, 2022

[[50258, 50259, 50359]] @mgoin May I know where you obtained these shapes from?

@Y-T-G those shapes were just taken from some sample audio I ran through; I printed the tensor shapes and used them to make the dummy inputs.

Thanks for the accept!

@ArtyomZemlyak
Hi!

Those are not shapes; they are the SOT tokens:

  • 50258 - sot_token
  • 50259 - language token
  • 50359 - task token (50359 is specifically for transcribe)

These 3 tokens are formed here:

langs = tuple(LANGUAGES.keys())
sot_sequence = [sot]
if language is not None:
    sot_sequence.append(sot + 1 + langs.index(language))
if task is not None:
    sot_sequence.append(transcribe if task == "transcribe" else translate)
return Tokenizer(tokenizer=tokenizer, language=language, sot_sequence=tuple(sot_sequence))
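
For reference, a quick way to print these IDs yourself (a sketch assuming whisper's get_tokenizer helper, which builds the sequence above):

from whisper.tokenizer import get_tokenizer

# Multilingual tokenizer for English transcription; should print
# (50258, 50259, 50359) -- sot, language, and task tokens
tokenizer = get_tokenizer(multilingual=True, language="en", task="transcribe")
print(tokenizer.sot_sequence)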

@ArtyomZemlyak

But I have problems with the overall speed of the model, as mentioned in #134.

@mgoin If you can run the ONNX version of the model without bottlenecks, it would be great to see your inference code (or just to know that you had no problems with it and everything runs well).

@nyadla-sys

@ArtyomZemlyak, can you please share the code to run inference on the ONNX files?

@Y-T-G commented on Oct 2, 2022

@mgoin I see. So I assume the values would differ for different model sizes.
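
(A side note, not from the thread: the hidden size used in the dummy inputs can be read off the loaded model, assuming the ModelDimensions fields in openai/whisper; the special token IDs themselves are shared by all the multilingual models.)

import whisper

# d_model differs per size: tiny=384, base=512, small=768, medium=1024, large=1280
model = whisper.load_model("base")
print(model.dims.n_text_state)  # 512 for base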

@David19970306

But I have problems with the overall speed of the model, as mentioned in #134.

@mgoin If you can run the ONNX version of the model without bottlenecks, it would be great to see your inference code (or just to know that you had no problems with it and everything runs well).

Do you have the code for running the ONNX model file? I ran into a problem: how do we convert the logits into the tokens we need?
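
For what it's worth, a minimal greedy-decoding sketch (not from this thread): take the argmax of the last position's logits at each step. It assumes the decoder was exported with dynamic axes so the token sequence can grow, that the inputs are (tokens, audio_features) as in the export code above, and that 50257 is the end-of-transcript token of the multilingual vocabulary.

import numpy as np
import onnxruntime as ort

# Encoder: mel spectrogram -> audio features (random input, just for illustration)
encoder = ort.InferenceSession("tiny/whisper-encoder.onnx")
mel = np.random.randn(1, 80, 3000).astype(np.float32)
(audio_features,) = encoder.run(None, {encoder.get_inputs()[0].name: mel})

# Decoder: greedy loop over the vocabulary axis of the logits
decoder = ort.InferenceSession("tiny/whisper-decoder_main.onnx")
input_names = [i.name for i in decoder.get_inputs()]
tokens = np.array([[50258, 50259, 50359]], dtype=np.int64)  # sot, language, task

for _ in range(224):  # stop at eot or a length cap
    (logits,) = decoder.run(None, dict(zip(input_names, [tokens, audio_features])))
    next_token = int(logits[0, -1].argmax())  # most likely next token
    if next_token == 50257:  # eot
        break
    tokens = np.concatenate([tokens, np.array([[next_token]], dtype=np.int64)], axis=1)

print(tokens)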

@AntyRia commented on Apr 2, 2024

(quoting the PR description above)

Hi, thank you for your contribution. I am a novice who has just started working with the Whisper model. May I ask whether I can save the entire Whisper model as a .pth file and then convert it to an ONNX model? I made some simple attempts, but they didn't seem to succeed.
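
A hedged note, not from the thread: an intermediate .pth is not required; torch.onnx.export works on the in-memory modules directly, as in the test code above. If you do want to save and reload weights first, here is a sketch using PyTorch's standard state_dict round-trip:

import torch
import whisper

model = whisper.load_model("tiny", device="cpu")
torch.save(model.state_dict(), "whisper-tiny.pth")  # weights only

model2 = whisper.load_model("tiny", device="cpu")   # rebuild the architecture
model2.load_state_dict(torch.load("whisper-tiny.pth"))
torch.onnx.export(model2.encoder, torch.randn(1, 80, 3000), "whisper-encoder.onnx")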
