
Avoid keeping redundant copies of model weights in memory during load #42

Merged: 3 commits into openai:main, Sep 23, 2022

Conversation

@drdaxxy (Contributor) commented Sep 22, 2022

Before this change, whisper.load_model keeps the entire checkpoint file in scope until after the tensors have been moved to the device, so the weights are briefly held in memory twice.

This might not seem important for sample code, but this change makes it possible to load the large model on a free Colab instance, which otherwise runs out of memory (inference afterwards is no problem)!
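For illustration, here is a minimal sketch of the pattern the PR describes: extract the state dict, drop the checkpoint dict, and release the CPU copy of the weights before the device transfer, so only one copy is resident at a time. The checkpoint key name and the toy architecture are assumptions for the example, not Whisper's actual code.

```python
import torch
from torch import nn

def load_model_lean(checkpoint_path: str, device: str = "cpu") -> nn.Module:
    """Sketch of the memory-saving load pattern described above.

    Assumes the checkpoint is a dict with a "model_state_dict" key;
    both the key and the stand-in architecture are illustrative only.
    """
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    state_dict = checkpoint["model_state_dict"]
    del checkpoint  # drop the outer dict as soon as the weights are extracted

    model = nn.Linear(4, 4)  # stand-in for the real architecture
    model.load_state_dict(state_dict)
    del state_dict  # release the CPU copy before transferring to the device
    return model.to(device)
```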

@nlgtuankiet commented

I can confirm that it works!

Before: using large model on a free Colab instance -> Out of memory error
After: using large model on a free Colab instance -> Run successfully!

I'm not sure whether these changes affect the output quality; could someone run a comparison to confirm?

@jongwook (Collaborator) commented Sep 23, 2022

This is great! I wasn't thinking about Colab memory limits.

Let me make some cosmetic changes, test on my side, and then merge. Thanks!

@jongwook merged commit f296bcd into openai:main on Sep 23, 2022