Avoid keeping redundant copies of model weights in memory during load #42
Before this change, `whisper.load_model` keeps the whole weights file in scope until after the tensors have been moved to the device. This might not seem important for sample code, but the change makes it possible to load the `large` model on a free Colab instance, which otherwise runs out of memory (inference afterwards is no problem)!