add hotwords feature #2070

Open

jax-explorer wants to merge 2 commits into main
Conversation

jax-explorer

Hello!
During transcription I often encounter proprietary or newly coined vocabulary that Whisper cannot handle well. I searched for solutions, and the community offered two options:

Fine-tuning the model: This approach is costly, and it's not practical to fine-tune the model every time a new term emerges.

Using initial_prompt: However, initial_prompt only applies to the first window, so if specialized terms don't appear at the beginning, this method is ineffective (see the sketch below).
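For reference, the second option with the stock API looks like this (a minimal sketch; talk.mp3 is a placeholder file):

import whisper

model = whisper.load_model("base")

# initial_prompt conditions only the first 30-second window; terms that
# first appear later in the audio get no benefit from it.
result = model.transcribe("talk.mp3", initial_prompt="comfyUI, Kalichain")
print(result["text"])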

Upon reviewing other transcription models, I found that supporting hotwords is common practice, so I implemented that feature here. My approach is to prepend hotword prompts to every transcription window. Since the context has a maximum length, the hotwords occupy the space previously used by the prefix, so they take effect whenever prefix isn't set. In my testing, this indeed resolved the specialized-vocabulary issue in my scenario.
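Roughly, the change looks like this (a simplified sketch of the edit to whisper/decoding.py, not the exact diff; names follow the existing DecodingTask attributes):

# Sketch: inside DecodingTask._get_initial_tokens
tokens = list(self.sot_sequence)

if self.options.prefix is None and (hotwords := self.options.hotwords) is not None:
    # Hotwords reuse the budget otherwise taken by prefix tokens and are
    # prepended after <|startofprev|> for every window, not only the first.
    hotwords_tokens = self.tokenizer.encode(" " + hotwords.strip())
    # Keep at most half the text context so the window's own tokens still fit.
    hotwords_tokens = hotwords_tokens[: self.n_ctx // 2 - 1]
    tokens = [self.tokenizer.sot_prev] + hotwords_tokens + tokens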

The following is the community discussion on this issue:
#1477
https://discuss.huggingface.co/t/adding-custom-vocabularies-on-whisper/29311
https://stackoverflow.com/questions/73833916/how-can-i-give-some-hint-phrases-to-openais-whisper-asr

@jax-explorer
Author

@jongwook Hello, please check out this PR.

@James-Shared-Studios

Would this be a duplicated effort, since there is already a parameter that serves the same purpose, condition_on_previous_text? If condition_on_previous_text is set to True, the previous output of the model is provided as a prompt for the next window. Correct me if I'm wrong. Thank you.

@jax-explorer
Author

@James-Shared-Studios This isn't used to add context; it's used to add hot words so that Whisper can recognize new words and terms when they come up. For example, comfyUI is a new word: it is a powerful and modular Stable Diffusion GUI and backend. If you don't add it as a hotword, it won't be recognized correctly.
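With this PR's branch installed, usage looks like this (a sketch; video.mp4 is a placeholder, and transcribe forwards the hotwords option through to the decoder):

import whisper

model = whisper.load_model("base")

# hotwords (added by this PR) is injected before every window, so the term is
# picked up even when it first shows up late in the audio; by contrast,
# condition_on_previous_text only feeds back what the model already produced.
result = model.transcribe("video.mp4", hotwords="comfyUI")
print(result["text"])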

@greduan

greduan commented Apr 1, 2024

I tried it with a video where the following words were misspelled:

"Kalichain"
=>
"cl chain"
"cali chain"

"Kalicertif"
=>
"c cerff"
"cl ciff"
"Cali certif"

"Kalismarket"
=>
"C's Market"

"Kalishare"
=>
"Cali share"

"Kalistoken"
=>
"Cali's token"

"kijiji"
=>
"kiji"

And indeed, with the following args, these words were no longer misspelled:

whisper video.opus --hotwords "Kalichain, Kalicertif, Kalismarket, Kalishare, Kalistoken, kijiji, MEXC, Kalissa, FireHustle"

But it didn't work 100% of the time; sometimes they were still misspelled. Notably, Kalicertif came out as Kalistertif.

@JiweiZh

JiweiZh commented Apr 8, 2024

So, when passing a series of proper nouns through the hotwords option, what is the maximum length that can actually be supported? @jax-explorer

@jax-explorer
Author

@JiweiZh It depends on the n_text_ctx value in the model's dims.
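You can inspect it directly; all of the released checkpoints ship with n_text_ctx = 448, and this PR caps hotwords at about half of that budget (n_ctx // 2 - 1 tokens):

import whisper

model = whisper.load_model("base")

# Per-window token budget shared by the previous-text prompt / hotwords
# and the tokens decoded for the current window.
print(model.dims.n_text_ctx)  # 448 for the released checkpoints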

@sanghyun-son

@jax-explorer Hello, I find this commit very useful and hope it gets merged soon. Currently, I'm using your forked repository to enjoy this feature. BTW, I have some questions about your implementation.

  1. You say that you occupy the space used by the prefix, but I'm not sure where the prefix comes from. Is condition_on_previous_text related to prefix?
  2. The current implementation divides n_ctx by 2 and assigns prompt and hotwords evenly. If I want to use more hotwords, is it valid to change n_ctx // 2 to some other number? For example, I would skip the prompt and use hotwords only whenever hotwords are provided, like below:
if (hotwords := self.options.hotwords) is not None:
    hotwords_tokens = self.tokenizer.encode(" " + hotwords.strip())
    hotwords_tokens = hotwords_tokens[: self.n_ctx]  # Use more hotwords
    tokens = (
        [self.tokenizer.sot_prev]
        + hotwords_tokens
        # + (prompt_tokens[-(self.n_ctx // 2 - 1) :] if self.options.prompt is not None else [])
        + tokens
    )

Thanks!
