
Shorter segments? #15

Closed
ronyfadel opened this issue Feb 22, 2023 · 7 comments

Comments

@ronyfadel

Would it be possible to produce shorter segments? (some are way too long)

@guillaumekln
Contributor

There is no option that can effectively prevent this. The parameter length_penalty can help to some extent but it will not force the model to predict a shorter segment.
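For context, a minimal sketch of how a GNMT-style length penalty typically enters beam search scoring. This is an illustration of the general mechanism, not CTranslate2's exact implementation; the `normalized_score` helper and the sample numbers are mine:

```python
def normalized_score(cum_logprob: float, length: int, length_penalty: float) -> float:
    """Length-normalized beam score in the GNMT style.

    cum_logprob is the sum of token log-probabilities (<= 0). A larger
    normalizer makes long hypotheses look less bad, which is why
    length_penalty nudges segment length rather than forcing it.
    """
    return cum_logprob / (((5 + length) / 6) ** length_penalty)


# Two hypotheses with the same average per-token log-probability.
# Lowering length_penalty below 1 makes the long hypothesis score worse
# relative to the short one; raising it above 1 does the opposite.
short_hyp = normalized_score(-5.0, 5, length_penalty=1.0)
long_hyp = normalized_score(-20.0, 20, length_penalty=1.0)
```

Because the effect is only a re-ranking of beam candidates, no value of the penalty can guarantee a shorter segment.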

Do you get a different output with openai/whisper? If so, it would be great if you could provide a way to reproduce it.

@ronyfadel
Author

There have been discussions in openai/whisper suggesting that you can skew the model toward shorter segments by tweaking max_text_token_logprob: openai/whisper#435 (reply in thread)

Is something similar possible in the faster-whisper codebase?
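The idea from that thread can be sketched as a cap on the per-token log-probabilities during decoding: clamp every text token to a ceiling so that timestamp tokens, which close a segment, become relatively more likely. This is a hypothetical illustration of the mechanism, not an API offered by faster-whisper; `timestamp_begin` stands for the index of the first timestamp token in Whisper's vocabulary:

```python
def cap_text_token_logprobs(logprobs, timestamp_begin, max_text_token_logprob):
    """Clamp text-token log-probabilities to a ceiling.

    Tokens with index >= timestamp_begin (Whisper's timestamp tokens) are
    left untouched, so they win more often and the decoder ends segments
    earlier. Lowering max_text_token_logprob skews toward shorter segments.
    """
    return [
        min(lp, max_text_token_logprob) if i < timestamp_begin else lp
        for i, lp in enumerate(logprobs)
    ]
```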

@ronyfadel
Author

I just saw the addition of length_penalty today. How should it be used? Its default value is set to 1.

@ronyfadel
Author

@guillaumekln from my testing, I've also had great results using the token_timestamps flag here

To be honest, I don't know what CTranslate2 does to the underlying model, or whether such capabilities are lost in the conversion.

@guillaumekln
Contributor

So far we have not implemented any features or parameters that are not available in the reference implementation from openai/whisper. Currently there is no easy way for users to tweak max_text_token_logprob or enable token-level timestamps; both would require changes to the C++ implementation in CTranslate2.

Regarding word-level timestamps, I'm following this development in the openai/whisper repo. If it is merged, I will look into supporting it here as well.

Also, you can ignore my comment regarding length_penalty. It is not relevant to your issue, since you want the model to output more timestamps, not shorter generated sequences.

@guillaumekln
Contributor

I just merged the word-level timestamps branch so the segments can now be as short as you want.
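To illustrate how word-level timestamps enable arbitrarily short segments: once each word carries its own `start`/`end` times (as with faster-whisper's `word_timestamps=True` output), you can re-cut segments yourself. The `regroup` helper and the duration cap below are my own sketch, not part of the library:

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Mirrors the word/start/end fields of faster-whisper's word objects."""
    word: str
    start: float
    end: float


def regroup(words, max_duration=3.0):
    """Re-cut word-level timestamps into segments no longer than max_duration."""
    segments, current = [], []
    for w in words:
        # Start a new segment if adding this word would exceed the cap.
        if current and w.end - current[0].start > max_duration:
            segments.append(current)
            current = []
        current.append(w)
    if current:
        segments.append(current)
    return [
        ("".join(w.word for w in seg), seg[0].start, seg[-1].end)
        for seg in segments
    ]
```

With real transcription output you would collect the words from every `segment.words` list and feed them to a helper like this.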

@stephanedebove

stephanedebove commented Jun 20, 2024

Hi @guillaumekln, do you mind explaining what you mean by "I just merged the word-level timestamps branch so the segments can now be as short as you want"?

How do we control their length now?

And why, a couple of months after this reply, did you say in #452 (comment) that "There is no option to control the segment length."?
