Don't update duration if last timestamp is timestamp_begin #191

Merged

Contributor

@vickianand vickianand commented Sep 29, 2022

There is a bug that causes the start and end timestamps of a segment to be the same when they shouldn't be.
If the only timestamp token found is tokenizer.timestamp_begin, then we shouldn't update the duration (to 0). If there really is no speech, the segment won't be appended by the add_segment function anyway.
Before this fix:
image
After this fix:
image
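
For readers without the screenshots, the rule described above can be sketched roughly as follows. This is a minimal standalone illustration, not the actual diff: the function name `chunk_duration_from_tokens`, the token ID `TIMESTAMP_BEGIN = 50_000`, and the 0.02 s step per timestamp token are all assumptions made for the example.

```python
from typing import Sequence

# Hypothetical values for illustration only; the real IDs come from the tokenizer.
TIMESTAMP_BEGIN = 50_000   # assumed ID of the first timestamp token (offset 0.00 s)
TIME_PRECISION = 0.02      # assumed seconds represented by one timestamp step


def chunk_duration_from_tokens(
    tokens: Sequence[int],
    full_duration: float = 30.0,
    timestamp_begin: int = TIMESTAMP_BEGIN,
    time_precision: float = TIME_PRECISION,
) -> float:
    """Return the duration to assign to a chunk that has no intermediate timestamps."""
    # Every token ID at or above timestamp_begin is treated as a timestamp token.
    timestamps = [t for t in tokens if t >= timestamp_begin]

    # Shorten the duration only if the last timestamp is a real offset.
    # If it is timestamp_begin itself, the offset would be 0, which would make
    # the segment's start and end identical, so keep the full duration instead.
    if timestamps and timestamps[-1] != timestamp_begin:
        return (timestamps[-1] - timestamp_begin) * time_precision
    return full_duration
```

With these assumed values, a chunk whose only timestamp token is `TIMESTAMP_BEGIN` keeps the full 30-second duration, while a chunk ending in `TIMESTAMP_BEGIN + 500` is shortened to about 10 seconds.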

@vickianand vickianand changed the title Don't update duration if last timestamp is same as begin Don't update duration if last timestamp is timestamp_begin Sep 29, 2022
@jongwook jongwook merged commit 2b0c297 into openai:main Sep 29, 2022
@jongwook
Collaborator

Thanks. I think it's another failure case where Whisper didn't sample any intermediate timestamp tokens. Would you be able to upload/email the audio file?

@vickianand
Contributor Author

vickianand commented Sep 29, 2022

Here is the audio file: https://drive.google.com/file/d/1WG17h-3U565uWpl_jW2bkO3eS_l73H7f/view?usp=sharing
For the first three 30-sec chunks, it doesn't predict intermediate timestamps when run with:
--model medium.en --language en --initial_prompt "um, uh"