-
Are you using Whisper?
-
Redialing on this one; it seems the connection broke 😁
-
It's a known issue that Whisper's timestamps are not very precise; see e.g. openai/whisper#139. Note that Purfview's Faster Whisper has implemented some improvements in this area! I think your waveform with text looks pretty okay. You could perhaps extend the duration of all cues by 50 ms.
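To make the "extend by 50 ms" idea concrete, here is a minimal sketch of what such an adjustment might look like outside of SE. It is not Subtitle Edit's own implementation; the `Cue` class, `extend_cues` function, and `min_gap_ms` parameter are hypothetical names used only for illustration, and the cue times are made-up sample values.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """Hypothetical subtitle cue with times in milliseconds."""
    start_ms: int
    end_ms: int
    text: str


def extend_cues(cues, extra_ms=50, min_gap_ms=0):
    """Return new cues whose end times are pushed later by extra_ms.

    An extension is capped so a cue does not run into the start of the
    next cue (minus min_gap_ms). A cue is never shortened.
    """
    ordered = sorted(cues, key=lambda c: c.start_ms)
    result = []
    for i, cue in enumerate(ordered):
        new_end = cue.end_ms + extra_ms
        if i + 1 < len(ordered):
            limit = ordered[i + 1].start_ms - min_gap_ms
            # Cap the extension at the next cue's start, but keep at
            # least the original end time.
            new_end = max(cue.end_ms, min(new_end, limit))
        result.append(Cue(cue.start_ms, new_end, cue.text))
    return result


# Made-up sample cues: the first one has only 20 ms before the next cue.
sample = [
    Cue(1000, 2500, "Hello there."),
    Cue(2520, 4000, "How are you?"),
    Cue(6000, 7000, "Fine."),
]

for cue in extend_cues(sample):
    print(cue.start_ms, cue.end_ms, cue.text)
```

Capping at the next cue's start means closely spaced cues gain less than the full 50 ms, which avoids overlapping subtitles at the cost of a smaller fix for those cues.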
-
As can be seen from the image, SE sometimes places a subtitle at the beginning of the speech and sometimes doesn't. It is more or less the same story for the end of a subtitle (it sometimes ends too soon).
Sometimes a longer stretch of speech ends up as two subtitles, both with two rows, but one lasts 5 seconds and the other 2, meaning the first timing is too long and the second is too short.
Is it possible to improve precision? Can it be done through options or switches? Is this dependent on the language model, or does SE post-process the timings?