fix: remove hallucinations from silent audio #1588

Closed
wants to merge 2 commits into from
Update whisper.cpp
ex3ndr committed Dec 3, 2023
commit 8f6c0498665c9a3816e2913008d21f778d3375ee
2 changes: 1 addition & 1 deletion whisper.cpp
@@ -4589,7 +4589,7 @@ static void whisper_process_logits(

// suppress sot and nosp tokens
logits[vocab.token_sot] = -INFINITY;
-logits[vocab.token_nosp] = -INFINITY; // TODO: ignore this token for now
+// logits[vocab.token_nosp] = -INFINITY; // Uncommenting this would produce hallucinations on silent audio
Collaborator

Although token_nosp has been suggested as the direction for solving hallucinations, simply removing the suppression of token_nosp is problematic. First, we want the model's output to contain only meaningful, visible tokens (apart from timestamps). With the suppression removed, token_nosp can appear in the output, which is something we do not want. Second, the key to solving hallucinations is finding a way to skip silence. token_nosp tells us how likely it is that a segment is silent, so that we can skip it. Removing the suppression of token_nosp without taking any other action therefore cannot solve the hallucination problem.
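To make the point above concrete, here is a minimal sketch of one way the two goals could be combined: read the probability of token_nosp from the logits first, then suppress the token as before so that it never reaches the output. This is not code from whisper.cpp or from this PR. The helper names `nosp_prob_then_suppress` and `should_skip_segment`, the `threshold` parameter, and the use of a plain `std::vector<float>` for the logits are all assumptions made for illustration, and where in the decoding loop such a check would run is left open.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of whisper.cpp): returns the softmax
// probability of the no-speech token, then suppresses that token so it
// can never be sampled into the visible output.
// Assumes at least one logit is finite.
static float nosp_prob_then_suppress(std::vector<float> & logits, std::size_t token_nosp) {
    assert(token_nosp < logits.size());

    // numerically stable softmax: subtract the max logit before exponentiating
    const float max_logit = *std::max_element(logits.begin(), logits.end());

    double sum = 0.0;
    for (const float l : logits) {
        // exp(-inf) == 0, so tokens that are already suppressed contribute nothing
        sum += std::exp(double(l) - double(max_logit));
    }

    const float p = float(std::exp(double(logits[token_nosp]) - double(max_logit)) / sum);

    // keep the original behaviour: the token itself must not appear in the output
    logits[token_nosp] = -INFINITY;

    return p;
}

// Hypothetical decision rule: treat the segment as silence when the
// no-speech probability exceeds a caller-chosen threshold.
static bool should_skip_segment(float nosp_prob, float threshold) {
    return nosp_prob > threshold;
}
```

With something along these lines, the caller could drop or shorten a segment when `should_skip_segment` returns true, instead of letting the decoder emit text for silence. The right threshold, and whether additional signals are needed, would have to be determined experimentally.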


// [TDRZ] when tinydiarize is disabled, suppress solm token
if (params.tdrz_enable == false) {