Diarization process with faster-whisper #236

Open
RustX2802 opened this issue May 24, 2024 · 1 comment

Comments

@RustX2802

I have a working application with a real-time transcription feature based on faster-whisper.
However, after adding the diart pipeline to this application, I get a transcription with no diarization.
For the audio I tested, I expect the output to be as follows:

Expected output

Speaker 0:

Hi. It's Pat. Can I help you?

Speaker 1:

Well, not really.

Speaker 0:

Okay. And what Is this Brandy?

Speaker 1:

Just say there's somebody on the line that needs help?

Speaker 0:

No. Is this Brandy?

Speaker 1:

Yeah?

Speaker 0:

Yeah. Hi. It's Pat.

Actual output:

hi it's pat can i help you uh well not really okay just say there's somebody on the line that needs help no is this brandy yeah yeah hi it's pat

It looks like diart is not working as expected with faster-whisper: the output is not labeled with speaker information.

Can anybody confirm if this is the case?

@juanmc2005
Owner

Hi @RustX2802, faster-whisper is not supported yet, so I'm assuming you implemented the integration manually. Could you share the part of the code where you align the transcription and diarization?
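
For reference, the alignment step in question could look roughly like the sketch below. It is not code from diart or faster-whisper. It assumes the transcription side yields words with start/end times and the diarization side yields speaker turns with start/end times. The names `Word`, `Turn`, `assign_speakers` and `group_by_speaker` are hypothetical, introduced only for illustration. Each word is given the speaker whose turn overlaps it the most, and consecutive words from the same speaker are merged into one line.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A transcribed word with timestamps in seconds (hypothetical container)."""
    text: str
    start: float
    end: float


@dataclass
class Turn:
    """A diarization speaker turn with timestamps in seconds (hypothetical container)."""
    speaker: str
    start: float
    end: float


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns, unknown="unknown"):
    """Label each word with the speaker whose turn overlaps it the most."""
    labeled = []
    for w in words:
        best, best_ov = unknown, 0.0
        for t in turns:
            ov = overlap(w.start, w.end, t.start, t.end)
            if ov > best_ov:
                best, best_ov = t.speaker, ov
        labeled.append((best, w.text))
    return labeled


def group_by_speaker(labeled):
    """Merge consecutive words that share a speaker into (speaker, text) lines."""
    lines = []
    for speaker, text in labeled:
        if lines and lines[-1][0] == speaker:
            lines[-1] = (speaker, lines[-1][1] + " " + text)
        else:
            lines.append((speaker, text))
    return lines


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    words = [Word("hi", 0.0, 0.3), Word("it's", 0.3, 0.5),
             Word("pat", 0.5, 0.8), Word("well", 1.2, 1.5)]
    turns = [Turn("Speaker 0", 0.0, 1.0), Turn("Speaker 1", 1.1, 2.0)]
    for speaker, text in group_by_speaker(assign_speakers(words, turns)):
        print(f"{speaker}: {text}")
```

If this step is missing, or if the word and turn timestamps are not on the same clock (for example, chunk-relative versus stream-relative times), every word may fall outside all turns and the result would be an unlabeled transcript like the one shown above.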
