Support for Amazon Transcribe format for multiple channels (not the same as speakers) #235
Comments
Hi @gittes
The AWS adapter, like many of the other adapters, was made thanks to community open-source contributions. See PR #120.
Remove the incremental counter
To remove the incremental counter, this line should be changed: packages/stt-adapters/amazon-transcribe/index.js#L140
-speaker: paragraph.speaker ? `Speaker ${ paragraph.speaker }` : `TBC ${ i }`,
+speaker: paragraph.speaker ? `Speaker ${ paragraph.speaker }` : `U_UKN`,
Doesn't have to be AWS Adapter
There's a guide on how to make one from scratch under docs/guides/adapters.md for context, and the code for the existing AWS one is at packages/stt-adapters/amazon-transcribe.
AWS 2 channels json format
To accommodate that, it would be a matter of modifying the AWS STT Adapter in a way that
Don't want to speak for @jamesdools and @emettely, but I am guessing a PR would be welcome, if you've got the time/capacity? As a side note, at the moment I am mostly working on this alternative version, pietrop/slate-transcript-editor. It doesn't provide any adapters as part of the core components, but I've extracted some of the adapters from this module, e.g. pietrop/aws-to-dpe and pietrop/gcp-to-dpe, for when that type of conversion might be needed, e.g. working with AWS STT or Google STT.
Found your wonderful software, but had a minor issue when loading an Amazon Transcribe transcript that used the variant format for independent audio channels, as opposed to the typical speakers format.
Impressively, your software still loaded the rows of the transcript correctly. However, it gave every speaker label a unique number suffix, so it was impossible to relabel the speaker labels all at once, and tracking and correcting a very long transcript by hand is an almost insurmountable task.
The channel format is used when each speaker is on a dedicated channel/track in the source audio file:
https://docs.aws.amazon.com/transcribe/latest/dg/how-channel-id.html
Excerpt from referred AWS doc showing the JSON format:
It has "channel_labels" (object) -> "channels" (array/list) -> "channel" (object), with each channel containing its own "items" for words, as opposed to "items" being declared once in the other format, and it uses "channel_label" instead of "speaker_label" for speakers.
Could you please accommodate the Amazon Transcribe channel format variant and at least have speaker ID labels be consistent per channel, if not matching the "channel_label"?
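To make the request concrete, here is a rough sketch of how the channel variant could be flattened into words that share one label per channel. It is not the existing adapter code: `wordsFromChannels` and the output field names (`start`, `end`, `text`, `speaker`) are hypothetical, and the item fields (`type`, `start_time`, `end_time`, `alternatives[0].content`) are assumed to match what the AWS doc linked above shows.

```javascript
// Hypothetical helper, not part of the existing amazon-transcribe adapter.
// Assumes the channel-identification layout described above:
// results.channel_labels.channels[] -> { channel_label, items[] }
function wordsFromChannels(transcribeJson) {
  const channels = transcribeJson?.results?.channel_labels?.channels ?? [];
  const words = [];

  for (const channel of channels) {
    // One stable label per channel, so relabelling one speaker
    // in the editor can apply to every paragraph on that channel.
    const speaker = `Speaker ${channel.channel_label}`;

    for (const item of channel.items ?? []) {
      // Punctuation items carry no timestamps; this sketch skips them.
      if (item.type !== 'pronunciation') continue;

      words.push({
        start: parseFloat(item.start_time),
        end: parseFloat(item.end_time),
        text: item.alternatives[0].content,
        speaker,
      });
    }
  }

  // Interleave the channels chronologically by word start time.
  return words.sort((a, b) => a.start - b.start);
}
```

Calling `wordsFromChannels(json)` on a two-channel transcript would give a single time-ordered word list in which every word from `ch_0` carries `Speaker ch_0` and every word from `ch_1` carries `Speaker ch_1`. Grouping those words into paragraphs whenever the speaker changes would then be a separate step.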
Just for reference here's the doc for speaker identification format:
https://docs.aws.amazon.com/transcribe/latest/dg/how-diarization.html