(简体中文|English)
This directory contains speech applications for a variety of scenarios.
- audio tagging - multi-label tagging of an audio file
- automatic_video_subtitiles - generate subtitles from a video
- metaverse - 2D AR with TTS
- punctuation_restoration - restore punctuation from raw text
- speech recognition - transcribe an audio file to text
- speech translation - end-to-end speech translation
- story talker - book reader based on OCR and TTS
- style_fs2 - multi-style control for the FastSpeech2 model
- text_to_speech - convert text into speech