Stars
Zero-shot voice conversion and singing voice conversion with in-context learning
A plugin for multilingual translation of ComfyUI. It translates the resident menu bar, search bar, right-click context menu, nodes, etc.
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…
Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable.
g2pC: A Context-aware Grapheme-to-Phoneme Conversion module for Chinese
A multilingual (97 languages) tool for automatic recognition and segmentation of text content. A powerful automatic segmentation tool for mixed-language TTS text (97 languages).
A pipeline covering dataset gathering, data annotation, model training, and model evaluation for viseme (visual sound phoneme) classification
Code for paper "Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition"
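Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin (Bopomofo) or Pinyin (INTERSPEECH 2022)
This is a PyTorch-based speech synthesis project that uses VITS, a speech synthesis method. This end-to-end model is very simple to use: it needs no complicated steps such as text alignment and supports one-click training and generation, which greatly lowers the barrier to learning.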
FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry
A guide to grapheme-to-phoneme conversion and the phoneme list for the ACE singing voice synthesis engine
Transformer-based OCR, with an extended application to official seals (seal recognition)
Python Package for Airborne RGB machine learning
PyTorch reimplementation of audio-driven face mesh or blendshape models, including Audio2Mesh, VOCA, etc.
Multi-speaker Speech Synthesis Using VITS (KO, JA, EN, ZH)
vits chinese, tts chinese, tts mandarin: the easiest-to-train speech synthesis system ever, with the best audio quality
A fully working pytorch implementation of NaturalSpeech (Tan et al., 2022)