https://qinhsiu.github.io
Awesome-TTS
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
Official PyTorch implementation of BigVGAN (ICLR 2023)
Best-practice TTS based on BERT and VITS, with some features from Microsoft's NaturalSpeech; supports ONNX streaming output.
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
A deep neural network architecture for low-latency audio processing
so-vits-svc fork with realtime support, improved interface and more features.
A simple GUI application that slices audio with silence detection
SoftVC VITS Singing Voice Conversion
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
AcademiCodec: An Open Source Audio Codec Model for Academic Research
A Python wrapper for the high-quality vocoder "World"
🔊 Text-Prompted Generative Audio Model
Robust Speech Recognition via Large-Scale Weak Supervision
The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation"
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
AudioLDM: Generate speech, sound effects, music and beyond, with text.
A library for audio and music analysis and feature extraction.
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
SpeechGPT Series: Speech Large Language Models
Core Engine of Singing Voice Conversion & Singing Voice Clone
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.