Stars
Replicated and optimized community version of Advanced Locomotion System V4 for Unreal Engine 5.4 with additional features & bug fixes
MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
[CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space, a fast and high-quality motion diffusion model
Official implementation for "Generating Diverse and Natural 3D Human Motions from Texts (CVPR2022)."
Drive your metahuman to speak within 1 second.
Foundational model for human-like, expressive TTS
A simple VITS HTTP API, developed by extending Moegoe with additional features.
Faster Tortoise inference than Tortoise Fast Fork
AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of advanced features, such as a settings page, low VRAM support, D…
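Continues training on the Biaobei (标贝) data and improves the original FastSpeech2 model by introducing prosody representations and a prosody prediction module, making Chinese pronunciation more vivid and rhythmic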
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
Leading free and open-source face recognition system
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Chinese speech pretrained models
Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.
Code that generates phonemes from audio features.
Real-time speech recognition using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Raspberry Pi, VisionFive2, LicheePi4A, etc.
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.