- GuangZhou, CN
Stars
Create code snippets, browse AI prompts, create extension icons and more.
Free and Open Source API and drivers for immersive technology.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
axinc-ai / whisper-export
Forked from zhuzilin/whisper-openvino
openvino version of openai/whisper
zhuzilin / whisper-openvino
Forked from openai/whisper
openvino version of openai/whisper
The collection of pre-trained, state-of-the-art AI models for ailia SDK
Robust Speech Recognition via Large-Scale Weak Supervision
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 serve…
Real-time speech recognition using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Raspberry Pi, VisionFive2, LicheePi4A etc.
Port of OpenAI's Whisper model in C/C++
Custom nodes pack for ComfyUI. This custom node helps to conveniently enhance images through Detector, Detailer, Upscaler, Pipe, and more.
🛋 The AI and Generative Art platform for everyone
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
web voice changer sample by web api and tone.js
VirtualWife is a virtual digital human project that supports Bilibili live streaming as well as OpenAI and Ollama.
The core of the motion capture part of OpenLive3D
Face and Body Tracking for VRM 3D models on the web.
KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
A simple VITS HTTP API, developed by extending Moegoe with additional features.
Fine-Tuning your VITS model using a pre-trained model
VRM HTML5 Viewer with VMD motion files support
vits2 backbone with multilingual-bert
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
CjangCjengh / vits
Forked from jaywalnut310/vits
VITS implementation of Japanese, Chinese, Korean, Sanskrit and Thai