ABC0408

🐢

Focusing

Лэюань ABC0408

Less is More

8 followers · 172 following

Starred repositories

myshell-ai / DreamVoice

Python 21 2 Updated Aug 26, 2024

bfs18 / e2_tts

Python 39 6 Updated Sep 3, 2024

huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

Python 2,924 311 Updated Sep 5, 2024

QwenLM / Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,013 53 Updated Aug 13, 2024

innnky / MagVITS

VITS with phoneme-level prosody modeling based on MaskGIT

Python 71 7 Updated Aug 31, 2024

aiola-lab / whisper-medusa

Whisper with Medusa heads

Python 762 47 Updated Aug 4, 2024

OpenT2S / LlamaVoice

LlamaVoice is a llama-based large voice generation model, providing inference and training ability.

Python 153 9 Updated Aug 26, 2024

NaruseMioShirakana / DragonianVoice

A C++ inference library for multiple SVC/TTS models

C 980 118 Updated Aug 10, 2024

NVIDIA-AI-IOT / whisper_trt

A project that optimizes Whisper for low latency inference using NVIDIA TensorRT

Python 44 8 Updated Jul 3, 2024

xi-j / Mamba-TasNet

Jupyter Notebook 45 3 Updated Jul 16, 2024

facebookresearch / audioseal

Localized watermarking for AI-generated speech audio, with SOTA robustness and a very fast detector

Python 403 47 Updated Aug 28, 2024

hyama5 / vae_align

Alignment examples for Interspeech 2024

10 Updated Jul 5, 2024

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 4,576 461 Updated Sep 6, 2024

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 3,928 389 Updated Aug 22, 2024

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 5,863 751 Updated Aug 19, 2024

google / speaker-id

This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.

Python 342 40 Updated Sep 5, 2024

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 3,283 272 Updated Sep 5, 2024

segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Python 674 39 Updated Sep 2, 2024

Camb-ai / MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,423 195 Updated Aug 1, 2024

sanchit-gandhi / whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Jupyter Notebook 4,341 366 Updated Apr 3, 2024

tuanh123789 / Train_Hifigan_XTTS

An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS.

Python 53 18 Updated Aug 9, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 30,428 3,305 Updated Sep 4, 2024

zhenye234 / xcodec

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Python 61 3 Updated Sep 2, 2024

huutuongtu / Lightvoc

LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform

Jupyter Notebook 16 2 Updated May 17, 2024

liutaocode / TTS-arxiv-daily

Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (runs every 12 hours)

Python 204 18 Updated Sep 8, 2024

zhenye234 / FlashSpeech

FlashSpeech: Efficient Zero-Shot Speech Synthesis

61 1 Updated Jul 30, 2024

aask1357 / hilcodec

High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec

Jupyter Notebook 62 6 Updated May 23, 2024

modelscope / FunClip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

Python 3,222 336 Updated Aug 22, 2024

jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python 1,473 167 Updated Aug 28, 2024

rhasspy / piper

A fast, local neural text to speech system

C++ 5,714 408 Updated Aug 7, 2024
