AI-Speech
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech); reproduced demo: https://lifeiteng.github.io/valle/index.html
An unofficial PyTorch implementation of the audio LM VALL-E
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
AudioLDM: generate speech, sound effects, music, and beyond from text.
Unofficial Parallel WaveGAN (+ MelGAN, Multi-band MelGAN, HiFi-GAN, and StyleMelGAN) implementation in PyTorch
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Muzic: Music Understanding and Generation with Artificial Intelligence
A collection of neural vocoders suitable for singing voice synthesis tasks.
Unofficial VITS2 TTS implementation in PyTorch
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Instant voice cloning by MIT and MyShell.
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
A fundamental end-to-end speech recognition toolkit with open-source SOTA pretrained models, supporting speech recognition, voice activity detection, text post-processing, and more.
Self-supervised speech representations.
🔊 Text-Prompted Generative Audio Model
Repository for training models for music source separation.
Inference and training library for high-quality TTS models.
Faster Whisper transcription with CTranslate2
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Multilingual Voice Understanding Model