MiyazonoKaori137

Stars

AI-Speech

28 repositories

A PyTorch-based Speech Toolkit

Python 8,705 1,377 Updated Oct 11, 2024

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 1,136 96 Updated Oct 11, 2024

Text-to-Audio/Music Generation

Python 2,263 177 Updated Sep 29, 2024

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Python 2,015 319 Updated Nov 14, 2023

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,944 417 Updated May 10, 2023

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Python 1,271 99 Updated Sep 24, 2023

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python 2,406 221 Updated Jun 2, 2024

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

Jupyter Notebook 1,547 340 Updated Apr 22, 2024

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Python 1,924 506 Updated Jul 27, 2024

End-to-End Speech Processing Toolkit

Python 8,378 2,172 Updated Oct 10, 2024

Muzic: Music Understanding and Generation with Artificial Intelligence

Python 4,495 439 Updated Oct 12, 2024

A collection of neural vocoders suitable for singing voice synthesis tasks.

Python 94 9 Updated Sep 10, 2024
Python 407 59 Updated Oct 10, 2024

unofficial vits2-TTS implementation in pytorch

Python 478 86 Updated Mar 28, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,836 401 Updated Aug 10, 2024

Instant voice cloning by MIT and MyShell.

Python 29,116 2,851 Updated Aug 21, 2024

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

Python 4,597 586 Updated Aug 9, 2024

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 596 42 Updated Sep 9, 2024

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 6,278 672 Updated Oct 11, 2024

speech self-supervised representations

Python 462 36 Updated Apr 27, 2023

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 35,658 4,192 Updated Aug 19, 2024

Repository for training models for music source separation.

Python 418 55 Updated Oct 10, 2024

Inference and training library for high-quality TTS models.

Python 4,343 440 Updated Sep 23, 2024

Port of OpenAI's Whisper model in C/C++

C 34,969 3,565 Updated Oct 8, 2024

Faster Whisper transcription with CTranslate2

Python 11,840 988 Updated Aug 21, 2024

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Python 352 39 Updated Sep 13, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 5,456 561 Updated Sep 29, 2024

Multilingual Voice Understanding Model

Python 2,916 272 Updated Sep 25, 2024