Stars
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Text to speech alignment using CTC forced alignment
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A file server that supports static serving, uploading, searching, access control, webdav...
Text-to-Music Generation with Rectified Flow Transformers
VITS with phoneme-level prosody modeling based on MaskGIT
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd. Provides C++ & Python APIs
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
Official Pytorch Implementation of Our CVPR2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization"
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
TorchCFM: a Conditional Flow Matching library
Training code for FAcodec presented in NaturalSpeech3
bdashore3 / flash-attention
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Inference code for Audiodec-Valle-Wenetspeech4TTS
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)
Multilingual Voice Understanding Model
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Fast inference engine for Transformer models
mobiusml / faster-whisper
Forked from SYSTRAN/faster-whisper
Faster Whisper ASR transcription with CTranslate2
Faster Whisper transcription with CTranslate2
Fast and memory-efficient exact attention
Foundational Models for State-of-the-Art Speech and Text Translation