Stars
Vector (and Scalar) Quantization, in PyTorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
Notebooks for the Practicals at the Deep Learning Indaba 2024.
lina-speech: linear-attention-based text-to-speech
State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
This is a simple ComfyUI custom TTS node based on Parler_tts.
A list of scripts/notebooks I'd like to keep handy
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
asyncio (PEP 3156) Redis support
Deep Learning Audio Course, 2023
Speech To Speech: an effort for an open-sourced and modular GPT-4o
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
🦝 OpenAPI plugin for generating API reference docs in Docusaurus v3.
Easy to maintain open source documentation websites.
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3