marac519

1 follower · 2 following

Stars

gpt-omni / mini-omni2

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,378 166 Updated Oct 18, 2024

gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking. Features real-time end-to-end speech input and streaming audio output for conversation.

Python 3,039 269 Updated Oct 16, 2024

daniilrobnikov / vits2

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Jupyter Notebook 491 54 Updated Sep 11, 2023

myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell.

Python 29,605 2,906 Updated Aug 21, 2024

huggingface / parler-tts

Inference and training library for high-quality TTS models.

Python 4,549 459 Updated Oct 30, 2024

Vaibhavs10 / insanely-fast-whisper

Jupyter Notebook 7,646 537 Updated Jun 16, 2024

daswer123 / resemble-enhance-windows

Forked from resemble-ai/resemble-enhance

AI powered speech denoising and enhancement. Adapted for Windows and optimized.

Python 60 6 Updated Jul 12, 2024

Rikorose / DeepFilterNet

Noise suppression using deep filtering

Python 2,513 231 Updated Oct 17, 2024

XiongjieDai / GPU-Benchmarks-on-LLM-Inference

Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

Jupyter Notebook 977 38 Updated May 13, 2024

DolbyLaboratories / neural-upsampling-artifacts-audio

Upsampling Artifacts in Neural Audio Synthesis – https://arxiv.org/abs/2010.14356

Jupyter Notebook 76 4 Updated Feb 9, 2021

wonjune-kang / lvc-vc

End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions

Python 86 6 Updated Nov 6, 2023

IIEleven11 / StyleTTS2FineTune

Python 173 33 Updated Oct 2, 2024

davidmartinrius / speech-dataset-generator

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

Python 203 19 Updated Jun 10, 2024

sh-lee-prml / HierSpeechpp

The official implementation of HierSpeech++

Python 1,178 134 Updated Feb 20, 2024

chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

Python 1,178 72 Updated Jul 16, 2024

resemble-ai / resemble-enhance

AI powered speech denoising and enhancement

Python 1,390 138 Updated Jun 21, 2024

manmay-nakhashi / tortoise-tts-fastest

Forked from 152334H/tortoise-tts-fast

Faster Tortoise inference than Tortoise Fast Fork

Jupyter Notebook 122 9 Updated Apr 21, 2024

predibase / lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Python 2,172 143 Updated Nov 1, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,909 410 Updated Aug 10, 2024

flutter-tizen / flutter-tizen

Flutter tools for Tizen

Dart 460 66 Updated Nov 1, 2024

gkrsv / split_audio

A rough and ready Python utility which splits audio files based on silence and desired min/max chunk duration.

Python 15 5 Updated Jun 22, 2022

haoheliu / versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Python 1,146 111 Updated May 10, 2024

NabuCasa / voice-datasets

Public voice datasets used for our Text-to-Speech voices.

29 4 Updated Jul 31, 2024

152334H / DL-Art-School

Forked from neonbjb/DL-Art-School

TorToiSe fine-tuning with DLAS

Python 215 107 Updated Aug 1, 2024
