Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,378 166 Updated Oct 18, 2024

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,039 269 Updated Oct 16, 2024

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Jupyter Notebook 491 54 Updated Sep 11, 2023

Instant voice cloning by MIT and MyShell.

Python 29,605 2,906 Updated Aug 21, 2024

Inference and training library for high-quality TTS models.

Python 4,549 459 Updated Oct 30, 2024
Jupyter Notebook 7,646 537 Updated Jun 16, 2024

AI-powered speech denoising and enhancement. Adapted for Windows and optimized

Python 60 6 Updated Jul 12, 2024

Noise suppression using deep filtering

Python 2,513 231 Updated Oct 17, 2024

Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

Jupyter Notebook 977 38 Updated May 13, 2024

Upsampling Artifacts in Neural Audio Synthesis – https://arxiv.org/abs/2010.14356

Jupyter Notebook 76 4 Updated Feb 9, 2021

End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions

Python 86 6 Updated Nov 6, 2023

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

Python 203 19 Updated Jun 10, 2024

The official implementation of HierSpeech++

Python 1,178 134 Updated Feb 20, 2024

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

Python 1,178 72 Updated Jul 16, 2024

AI-powered speech denoising and enhancement

Python 1,390 138 Updated Jun 21, 2024

Faster Tortoise inference than Tortoise Fast Fork

Jupyter Notebook 122 9 Updated Apr 21, 2024

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Python 2,172 143 Updated Nov 1, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,909 410 Updated Aug 10, 2024

Flutter tools for Tizen

Dart 460 66 Updated Nov 1, 2024

A rough and ready Python utility which splits audio files based on silence and desired min/max chunk duration.

Python 15 5 Updated Jun 22, 2022

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Python 1,146 111 Updated May 10, 2024

Public voice datasets used for our Text-to-Speech voices.

29 4 Updated Jul 31, 2024

TorToiSe fine-tuning with DLAS

Python 215 107 Updated Aug 1, 2024

Adaptive Vocoder for Custom Voice

Python 58 10 Updated Sep 22, 2022

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

7,487 908 Updated Aug 21, 2024

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 13,151 1,817 Updated Aug 19, 2024

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Python 6,585 971 Updated Aug 5, 2024

🔥 2D and 3D face alignment library built using PyTorch

Python 7,069 1,348 Updated Aug 30, 2024

The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."

Python 985 174 Updated Sep 25, 2023