Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

Python 41 4 Updated Nov 6, 2024

Awesome music generation model: MG²

Python 86 9 Updated Nov 5, 2024

First base model for full-duplex conversational audio

Python 1,141 73 Updated Nov 6, 2024

Fast and accurate automatic speech recognition (ASR) for edge devices

Python 2,041 84 Updated Nov 5, 2024

The fastest digital human algorithm, now on your desktop.

Python 235 19 Updated Nov 6, 2024

PyTorch implementation of the Differential-Transformer architecture for sequence modeling, specifically tailored as a decoder-only model similar to large language models (LLMs). The architecture in…

Python 27 4 Updated Oct 27, 2024

An ultra-lightweight digital human model that can run in real time on mobile devices

Python 796 134 Updated Nov 4, 2024

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,106 169 Updated Oct 31, 2024

Human Motion Video Generation: A Survey (https://www.techrxiv.org/users/836049/articles/1228135-human-motion-video-generation-a-survey)

87 4 Updated Nov 4, 2024

Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.

Python 1,482 177 Updated Nov 6, 2024
Python 6,651 506 Updated Oct 31, 2024
Python 165 13 Updated Sep 24, 2024

Pseudo Streaming SenseVoice with Hotwords

Python 73 11 Updated Nov 2, 2024

Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation

Python 3,507 492 Updated Nov 6, 2024

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 6,669 773 Updated Nov 5, 2024

[NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment

Python 2,547 178 Updated Nov 1, 2024

An Open-Sourced LLM-empowered Foundation TTS System

Python 415 29 Updated Oct 17, 2024

Open source inference code for Rev's model

Python 325 21 Updated Oct 28, 2024

Implementation of Liquid Nets in Pytorch

Python 51 7 Updated Nov 4, 2024

JoyHallo: Digital human model for Mandarin

Python 275 28 Updated Oct 8, 2024

An open-source SSL certificate management tool that helps you automatically apply for and deploy SSL certificates, and automatically renews them when they are about to expire.

TypeScript 4,627 411 Updated Nov 7, 2024

[INTERSPEECH'24] Official repository for "MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset"

Python 74 8 Updated Nov 5, 2024

The official implementation of RealisDance

C 221 13 Updated Nov 5, 2024
Jupyter Notebook 33 4 Updated Sep 11, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,524 168 Updated Sep 24, 2024

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,059 272 Updated Nov 5, 2024

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 123 12 Updated Oct 12, 2024