
Flow Matching (Rectified Flow) hand-built from scratch

Python 55 2 Updated Sep 10, 2024

The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) l…

HTML 475 145 Updated Jul 1, 2024

PyTorch implementation of Glow

Python 508 97 Updated Nov 20, 2021

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Python 73 3 Updated Sep 2, 2024

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 610 76 Updated Sep 2, 2024

Command line utility for forced alignment using Kaldi

Python 1,293 243 Updated Jul 16, 2024

Audio Codec Benchmark

Python 3 Updated Jun 11, 2024

Fake speech detection with the CodecFake dataset

Python 4 Updated Jul 27, 2024

SincNet is a neural architecture for efficiently processing raw audio samples.

Python 1,123 261 Updated Apr 28, 2021

A list of tools, papers and code related to Fake Audio Detection.

14 Updated Mar 20, 2024

Fake Lossless Audio Detector

Python 35 1 Updated Dec 7, 2022

This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++. More models may be supported in the future. At the …

Python 750 119 Updated Sep 9, 2024

Implementation of the paper "Improved DeepFake Detection Using Whisper Features"

Python 81 5 Updated Apr 28, 2024

This repository includes the code to reproduce our paper "End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection" (https://arxiv.o…

Python 70 18 Updated Sep 17, 2023
Python 8 Updated Aug 23, 2023

Research progress on speech deepfake detection: Relevant datasets aggregated from the review literature and publicly available codes

82 8 Updated Jul 24, 2023

[InterSpeech'2023] "Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion"

Python 4 Updated Jan 25, 2024

PyTorch structural similarity (SSIM) loss

Python 1,871 364 Updated Feb 22, 2024

Vector Quantized VAEs - PyTorch Implementation

Python 824 135 Updated Jul 12, 2023

Vector (and Scalar) Quantization, in PyTorch

Python 2,396 196 Updated Sep 4, 2024

Brand new TTS solution

Python 10,028 783 Updated Sep 13, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 4,736 477 Updated Sep 6, 2024

Singing voice conversion based on Whisper, with LoRA for singing voice cloning

Python 616 77 Updated Nov 3, 2023

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 32,494 3,743 Updated Sep 13, 2024

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch

Python 1,164 76 Updated Sep 14, 2024

KAN-TTS is a speech-synthesis training framework, please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Python 483 78 Updated Dec 28, 2023

Speech Recognition using DeepSpeech2.

Python 2,099 621 Updated Dec 13, 2022

List of speech synthesis papers.

989 120 Updated Jul 24, 2023