Repository list from Chengbin-Liang's GitHub profile


General Speech Restoration

Python 1,014 131 Updated May 31, 2024

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 35,602 4,185 Updated Aug 19, 2024

Instant voice cloning by MIT and MyShell.

Python 29,047 2,837 Updated Aug 21, 2024

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Jupyter Notebook 479 51 Updated Sep 11, 2023

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,590 755 Updated Feb 11, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 19,664 2,507 Updated Oct 7, 2024

Foundational model for human-like, expressive TTS

Python 3,781 650 Updated Jul 30, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 34,436 4,166 Updated Aug 16, 2024

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 52,334 8,752 Updated Aug 14, 2024

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Python 5,352 552 Updated Sep 29, 2024

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Jupyter Notebook 5,058 1,378 Updated Jun 12, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 69,057 8,125 Updated Sep 30, 2024

Speech recognition theory, papers, and slides (PPT)

581 183 Updated Aug 7, 2024

A Deep-Learning-Based Chinese Speech Recognition System

Python 7,776 1,890 Updated Sep 26, 2024

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 10,823 1,055 Updated Aug 15, 2024

Tracking the progress in end-to-end speech translation

252 27 Updated Oct 25, 2023

List of direct speech-to-speech translation papers.

29 2 Updated Jan 31, 2023

vits2 backbone with multilingual-bert

Python 7,870 1,117 Updated Oct 7, 2024

One minute of voice data is enough to train a good TTS model (few-shot voice cloning)

Python 33,731 3,866 Updated Oct 2, 2024

Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons

Python 1,051 162 Updated Aug 17, 2024

PyTorch Implementation of TranSpeech (ICLR'23): Textless NAR Speech-to-Speech Translation with Bilateral Perturbation

Python 170 23 Updated Jun 20, 2024

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Python 888 66 Updated Aug 24, 2024

🌏 Review notes for Postgraduate Interview of Tsinghua EE. (Sept. 2017)

197 30 Updated Apr 17, 2018

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,451 304 Updated Jan 4, 2024

Efficient neural speech synthesis

C 1,133 295 Updated Sep 21, 2024

Jupyter Notebook 4 1 Updated Nov 18, 2021

Recurrent neural network for audio noise reduction

C 4,024 890 Updated Aug 24, 2024

Conformer-based Metric GAN for speech enhancement

Python 300 60 Updated May 3, 2024

Implementation of paper "DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement"

Python 182 41 Updated Apr 22, 2024