Organizations

@HGU-DLLAB @Handong-Global-Univ


Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

Python 46 4 Updated Nov 11, 2024

Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Python 286 21 Updated Nov 15, 2024

State-of-the-art zero-shot voice conversion and singing voice conversion with in-context learning

Python 608 66 Updated Nov 15, 2024

ICASSP 2023 Accepted

Python 190 14 Updated May 6, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 12,352 1,135 Updated Oct 14, 2024

Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

Jupyter Notebook 1,102 103 Updated Nov 3, 2024

goyulmusicacademy

HTML 1 Updated Aug 9, 2024

A collection of neural vocoders suitable for singing voice synthesis tasks.

Python 101 9 Updated Sep 10, 2024

ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec

Python 196 7 Updated Sep 3, 2024

A summary of related work on flow matching and stochastic interpolants

335 10 Updated Jul 29, 2024

UNet diffusion model in pure CUDA

Cuda 584 28 Updated Jun 28, 2024

Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer

HTML 30 1 Updated Aug 21, 2024

Codebase for benchmarking several open-sourced SpeechLLM models

4 Updated Jun 2, 2024

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

114 2 Updated Jun 13, 2024

The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning".

Python 71 3 Updated Sep 2, 2024

PyTorch implementation of normalizing flow models

Python 718 108 Updated Aug 25, 2024

High-Fidelity Neural Phonetic Posteriorgrams

Python 95 6 Updated Nov 6, 2024

An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion)

Python 111 7 Updated Jul 30, 2024

A comprehensive collection of KAN (Kolmogorov-Arnold Network)-related resources, including libraries, projects, tutorials, papers, and more, for researchers and developers in the Kolmogorov-Arnold N…

2,562 234 Updated Nov 6, 2024

Kolmogorov Arnold Networks

Jupyter Notebook 15,059 1,391 Updated Nov 14, 2024
Python 98 23 Updated Sep 2, 2021

An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).

Python 4,083 359 Updated Aug 1, 2024
Python 51 3 Updated Oct 19, 2024

Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in PyTorch

Python 252 5 Updated Aug 24, 2024

Full models and training code for PESTO

Python 51 14 Updated Jun 12, 2024

TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models (2024 ICASSP)

Python 140 5 Updated Aug 29, 2024

The original sources of MS-DOS 1.25, 2.0, and 4.0 for reference purposes

Assembly 30,754 4,388 Updated Apr 25, 2024

The Emotional Voices Database: Towards Controlling the Emotional Expressiveness in Voice Generation Systems

Python 254 19 Updated Oct 10, 2023

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

Python 628 42 Updated Oct 1, 2024