The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 6,723 861 Updated Aug 27, 2024

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 649 29 Updated Aug 20, 2024

Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 6,069 551 Updated Aug 28, 2024

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Python 2,195 479 Updated Aug 28, 2024

Python logging made (stupidly) simple

Python 19,305 689 Updated Aug 2, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 10,059 755 Updated Aug 21, 2024

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Python 1,048 406 Updated Jul 25, 2024

Implementation of the proposed Adam-atan2 from Google DeepMind in PyTorch

Python 85 1 Updated Aug 26, 2024

Audio Codec Speech processing Universal PERformance Benchmark

Python 199 22 Updated Jun 19, 2024

SpeechFlow neural network implementation

Jupyter Notebook 15 Updated Aug 8, 2024

Supervoice diffusion enhance

Jupyter Notebook 24 Updated Jul 15, 2024

My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one

Jupyter Notebook 27 2 Updated Aug 5, 2024

👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing

Python 905 76 Updated Aug 28, 2024

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Python 266 18 Updated Apr 9, 2024
Python 38 2 Updated Jul 11, 2024

DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability

Python 77 6 Updated Jul 10, 2024

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch

Python 214 21 Updated Aug 27, 2024

VALL-E 2 reproduction

Jupyter Notebook 68 11 Updated Jul 14, 2024

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 7,948 1,437 Updated Aug 28, 2024

A Very Low-Bitrate Codec for Speech Compression

C++ 3,812 355 Updated Aug 20, 2024

Perceptual Quality Estimator for speech and audio

C++ 668 122 Updated Aug 2, 2024

PyTorch implementation of the PEER block from the paper "Mixture of A Million Experts" by Xu Owen He at DeepMind

Python 104 1 Updated Aug 23, 2024
Python 148 24 Updated May 24, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 4,254 411 Updated Aug 28, 2024

Bring portraits to life!

Python 10,888 1,089 Updated Aug 26, 2024

Tools for handling speech data in machine learning projects.

Python 922 210 Updated Aug 22, 2024
Python 876 285 Updated Aug 28, 2024

This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf

Python 339 52 Updated Apr 21, 2022

Unified Speech Language Model for the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)

Python 124 11 Updated Sep 14, 2023

High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance

Python 1,459 114 Updated Jul 17, 2024