ymzhang0319

🌴

On vacation

Yiming ymzhang0319

PhD candidate jointly at USTC and Shanghai AI Laboratory.

26 followers · 42 following

Stars

160 results for source starred repositories

open-mmlab / Live2Diff

Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.

38 2 Updated Jul 12, 2024

NVlabs / edm

Elucidating the Design Space of Diffusion-Based Generative Models (EDM)

Python 1,200 125 Updated Mar 16, 2024

aik2mlj / polyffusion

Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls

Python 65 7 Updated Jun 15, 2024

UNITES-Lab / MoE-RBench

[ICML 2024 Poster] Code for the paper "MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts"

Python 5 Updated Jul 1, 2024

kwatcharasupat / bandit

BandIt: Cinematic Audio Source Separation

Python 67 3 Updated Jul 5, 2024

open-mmlab / AnyControl

[ECCV 2024] AnyControl, a multi-control image synthesis model that supports any combination of user-provided control signals and generates natural, harmonious results from multiple controls.

Python 73 Updated Jul 5, 2024

apple / ml-mgie

Python 3,804 249 Updated Mar 15, 2024

yichengchen24 / ACP

Official code for paper: Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

Python 17 Updated Jul 1, 2024

JacobChalk / TIM

Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"

Python 25 3 Updated Jul 11, 2024

YuanGongND / cav-mae

Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".

Python 214 20 Updated Mar 20, 2024

open-mmlab / StyleShot

StyleShot: A SnapShot on Any Style. A model that transfers any style onto any content and generates high-quality personalized stylized images without per-image fine-tuning.

Python 81 3 Updated Jul 5, 2024

NVlabs / edm2

Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)

Python 417 14 Updated May 30, 2024

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 29,087 3,363 Updated Jul 13, 2024

open-mmlab / PowerPaint

[ECCV 2024] PowerPaint, a versatile image inpainting model that supports text-guided object inpainting, object removal, image outpainting and shape-guided object inpainting with only a single model…

Python 387 21 Updated Jul 5, 2024

Vill-Lab / 2023-AAAI-SDMIA

code for AAAI accepted paper Similarity Distribution based Membership Inference Attack on Person Re-Identification.

Python 10 1 Updated Nov 6, 2023

open-mmlab / FoleyCrafter

FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝

Python 220 10 Updated Jul 11, 2024

camenduru / FoleyCrafter-jupyter

Jupyter Notebook 8 Updated Jun 28, 2024

leiurayer / downkyi

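downkyi (哔哩下载姬), a video download tool for the Bilibili website. Supports batch downloads, 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.).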

C# 19,934 2,206 Updated Feb 8, 2024

jianzongwu / MotionBooth

The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"

Python 68 4 Updated Jul 9, 2024

donahowe / AutoStudio

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

Jupyter Notebook 327 24 Updated Jul 8, 2024

ToonCrafter / ToonCrafter

a research paper for generative cartoon interpolation

Python 4,820 405 Updated Jun 1, 2024

Chen-and-Sim / ChordNova

ChordNova is a powerful open-source chord progression analysis plus generation software with unprecedentedly detailed control over chord trait parameters, that is way above mainstream softwares. Ru…

C++ 708 80 Updated Jan 10, 2024

Sound2Synth / Sound2Synth

Sound2Synth: Interpreting Sound via FM Synthesizer Parameters Estimation

Python 72 11 Updated Jul 28, 2022

zengyh1900 / handy_voting

handy tools for user study

CSS 19 Updated May 21, 2024

Birch-san / imagebind-guided-diffusion

Guide diffusion on ImageBind embedding similarity

Python 26 1 Updated May 27, 2023

Alpha-VLLM / Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 1,883 80 Updated Jul 11, 2024

ZijiaLewisLu / CVPR2024-FACT

Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentation"

Python 23 3 Updated Jun 24, 2024

jhao104 / proxy_pool

Python ProxyPool for web spider

Python 20,934 5,081 Updated Jun 17, 2024

gligen / GLIGEN

Open-Set Grounded Text-to-Image Generation

Python 1,913 145 Updated Mar 6, 2024

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 64,506 7,518 Updated Jul 2, 2024
