- Shanghai, China
- https://ymzhang0319.github.io/
Stars
A timeline of the latest AI models for audio generation, starting in 2023!
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
A small Python tool for watching and downloading anime from the terminal (the better way to watch anime). Also usable as an API.
This toolbox aims to unify audio generation model evaluation for easier comparison.
Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls
[ICML 2024] Code for the paper "MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts"
[ECCV 2024] AnyControl, a multi-control image synthesis model that supports any combination of user-provided control signals and produces natural, harmonious results from multiple controls.
Official code for paper: Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
StyleShot: A SnapShot on Any Style. A model that transfers any style to any content and generates high-quality, personalized stylized images without per-image fine-tuning.
Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)
One minute of voice data can be enough to train a good TTS model! (few-shot voice cloning)
[ECCV 2024] PowerPaint, a versatile image inpainting model that supports text-guided object inpainting, object removal, image outpainting and shape-guided object inpainting with only a single model…
Code for the AAAI-accepted paper "Similarity Distribution based Membership Inference Attack on Person Re-Identification".
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
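downkyi (哔哩下载姬), a video download tool for the Bilibili website. Supports batch downloading, 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.).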
The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"
AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation
A research paper on generative cartoon interpolation
ChordNova is a powerful open-source program for chord progression analysis and generation, with far more detailed control over chord trait parameters than mainstream software. Ru…
Synthesis of MIDI with DDSP (https://midi-ddsp.github.io/)
Sound2Synth: Interpreting Sound via FM Synthesizer Parameters Estimation