- Amazon Alexa AI
- San Jose
Stars
I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models
[CVPR2024] VideoBooth: Diffusion-based Video Generation with Image Prompts
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Experiment on combining CLIP with SAM to do open-vocabulary image segmentation.
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (TMLR 2024)
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
InstantID : Zero-shot Identity-Preserving Generation in Seconds 🔥
Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
CoTracker is a model for tracking any point (pixel) on a video.
Mora: More like Sora for Generalist Video Generation
Open-Sora: Democratizing Efficient Video Production for All
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Official Code for MotionCtrl [SIGGRAPH 2024]
Official implementation of ICCV2023 VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation
[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose : Pose-Guided Text-to-Video Generation using Pose-Free Videos"
Nightly release of ControlNet 1.1
Official codebase for the Paper “Retrieval-Augmented Diffusion Models”