Stars
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model that approaches GPT-4o performance.
📹 A more flexible CogVideoX that can generate videos at any resolution and creates videos from images.
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
Code for Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation".
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Writing AI Conference Papers: A Handbook for Beginners
[AAAI2022] Code Release of Attacking Video Recognition Models with Bullet-Screen Comments
[ACM MM2023] Code Release of GCMA: Generative Cross-Modal Transferable Adversarial Attacks from Images to Videos
A prize-winning solution for the Multimedia Deepfake Detection competition.
Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
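A weather app for Windows 11 and later.
An automated renaming tool for TV series and anime, with one-click batch renaming. It can work with QBittorrent to rename files automatically after download, which makes automatic metadata scraping in Emby easier. Runs on Windows, Linux, MacOS, Docker, and as a Synology package.
😊 A tool that automatically identifies anime titles and episodes, renames the files, and organizes them. It works with QBittorrent to provide a fully automated pipeline from RSS subscription download to metadata scraping.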
[ACM MM 2024] ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack
[ECCV 2022] MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes official implementation
[CVPR 2023] MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection
[NeurIPS 2024] Lumen: a Large multimodal model with versatile vision-centric capabilities
[CVPR 2024] Official implementation of CVPR 2024 paper: "Doubly Abductive Counterfactual Inference for Text-based Image Editing"
[MM24 Oral] Identity-Driven Multimedia Forgery Detection via Reference Assistance
A lightweight tool for smoothing mouse scrolling and setting the scroll direction independently on macOS, so your scroll wheel feels as smooth as a trackpad.
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
A small project that provides Strm direct-link playback for Emby and Jellyfin servers; recommended for use with MediaWarp.
The official repository of ECCV 2024 paper "Outlier-Aware Test-time Adaptation with Stable Memory Replay"
one for all, Optimal generator with No Exception
Stable Video Diffusion Training Code and Extensions.