Live2Diff: a pipeline that processes live video streams with a uni-directional video diffusion model.

open-mmlab/Live2Diff


Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models

Authors: Zhening Xing, Gereon Fox, Yanhong Zeng, Xingang Pan, Mohamed Elgharib, Christian Theobalt, Kai Chen † (†: corresponding author)

arXiv Project Page

Code will be released in one week. Stay tuned!

Key Features

  • Uni-directional Temporal Attention with Warmup Mechanism
  • Multitimestep KV-Cache for Temporal Attention during Inference
  • Depth Prior for Better Structure Consistency
  • Compatible with DreamBooth and LoRA for Various Styles
  • TensorRT Supported
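To make the first feature more concrete, here is a minimal sketch of what a uni-directional temporal attention mask with warmup frames could look like. It is an illustration under stated assumptions, not the project's implementation: the function name `build_temporal_mask` and the parameters `warmup` and `window` are hypothetical, and the sketch assumes that warmup frames attend to each other in both directions while every later frame attends only to the warmup frames and to a limited window of past frames, never to future ones.

```python
def build_temporal_mask(num_frames, warmup, window):
    """Return mask[q][k] == True if query frame q may attend to key frame k.

    Hypothetical sketch (assumed behavior, not the repository's code):
    - the first `warmup` frames attend to each other bidirectionally;
    - every later frame attends to the warmup frames plus the most recent
      `window` frames (itself included), and never to future frames.
    """
    mask = [[False] * num_frames for _ in range(num_frames)]
    for q in range(num_frames):
        for k in range(num_frames):
            if q < warmup:
                # Warmup frames only see the other warmup frames.
                mask[q][k] = k < warmup
            else:
                in_warmup = k < warmup
                in_window = q - window < k <= q  # past frames only
                mask[q][k] = in_warmup or in_window
    return mask


if __name__ == "__main__":
    # Print the mask as a grid: "x" = may attend, "." = masked out.
    for row in build_temporal_mask(num_frames=6, warmup=2, window=2):
        print(" ".join("x" if allowed else "." for allowed in row))
```

Because no frame after the warmup ever looks at a future frame, the keys and values of past frames never need to be recomputed when a new frame arrives, which is the property a KV-cache for streaming inference relies on.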

The speed evaluation was conducted on Ubuntu 20.04.6 LTS with PyTorch 2.2.2, an RTX 4090 GPU, and an Intel(R) Xeon(R) Platinum 8352V CPU. The number of denoising steps is set to 2.

| Resolution | TensorRT | FPS   |
|------------|----------|-------|
| 512 x 512  | On       | 16.43 |
| 512 x 512  | Off      | 6.91  |
| 768 x 512  | On       | 12.15 |
| 768 x 512  | Off      | 6.29  |
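The table can be turned into a TensorRT speedup factor and a per-frame latency with a few lines of arithmetic. The sketch below uses only the FPS values reported above; the helper names `speedup` and `latency_ms` are illustrative and not part of the project.

```python
# FPS values copied from the table above: (resolution, TensorRT on, TensorRT off)
results = [
    ("512 x 512", 16.43, 6.91),
    ("768 x 512", 12.15, 6.29),
]


def speedup(fps_on, fps_off):
    """Ratio of throughput with TensorRT to throughput without it."""
    return fps_on / fps_off


def latency_ms(fps):
    """Average time per frame in milliseconds implied by a frame rate."""
    return 1000.0 / fps


for resolution, fps_on, fps_off in results:
    print(
        f"{resolution}: {speedup(fps_on, fps_off):.2f}x speedup, "
        f"{latency_ms(fps_on):.1f} ms/frame with TensorRT, "
        f"{latency_ms(fps_off):.1f} ms/frame without"
    )
```

These figures apply only to the hardware and the two-step denoising setting described above.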

Real-Time Video2Video Demo

Human Face (Web Camera Input): online-demo.mp4

Anime Character (Screen Video Input): arknight-old-woman-v3.mp4

Acknowledgements

The video and image demos in this GitHub repository were generated using LCM-LoRA. Stream batch from StreamDiffusion is used for model acceleration. The design of the video diffusion model is adopted from AnimateDiff. We use a third-party implementation of MiDaS that supports ONNX export. Our online demo is modified from Real-Time-Latent-Consistency-Model.

BibTeX

If you find it helpful, please consider citing our work:

@article{xing2024live2diff,
  title={Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models},
  author={Zhening Xing and Gereon Fox and Yanhong Zeng and Xingang Pan and Mohamed Elgharib and Christian Theobalt and Kai Chen},
  journal={arXiv preprint arXiv:2407.08701},
  year={2024}
}