
Video-Infinity


arXiv: 2406.16260

Video-Infinity: Distributed Long Video Generation
Zhenxiong Tan, Xingyi Yang, Songhua Liu, and Xinchao Wang
Learning and Vision Lab, National University of Singapore

TL;DR (Too Long; Didn't Read)

Video-Infinity generates long videos quickly using multiple GPUs without extra training. Feel free to visit our project page for more information and generated videos.

Features

  • Distributed 🌐: Utilizes multiple GPUs to generate long-form videos.
  • High-Speed 🚀: Produces 2,300 frames in just 5 minutes.
  • Training-Free 🎓: Generates long videos without requiring additional training for existing models.

Setup

Environment Installation

conda create -n video_infinity_vc2 python=3.10
conda activate video_infinity_vc2
pip install -r requirements.txt

Usage

Quick Start

  • Basic usage:
    python inference.py --config examples/config.json
  • Multi-prompts:
    python inference.py --config examples/multi_prompts.json
  • Single GPU:
    python inference.py --config examples/single_gpu.json

Config

Basic Config

Parameter    Description
devices      The list of GPU devices to use.
base_path    The path where generated videos are saved.
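For illustration, these fields might sit at the top level of the config JSON, as in this minimal sketch (the exact layout is an assumption; see examples/config.json in the repo for the authoritative format):

{
  "devices": [0, 1, 2, 3],
  "base_path": "./outputs"
}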

Pipeline Config

Parameter    Description
prompts      The list of text prompts. Note: the number of prompts should be greater than the number of GPUs.
file_name    The name of the generated video.
num_frames   The number of frames to generate on each GPU.
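A hedged sketch of the pipeline fields, using the names from the table above (the grouping and any wrapper keys are assumptions; examples/multi_prompts.json shows the real structure):

{
  "prompts": [
    "A timelapse of a blooming flower",
    "A drone shot over a mountain range",
    "A city skyline transitioning from day to night",
    "Waves crashing on a rocky shore",
    "A campfire burning under a starry sky"
  ],
  "file_name": "long_video.mp4",
  "num_frames": 24
}

Per the note above, keep the prompts list longer than the devices list; the five prompts here would suit the four-GPU basic config sketched earlier.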

Video-Infinity Config

Parameter               Description
*.padding               The number of local context frames.
attn.topk               The number of global context frames for the attention module.
attn.local_phase        When the denoising timestep is less than t, the attention is biased: a local_bias is added to the local context frames and a global_bias to the global context frames.
attn.global_phase       Similar to local_phase, but the bias is applied when the denoising timestep is greater than t.
attn.token_num_scale    If true, the attention scale factor is rescaled by the number of tokens. Default is false. More details can be found in this paper.
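The descriptions above suggest a nested structure along these lines; the key names inside local_phase and global_phase (t, local_bias, global_bias) are inferred from the table, and the *.padding wildcard suggests padding can be set per module, so check the example configs before relying on this sketch:

{
  "attn": {
    "padding": 6,
    "topk": 12,
    "local_phase": { "t": 400, "local_bias": 1.0, "global_bias": 0.0 },
    "global_phase": { "t": 600, "local_bias": 0.0, "global_bias": 1.0 },
    "token_num_scale": false
  }
}

Here padding + topk = 18, which stays under the limit of 24 recommended in the next section.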

How to Set Config

  • To avoid losing high-frequency information, we recommend keeping the sum of padding and attn.topk below 24 (close to the default number of frames in the VideoCrafter2 model).
    • If you want a larger padding or attn.topk, set attn.token_num_scale to true (see the fragment after this list).
  • Higher local_phase.t and global_phase.t values yield more stable videos but may reduce their diversity.
  • More padding provides more local context.
  • A higher attn.topk improves the overall stability of the videos.
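As a worked instance of the first rule, a configuration with padding 8 and attn.topk 24 sums to 32, which exceeds 24, so token_num_scale should be enabled (same assumed layout as the sketch above):

{
  "attn": {
    "padding": 8,
    "topk": 24,
    "token_num_scale": true
  }
}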

Citation

@article{tan2024videoinf,
  title={Video-Infinity: Distributed Long Video Generation},
  author={Zhenxiong Tan and Xingyi Yang and Songhua Liu and Xinchao Wang},
  journal={arXiv preprint arXiv:2406.16260},
  year={2024}
}

Acknowledgements

Our project is based on the VideoCrafter2 model. We would like to thank the authors for their excellent work! ❤️
