Ji4chenLi/t2v-turbo

T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback

🔔 News

[06.24.2024] Released the training code for T2V-Turbo (VC2).

Fast and High-Quality Text-to-video Generation 🚀

4-Step Results

With the style of low-poly game art, A majestic, white horse gallops gracefully across a moonlit beach.
medium shot of Christine, a beautiful 25-year-old brunette resembling Selena Gomez, anxiously looking up as she walks down a New York street, cinematic style
a cartoon pig playing his guitar, Andrew Warhol style
a dog wearing vr goggles on a boat
Pikachu snowboarding
a girl floating underwater

8-Step Results

Mickey Mouse is dancing on white background
light wind, feathers moving, she moves her gaze, 4k
fashion portrait shoot of a girl in colorful glasses, a breeze moves her hair
With the style of abstract cubism, The flowers swayed in the gentle breeze, releasing their sweet fragrance.
impressionist style, a yellow rubber duck floating on the wave on the sunset
A Egyptian tomp hieroglyphics painting of A regal lion, decked out in a jeweled crown, surveys his kingdom.

🏭 Installation

pip install accelerate transformers diffusers webdataset loralib peft pytorch_lightning open_clip_torch hpsv2 image-reward wandb av einops packaging omegaconf opencv-python kornia moviepy imageio

pip install flash-attn --no-build-isolation
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
pip install csrc/fused_dense_lib csrc/layer_norm

pip install git+https://github.com/iejMac/video2dataset.git

conda install xformers
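
After running the commands above, a quick import check can confirm that the environment is complete. The sketch below is not part of the repository. It assumes the usual import names for each package (a few differ from the pip name, as noted in the comments), and the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Import names for the packages installed above. Some differ from the pip
# name: open_clip_torch -> open_clip, image-reward -> ImageReward,
# opencv-python -> cv2, flash-attn -> flash_attn.
REQUIRED_MODULES = [
    "accelerate", "transformers", "diffusers", "webdataset", "loralib",
    "peft", "pytorch_lightning", "open_clip", "hpsv2", "ImageReward",
    "wandb", "av", "einops", "packaging", "omegaconf", "cv2", "kornia",
    "moviepy", "imageio", "flash_attn", "video2dataset", "xformers",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a module without importing it, so the check stays fast and does not initialize any GPU libraries.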

🛞 Model Checkpoints

| Model | Resolution | Checkpoints |
| T2V-Turbo (VC2) | 320x512 | HuggingFace |
| T2V-Turbo (MS) | 256x256 | HuggingFace |

🚀 Inference

We provide local demo code built on gradio. macOS users need to set device="mps" in app.py; Intel GPU users need to set device="xpu" in app.py.
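
To decide which device string to use, something like the following may help. It is a minimal sketch, not code from app.py: the function names are hypothetical, the preference order (CUDA, then MPS, then XPU) is an assumption, and the CPU fallback is only there so the probe always returns a value. Whether the demo is practical on CPU is not stated here.

```python
def pick_device(cuda=False, mps=False, xpu=False):
    """Choose a device string, preferring CUDA, then MPS, then XPU."""
    if cuda:
        return "cuda"
    if mps:
        return "mps"
    if xpu:
        return "xpu"
    return "cpu"


def detect_device():
    """Probe PyTorch for available backends; return "cpu" if none is found."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds may lack these attributes, so look them up safely.
    mps_backend = getattr(torch.backends, "mps", None)
    xpu_module = getattr(torch, "xpu", None)
    return pick_device(
        cuda=torch.cuda.is_available(),
        mps=mps_backend is not None and mps_backend.is_available(),
        xpu=xpu_module is not None and xpu_module.is_available(),
    )


if __name__ == "__main__":
    print("suggested device:", detect_device())
```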

To play with our T2V-Turbo (VC2), please follow the steps below:

  1. Download the unet_lora.pt of our T2V-Turbo (VC2) here.

  2. Download the model checkpoint of VideoCrafter2 here.

  3. Launch the gradio demo with the following command:

pip install gradio==3.48.0
python app.py --unet_dir PATH_TO_UNET_LORA.pt --base_model_dir PATH_TO_VideoCrafter2_MODEL_CKPT

To play with our T2V-Turbo (MS), please follow the steps below:

  1. Download the unet_lora.pt of our T2V-Turbo (MS) here.

  2. Launch the gradio demo with the following command:

pip install gradio==3.48.0
python app_ms.py --unet_dir PATH_TO_UNET_LORA.pt
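
The two launch commands above differ only in the script name and the extra --base_model_dir flag. The sketch below assembles either one and checks that the checkpoint files exist before launching. The helper names are hypothetical; only the script names and flags come from the commands shown above.

```python
from pathlib import Path


def build_demo_command(unet_dir, base_model_dir=None):
    """Return the argv list for the VC2 demo (app.py) or, when no base
    model checkpoint is given, the MS demo (app_ms.py)."""
    if base_model_dir is None:
        return ["python", "app_ms.py", "--unet_dir", str(unet_dir)]
    return [
        "python", "app.py",
        "--unet_dir", str(unet_dir),
        "--base_model_dir", str(base_model_dir),
    ]


def missing_files(*paths):
    """Return the given paths (ignoring None) that are not existing files."""
    return [str(p) for p in paths if p is not None and not Path(p).is_file()]
```

A typical use would be to call `missing_files(unet_dir, base_model_dir)` first and, if it returns an empty list, pass the result of `build_demo_command(...)` to `subprocess.run(..., check=True)` from the repository root.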

🏋️ Training

To train T2V-Turbo (VC2), first prepare the data and models as follows:

  1. Download the model checkpoint of VideoCrafter2 here.
  2. Prepare the WebVid-10M data and save it in the webdataset format.
  3. Download the InternVideo2 Stage 2 (S2) model.
  4. Set --pretrained_model_path, --train_shards_path_or_url and --video_rm_ckpt_dir accordingly in train_t2v_turbo_vc2.sh.
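
For readers unfamiliar with the webdataset format mentioned in step 2: a shard is a plain tar archive in which files sharing a basename form one sample. The sketch below writes such a shard with the standard library only. It is an illustration under assumptions: the `mp4` and `txt` extensions and the six-digit keys are hypothetical choices, and the keys the training script actually expects should be checked against its data loader.

```python
import io
import tarfile


def write_shard(shard_path, samples):
    """Write (video_bytes, caption) pairs into one webdataset-style tar shard.

    Files that share a basename (e.g. 000000.mp4 and 000000.txt) are grouped
    into a single sample when the shard is read back by webdataset.
    """
    with tarfile.open(shard_path, "w") as tar:
        for index, (video_bytes, caption) in enumerate(samples):
            key = f"{index:06d}"
            members = (("mp4", video_bytes), ("txt", caption.encode("utf-8")))
            for suffix, payload in members:
                info = tarfile.TarInfo(name=f"{key}.{suffix}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))


def list_members(shard_path):
    """Return the member names of a shard, in archive order."""
    with tarfile.open(shard_path, "r") as tar:
        return tar.getnames()
```

For a dataset the size of WebVid-10M, a dedicated tool such as video2dataset (installed above) is likely the more practical route; the sketch only shows what the resulting shards look like.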

Then run the following command:

bash train_t2v_turbo_vc2.sh

To train T2V-Turbo (MS), run the following command:

bash train_t2v_turbo_ms.sh