STDiff: Spatio-temporal diffusion for continuous stochastic video prediction

Overview

Installation

Install the custom diffusers library

git clone https://github.com/XiYe20/CustomDiffusers.git
cd CustomDiffusers
pip install -e .

Install the requirements of STDiff

pip install -r requirements.txt
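
After both install steps, a quick import check can confirm that the environment is usable. The sketch below is not part of the repository: `missing_packages` is a hypothetical helper, and `diffusers` and `accelerate` are assumed to be the import names of the custom diffusers library and of the launcher used for training.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; adjust if the installed packages differ.
    missing = missing_packages(["diffusers", "accelerate"])
    print("missing:", missing if missing else "none")
```

An empty result only shows that the packages are importable; it does not verify that the custom diffusers fork, rather than the upstream release, is the one installed.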

Datasets

Processed KTH dataset: https://drive.google.com/file/d/1RbJyGrYdIp4ROy8r0M-lLAbAMxTRQ-sd/view?usp=sharing
SM-MNIST: https://drive.google.com/file/d/1eSpXRojBjvE4WoIgeplUznFyRyI3X64w/view?usp=drive_link

For the other datasets, please download the data from the official websites and arrange it in the folder structures shown below.

BAIR

Please download the original BAIR dataset and use the "/utils/read_BAIR_tfrecords.py" script to convert it into frames, arranged as follows:

/BAIR
     test/
         example_0/
            0000.png
            0001.png
            ...
         example_1/
            0000.png
            0001.png
            ...
         example_...
     train/
         example_0/
            0000.png
            0001.png
            ...
         example_...
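
One way to sanity-check the converted BAIR folder against the layout above is sketched here. The helper names `list_examples` and `frames_are_contiguous` are hypothetical and not part of the repository; the check assumes frames are numbered from 0000.png without gaps, as the tree suggests.

```python
from pathlib import Path


def list_examples(split_dir):
    """Map each example_* folder under split_dir to its sorted PNG frame names."""
    split_dir = Path(split_dir)
    examples = {}
    for folder in sorted(split_dir.glob("example_*")):
        if folder.is_dir():
            examples[folder.name] = sorted(p.name for p in folder.glob("*.png"))
    return examples


def frames_are_contiguous(frame_names):
    """True if the frames are exactly 0000.png, 0001.png, ... with no gaps."""
    expected = [f"{i:04d}.png" for i in range(len(frame_names))]
    return sorted(frame_names) == expected


if __name__ == "__main__":
    # "/BAIR/train" follows the tree above; change it to your dataset location.
    for name, frames in list_examples("/BAIR/train").items():
        if not frames_are_contiguous(frames):
            print("unexpected frame numbering in", name)
```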

Cityscapes

Please download "leftImg8bit_sequence_trainvaltest.zip" from the official website. Center crop and resize all the frames to 128x128, then save them as follows:

/Cityscapes
     test/
         berlin/
            berlin_000000_000000_leftImg8bit.png
            berlin_000000_000001_leftImg8bit.png
            ...
         bielefeld/
            bielefeld_000000_000302_leftImg8bit.png
            bielefeld_000000_000303_leftImg8bit.png
            ...
         ...
     train/
         aachen/
            ....
         bochum/
            ....
         ...
     val/
            ....
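
The center crop and resize step could be sketched as follows. This is an illustration, not the authors' preprocessing code: it assumes the crop is the largest centered square, that bicubic interpolation is acceptable, and that Pillow is installed. Both function names are hypothetical.

```python
def center_crop_box(width, height):
    """Return (left, upper, right, lower) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def center_crop_resize(src_path, dst_path, size=128):
    """Center crop an image to a square, resize it to size x size, and save it.

    Pillow is imported here so that center_crop_box works without it.
    """
    from PIL import Image

    with Image.open(src_path) as img:
        box = center_crop_box(*img.size)
        # Bicubic resampling is an assumption; the README does not name a filter.
        img.crop(box).resize((size, size), Image.BICUBIC).save(dst_path)
```

For a 2048x1024 Cityscapes frame, `center_crop_box(2048, 1024)` selects the middle 1024x1024 region before the resize.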

KITTI

Please download the raw data (synced+rectified) from the official KITTI website. Center crop and resize all the frames to 128x128, then save them as follows:

/KITTI
     2011_09_26_drive_0001_sync/
            0000000000.png
            0000000001.png
            ...
     2011_09_26_drive_0002_sync/
            ...
      ...
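
The raw KITTI download nests frames under date, drive, and camera folders, so the frames have to be moved into the flat per-drive layout above. The sketch below only plans the source-to-destination mapping. It assumes the standard raw layout `<date>/<drive>_sync/<camera>/data/*.png` and uses the left color camera `image_02`, which is an assumption since the README does not say which camera to use. `plan_kitti_outputs` is a hypothetical helper.

```python
from pathlib import Path


def plan_kitti_outputs(raw_root, out_root, camera="image_02"):
    """Pair each raw frame with its destination in the flat per-drive layout."""
    raw_root, out_root = Path(raw_root), Path(out_root)
    pairs = []
    for src in sorted(raw_root.glob(f"*/*_sync/{camera}/data/*.png")):
        # parents[0] is data/, parents[1] is the camera folder,
        # parents[2] is the drive folder, e.g. 2011_09_26_drive_0001_sync.
        drive = src.parents[2].name
        pairs.append((src, out_root / drive / src.name))
    return pairs
```

Each pair can then be passed to whatever crop-and-resize routine is used, after creating the destination folder.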

Training and Evaluation

The STDiff project uses accelerate for training. The training and test configuration files for each dataset are located in stdiff/configs.

Training

Check train_script.sh: set the visible GPUs and num_process, and select the correct training configuration file.
Then start training:

. ./train_script.sh
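
When running one process per GPU, which is a common setup for multi-GPU training with accelerate, num_process would normally equal the number of visible GPUs. The sketch below counts the device ids in a `CUDA_VISIBLE_DEVICES`-style string. It assumes the script selects GPUs through that environment variable; `visible_gpu_count` is a hypothetical helper, not part of the repository.

```python
import os


def visible_gpu_count(value=None):
    """Count device ids in a CUDA_VISIBLE_DEVICES-style string such as "0,1,2".

    If value is None, the current environment variable is read. An unset or
    empty variable yields 0 here, even though CUDA then exposes all GPUs.
    """
    if value is None:
        value = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return len([device for device in value.split(",") if device.strip()])


if __name__ == "__main__":
    print("processes to launch:", visible_gpu_count())
```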

Test

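Check test_script.sh and select the correct test configuration file. The repository file listing shows this script as test_scipt.sh, so use that name if test_script.sh is not found.
Then run the test: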

. ./test_script.sh

Citation

@inproceedings{ye2024stdiff,
  title={STDiff: Spatio-Temporal Diffusion for Continuous Stochastic Video Prediction},
  author={Ye, Xi and Bilodeau, Guillaume-Alexandre},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={7},
  pages={6666--6674},
  year={2024}
}

Uncurated prediction examples of STDiff for multiple datasets.

The temporal coordinate is shown in the top-left corner of each frame. Frames with red temporal coordinates are future frames predicted by our model.
