(Teaser video: teaser.mp4)
Our code is developed and validated with Python 3.9.16, PyTorch 1.13.1, and CUDA 11.7.
conda create -n hashing-nvd python=3.9
conda activate hashing-nvd
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
conda install matplotlib tensorboard scipy scikit-image tqdm
pip install opencv-python imageio-ffmpeg gdown
CC=gcc-9 CXX=g++-9 python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
pip install easydict
CC=gcc-9 CXX=g++-9 pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
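After installation, you can sanity-check the environment with a quick import test (a minimal check, assuming the tiny-cuda-nn bindings install under the module name tinycudann):

python -c "import torch, tinycudann; print(torch.__version__, torch.cuda.is_available())"

A working setup should print the 1.13.1 version string and True.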
data
├── <video_name>
│   ├── video_frames
│   │   └── %05d.jpg or %05d.png ...
│   ├── flows
│   │   └── optical flow npy files ...
│   └── masks
│       ├── <object_0>
│       │   └── %05d.png ...
│       ├── <object_1>
│       │   └── %05d.png ...
│       ⋮
│       └── <object_n>
│           └── %05d.png ...
⋮
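As a sanity check, here is a minimal sketch (a hypothetical helper, not part of the repo) that verifies a video folder matches this layout:

import os
import sys

def check_layout(video_dir):
    # The three subdirectories expected by the layout above.
    for sub in ('video_frames', 'flows', 'masks'):
        path = os.path.join(video_dir, sub)
        if not os.path.isdir(path):
            sys.exit(f'missing directory: {path}')
    frames = sorted(os.listdir(os.path.join(video_dir, 'video_frames')))
    objects = os.listdir(os.path.join(video_dir, 'masks'))
    print(f'{len(frames)} frames, {len(objects)} object mask folder(s)')

check_layout('data/<video_name>')  # replace with your video folder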
The video frames follow the format of the DAVIS dataset. The images should all be either PNG or JPG, named 00000.jpg, 00001.jpg, ... (or the .png equivalents).
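If you start from a raw video file, one way to produce frames in this naming scheme is ffmpeg (a sketch; input.mp4 is a placeholder for your video):

ffmpeg -i input.mp4 -start_number 0 data/<video_name>/video_frames/%05d.jpg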
We extract the optical flow using RAFT. Set up the submodule and download its pretrained models with the following commands:
git submodule update --init
cd thirdparty/RAFT/
./download_models.sh
cd ../..
To create optical flow for the video, run:
python preprocess_optical_flow.py --data-path data/<video_name> --max_long_edge 768
The script will automatically generate the corresponding backward and forward optical flow and store the .npy files under data/<video_name>/flows.
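To spot-check a generated flow file, something like the following should work (a sketch; the exact filenames under flows/ are determined by the script, and we assume each file stores an H x W x 2 array of per-pixel displacements):

import glob
import numpy as np

# Pick the first generated flow file; naming is determined by the preprocessing script.
path = sorted(glob.glob('data/<video_name>/flows/*.npy'))[0]
flow = np.load(path)
print(path, flow.shape, flow.dtype)  # assumed (H, W, 2) float array
print('max displacement (px):', np.abs(flow).max())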
We extract the object masks using Mask R-CNN via the following script:
python preprocess_mask_rcnn.py --data-path data/<video_name> --class_name <class_name> --object_name <object_name>
The class_name should be one of the COCO class names. It is also possible to use --class_name anything to extract the first instance retrieved by Mask R-CNN. The masks will be stored in data/<video_name>/masks/<object_name>. Our implementation also supports decomposition of multiple objects; see the example below.
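For instance, to prepare masks for two objects, run the script once per object (the class and object names here are illustrative):

python preprocess_mask_rcnn.py --data-path data/<video_name> --class_name person --object_name object_0
python preprocess_mask_rcnn.py --data-path data/<video_name> --class_name dog --object_name object_1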
To decompose a video, run:
python train.py config/config.py
You need to set data_folder in the config to the folder of your video.
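A hypothetical sketch of that edit, assuming data_folder appears as a plain assignment (check config/config.py for its actual structure):

# in config/config.py -- illustrative; data_folder is the only field documented here
data_folder = 'data/<video_name>'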
It is also possible to test a specific checkpoint:
python test.py <config_file> <checkpoint_file>
Both the config file and the checkpoint files are stored in the assigned results folder.
Once training is complete, the results for a checkpoint will be stored in <results_folder_name>/<video_name>_<folder_suffix>/<checkpoint_number>. There you can find the checkpoint, the reconstruction, a PSNR report, and other edit videos for debugging purposes.
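If you want to recompute the reconstruction PSNR yourself, here is a minimal sketch that is independent of the repo's own report (the reconstruction path is a placeholder):

import cv2
import numpy as np

def psnr(a, b, max_val=255.0):
    # Peak signal-to-noise ratio between two same-sized uint8 images.
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

orig = cv2.imread('data/<video_name>/video_frames/00000.jpg')
recon = cv2.imread('<results_folder>/00000.png')  # placeholder: the reconstructed frame
print('PSNR:', psnr(orig, recon))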
To edit the video, modify the extracted texture images tex%d.png. After that, run:
python edit.py <config_file> <checkpoint_file> <list of custom textures>
The edited video will be generated in the same folder, named custom_edit.mp4.
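As an example of a programmatic texture edit (a sketch assuming the textures are ordinary RGB images; tex0.png follows the tex%d.png pattern above):

import cv2

tex = cv2.imread('tex0.png')  # tex0.png assumed from the tex%d.png pattern
# Stamp a filled rectangle and a label onto the texture as a clearly visible edit.
cv2.rectangle(tex, (20, 20), (220, 80), (0, 0, 255), -1)
cv2.putText(tex, 'edited', (30, 65), cv2.FONT_HERSHEY_SIMPLEX, 1.2, (255, 255, 255), 2)
cv2.imwrite('tex0_edit.png', tex)

The edited file can then be passed to edit.py as one of the custom textures.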
If you find our work useful in your research, please consider citing:
@InProceedings{Chan_2023_ICCV,
author = {Chan, Cheng-Hung and Yuan, Cheng-Yang and Sun, Cheng and Chen, Hwann-Tzong},
title = {Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {7743-7753}
}