
DavidZhang73/AssemblyVideoManualAlignment

Aligning Step-by-Step Instructional Diagrams to Video Demonstrations

Description

Official PyTorch implementation of the CVPR 2023 paper "Aligning Step-by-Step Instructional Diagrams to Video Demonstrations".

How to run

Data Preparation

  1. Download the dataset in JSON format from here.
  2. Follow the instructions to download the data.
  3. Resize the short side of both page and step images to 224px.
  4. Use the script script/gen_image_pickle.py to generate the image pickle files.
  5. Resize the short side of the videos to 224px.
  6. Follow the split files to split the videos into 10-second clips and store the frames in NumPy format.
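
Steps 3, 5, and 6 above can be sketched roughly as follows. This is a minimal illustration, not the repository's preprocessing code: the function names, the `(T, H, W, C)` frame layout, and the output file naming are assumptions made for the example. In particular, the sketch cuts fixed 10-second windows, whereas the actual clip boundaries should come from the split files, whose format is not described here.

```python
from pathlib import Path

import numpy as np


def short_side_size(width, height, target=224):
    """Return (new_width, new_height) with the shorter side scaled to `target`.

    The aspect ratio is preserved; the longer side is rounded to the nearest pixel.
    """
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


def split_into_clips(frames, fps, clip_seconds=10):
    """Split a (T, H, W, C) frame array into consecutive fixed-length clips.

    Assumption: fixed windows stand in for the boundaries given by the split files.
    The last clip may be shorter than `clip_seconds`.
    """
    frames_per_clip = int(round(fps * clip_seconds))
    return [
        frames[start:start + frames_per_clip]
        for start in range(0, len(frames), frames_per_clip)
    ]


def save_clips(clips, out_dir, video_id):
    """Save each clip as a .npy file; the naming scheme here is hypothetical."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, clip in enumerate(clips):
        path = out_dir / f"{video_id}_{index:04d}.npy"
        np.save(path, clip)
        paths.append(path)
    return paths
```

For example, `short_side_size(1920, 1080)` gives the target size for a 1080p frame, which can then be passed to whatever image or video resizing tool you use. Decoding the video into a frame array is left out because the README does not say which decoder is expected.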

Installation

# clone project
git clone https://github.com/DavidZhang73/AssemblyVideoManualAlignment.git

# [Optional] create conda virtual environment
conda create -n <env_name> python=<3.8|3.9|3.10>
conda activate <env_name>

# [Optional] use mamba instead of conda
conda install mamba -n base -c conda-forge

# [Optional] install pytorch according to the official guide to support GPU acceleration, etc.
# https://pytorch.org/get-started/locally/

# install requirements
pip install -r requirements.txt

Train

python src/main.py fit -c configs/exp/ours.yaml -c configs/exp/{exp_name}.yaml --trainer.logger.name {log_name}

Inference

python src/main.py test -c configs/exp/ours.yaml -c configs/exp/{exp_name}.yaml --trainer.logger.name {log_name}
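
Both the train and inference commands pass two `-c` files followed by a `--trainer.logger.name` option. Assuming `src/main.py` is built on PyTorch Lightning's `LightningCLI` (which the `fit`/`test` subcommands and the dotted option name suggest, though the README does not state it), arguments are applied in order, so values in the second config override those in `configs/exp/ours.yaml`, and the command-line option overrides both. The sketch below illustrates that layering with plain dictionaries. It is a simplification rather than the CLI's actual merge logic, and the config keys shown are hypothetical.

```python
from copy import deepcopy


def merge_configs(base, override):
    """Recursively merge `override` into a copy of `base`; later values win."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical contents standing in for the two YAML files and the CLI option.
ours = {"trainer": {"max_epochs": 100, "logger": {"name": "ours"}}}
experiment = {"trainer": {"max_epochs": 50}}
cli_option = {"trainer": {"logger": {"name": "my_run"}}}

config = merge_configs(merge_configs(ours, experiment), cli_option)
print(config)
```

Keys that appear only in the base config are kept, so an experiment file needs to list only the settings it changes.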

Citation

@inproceedings{Zhang2023Aligning,
  author    = {Zhang, Jiahao and Cherian, Anoop and Liu, Yanbin and Ben-Shabat, Yizhak and Rodriguez, Cristian and Gould, Stephen},
  title     = {Aligning Step-by-Step Instructional Diagrams to Video Demonstrations},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2023},
}
