This is the code implementation for the paper "Patch-level Gaze Distribution Prediction for Gaze Following", published at WACV 2023 (paper).
python=3.7.13
pytorch=1.10.0
cudatoolkit=11.3.1
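The version pins above can be set up with, for example, a conda environment (the environment name `pdp` below is arbitrary, not prescribed by the repo):

```shell
# Hypothetical environment setup matching the pinned versions above
conda create -n pdp python=3.7.13
conda activate pdp
conda install pytorch=1.10.0 cudatoolkit=11.3.1 -c pytorch
```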
Download the images and annotations for the GazeFollow and VideoAttentionTarget datasets.
Download the depth maps for the GazeFollow and VideoAttentionTarget datasets here.
Download the initial weights for training on the GazeFollow and VideoAttentionTarget datasets here.
Modify config_pdp.yaml to point to your dataset and depth map directories.
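As an illustration, the directory entries in config_pdp.yaml might look like the following; the key names and paths here are hypothetical placeholders, not the actual keys in the released file:

```yaml
# Hypothetical sketch only -- check the released config_pdp.yaml for the real keys
gazefollow:
  data_dir: /path/to/gazefollow          # images and annotations
  depth_dir: /path/to/gazefollow_depth   # downloaded depth maps
videoattentiontarget:
  data_dir: /path/to/videoatt
  depth_dir: /path/to/videoatt_depth
```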
To train on the GazeFollow dataset, run:
python train_gazefollow_patch.py --init_weights {initial_weights_for_spatial_training.pt}
To train the model without depth images, run:
python train_gazefollow_patch.py --init_weights {initial_weights_for_spatial_training.pt} --not_use_depth --lambda_ 40
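To illustrate the paper's central idea of a patch-level gaze distribution, here is a minimal sketch (not the authors' code) of how an annotated gaze point could be soft-assigned to a probability distribution over a P x P grid of image patches; the grid size and Gaussian width are illustrative assumptions:

```python
# Hypothetical sketch: turn a normalized gaze point into a patch-level
# distribution label by placing a Gaussian at the gaze point and
# normalizing over all patches of a grid x grid partition of the image.
import math

def patch_label(gaze_x, gaze_y, grid=7, sigma=0.1):
    """gaze_x, gaze_y in [0, 1]; returns grid*grid probabilities (row-major)."""
    probs = []
    for row in range(grid):
        for col in range(grid):
            # patch center in normalized image coordinates
            cx = (col + 0.5) / grid
            cy = (row + 0.5) / grid
            d2 = (cx - gaze_x) ** 2 + (cy - gaze_y) ** 2
            probs.append(math.exp(-d2 / (2 * sigma ** 2)))
    total = sum(probs)
    return [p / total for p in probs]

label = patch_label(0.5, 0.5)
print(abs(sum(label) - 1.0) < 1e-9)              # True: a valid distribution
print(max(range(49), key=lambda i: label[i]))    # 24: the center patch of a 7x7 grid
```

A distribution like this can supervise a patch classifier with a cross-entropy-style loss, instead of regressing a single gaze coordinate.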
For VideoAttentionTarget, we split the dataset into 5-frame sequences for the training and test sets and stored the splits here. To train on the VideoAttentionTarget dataset, run:
python train_videoatt_patch.py --init_weights {initial_weights_for_temporal_training.pt}
To train the model without depth images, run:
python train_videoatt_patch.py --init_weights {initial_weights_for_temporal_training_nodepth.pt} --not_use_depth
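The 5-frame splitting described above can be sketched as follows (a hedged reconstruction, not the released split files; whether a short trailing remainder is dropped is an assumption here):

```python
# Hypothetical sketch: chop each clip's frame list into consecutive,
# non-overlapping 5-frame sequences, dropping any trailing remainder
# shorter than seq_len.
def split_into_sequences(frames, seq_len=5):
    return [frames[i:i + seq_len]
            for i in range(0, len(frames) - seq_len + 1, seq_len)]

clips = split_into_sequences(list(range(12)))
print(clips)  # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]] -- last 2 frames dropped
```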
Here we provide the pretrained model weights for the GazeFollow and VideoAttentionTarget datasets.
If you find our code useful, please consider citing our paper:
@inproceedings{miao2023patch,
title={Patch-level Gaze Distribution Prediction for Gaze Following},
author={Miao, Qiaomu and Hoai, Minh and Samaras, Dimitris},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={880--889},
year={2023}
}