PyTorch implementation of the paper "Grouped Spatial-Temporal Aggregation for Efficient Action Recognition". arxiv
- PyTorch 1.0 or higher
- Python 3.5 or higher
Please refer to TRN-pytorch for data preparation on the Something-Something dataset.
- For GST-Large:

```shell
python3 main.py --root_path /path/to/video/folder --dataset somethingv1 --checkpoint_dir /path/for/saving/checkpoints/ --type GST --arch resnet50 --num_segments 8 --beta 1
```

- For GST:

```shell
python3 main.py --root_path /path/to/video/folder --dataset somethingv1 --checkpoint_dir /path/for/saving/checkpoints/ --type GST --arch resnet50 --num_segments 8 --beta 2 --alpha 4
```

- For more details, please type

```shell
python3 main.py -h
```
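To illustrate what the `--alpha` and `--beta` flags control, here is a minimal sketch of a GST-style grouped spatial-temporal convolution in PyTorch. The key idea from the paper is to split the input channels into a spatial group (spatial-only convolution) and a temporal group (spatio-temporal convolution), then concatenate the outputs. The class name, exact channel arithmetic, and layer layout below are assumptions for illustration, not the repository's actual module.

```python
import torch
import torch.nn as nn


class GSTConvSketch(nn.Module):
    """Illustrative sketch of a grouped spatial-temporal convolution.

    Channels are split into a spatial group (processed by a 1x3x3,
    i.e. spatial-only, 3D conv) and a temporal group (processed by a
    3x3x3 spatio-temporal conv). Here `alpha` sets the fraction of
    input channels routed to the temporal group, and `beta` sets the
    fraction of output channels produced by it; the precise semantics
    of these flags in the repo may differ.
    """

    def __init__(self, channels, alpha=4, beta=2):
        super().__init__()
        self.temporal_in = channels // alpha          # temporal group input
        self.spatial_in = channels - self.temporal_in  # spatial group input
        temporal_out = channels // beta               # temporal group output
        spatial_out = channels - temporal_out         # spatial group output
        # Spatial path: convolves only over H and W (kernel 1 along time).
        self.spatial = nn.Conv3d(self.spatial_in, spatial_out,
                                 kernel_size=(1, 3, 3), padding=(0, 1, 1),
                                 bias=False)
        # Temporal path: convolves jointly over T, H, and W.
        self.temporal = nn.Conv3d(self.temporal_in, temporal_out,
                                  kernel_size=(3, 3, 3), padding=(1, 1, 1),
                                  bias=False)

    def forward(self, x):
        # x has shape (N, C, T, H, W); split along the channel axis.
        xs, xt = torch.split(x, [self.spatial_in, self.temporal_in], dim=1)
        # Concatenate spatial and temporal responses back to C channels.
        return torch.cat([self.spatial(xs), self.temporal(xt)], dim=1)
```

For example, with `channels=64, alpha=4, beta=2`, the temporal path receives 16 channels and emits 32, the spatial path receives 48 and emits 32, so the output keeps the same `(N, 64, T, H, W)` shape as the input.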
Model | Something-v1 | Something-v2
---|---|---
GST (alpha=4, 8 frames) | 47.0 | 61.6
GST (alpha=4, 16 frames) | 48.6 | 62.6
GST-Large (alpha=4, 8 frames) | 47.7 | 62.0
- Results are reported with center-crop and single-clip sampling.
If you find our work useful in your research, please consider citing our paper:
@inproceedings{luo2019grouped,
title={Grouped Spatial-Temporal Aggregation for Efficient Action Recognition},
author={Luo, Chenxu and Yuille, Alan},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
year={2019}
}
or
@article{luo2019grouped,
title={Grouped Spatial-Temporal Aggregation for Efficient Action Recognition},
author={Luo, Chenxu and Yuille, Alan},
journal={arXiv preprint arXiv:1909.13130},
year={2019}
}
This codebase is built upon TRN-pytorch and TSN-pytorch.