
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference

License: Apache 2.0

Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan Lin

Accepted by CVPR 2023. More Info: [ Paper | Slide | YouTube | Poster | GitHub ]


This is an unofficial, miniature code release that reveals the core implementation of our attention block; the final adopted attention block follows the MultiScaleAttention format. Run the standalone demo with:

python attention.py
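
For orientation, here is a minimal, self-contained sketch of the linear-angular attention idea: the angular kernel sim(q, k) = 1 - arccos(cos(q, k))/π is replaced by its first-order expansion 1/2 + (q·k)/π, which lets attention be computed as Q(KᵀV) in linear time. The class name `LinearAngularAttention`, the `aux_dwconv` branch placement, and all hyperparameters below are illustrative assumptions, not the repo's actual API — see attention.py for the real implementation.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearAngularAttention(nn.Module):
    """Sketch of linear-angular attention (names/shapes are assumptions).

    The angular kernel sim(q, k) = 1 - arccos(cos(q, k)) / pi is approximated
    by its first-order expansion 1/2 + (q . k) / pi, so attention is computed
    as Q (K^T V) in O(N d^2) rather than (Q K^T) V in O(N^2 d).
    """

    def __init__(self, dim, num_heads=8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)
        # Auxiliary depthwise-conv branch on V; per the paper this branch
        # assists training and is dropped ("castled" away) at inference.
        self.aux_dwconv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)

    def forward(self, x, hw=None):
        # x: (B, N, C); hw = (height, width) with height * width == N,
        # needed only for the auxiliary conv branch.
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (B, heads, N, head_dim)

        # L2-normalize q and k so q . k is a cosine similarity in [-1, 1].
        q = F.normalize(q, dim=-1)
        k = F.normalize(k, dim=-1)

        # Numerator: sum_j (1/2 + q_i . k_j / pi) v_j
        #          = (1/2) sum_j v_j + q_i (K^T V) / pi
        kv = torch.einsum('bhnd,bhne->bhde', k, v)  # K^T V
        num = 0.5 * v.sum(dim=2, keepdim=True) \
            + torch.einsum('bhnd,bhde->bhne', q, kv) / math.pi

        # Denominator: sum_j (1/2 + q_i . k_j / pi)
        #            = N/2 + q_i . (sum_j k_j) / pi  (strictly positive,
        # since 1/2 - 1/pi > 0 bounds each term away from zero).
        k_sum = k.sum(dim=2)  # (B, heads, head_dim)
        den = N / 2 + torch.einsum('bhnd,bhd->bhn', q, k_sum) / math.pi

        out = (num / den.unsqueeze(-1)).transpose(1, 2).reshape(B, N, C)

        # Optional auxiliary branch (training-time only per the paper).
        if hw is not None:
            h, w = hw
            v_img = v.transpose(1, 2).reshape(B, N, C).permute(0, 2, 1)
            v_img = v_img.reshape(B, C, h, w)
            out = out + self.aux_dwconv(v_img).flatten(2).transpose(1, 2)
        return self.proj(out)


if __name__ == "__main__":
    attn = LinearAngularAttention(dim=192, num_heads=4)
    tokens = torch.randn(2, 14 * 14, 192)
    print(attn(tokens, hw=(14, 14)).shape)  # torch.Size([2, 196, 192])
```

The point of the "castling" design is that the depthwise-conv branch only assists training; removing it at inference leaves the purely linear Q(KᵀV) path, so deployed cost stays linear in the token count.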

Here is some general guidance for reproducing the results reported in our paper.

  • For the classification task, we build our codebase on top of MobileVision@Meta.

  • For the segmentation task, we build our codebase on top of Mask2Former; the unsupervised pretrained models are trained using the MAE framework.

  • For the detection task, we build our codebase on top of PicoDet@PaddleDet and its PyTorch version; the supervised pretrained models are trained using the LeViT framework.

To facilitate usage in the research community, I am working on translating some of the highly coupled code into standalone versions. The detection codebase is expected to be released later; stay tuned.


Citation

If you find this codebase useful for your research, please cite:

@inproceedings{you2023castling,
  title={Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference},
  author={You, Haoran and Xiong, Yunyang and Dai, Xiaoliang and Wu, Bichen and Zhang, Peizhao and Fan, Haoqi and Vajda, Peter and Lin, Yingyan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023}
}
