# BOAT: Bilateral Local Attention Vision Transformer


This is an unofficial implementation of the paper [BOAT: Bilateral Local Attention Vision Transformer](https://arxiv.org/abs/2201.13027).

The [Swin variant](https://github.com/mahaoyuHKU/pytorch-boat/tree/main/Swin) is based on [Swin Transformer](https://github.com/microsoft/Swin-Transformer).

The [CSwin variant](https://github.com/mahaoyuHKU/pytorch-boat/tree/main/CSwin) is based on [CSWin Transformer](https://github.com/microsoft/CSWin-Transformer).

See the corresponding folder for each variant's installation, training, and evaluation instructions.

# Pre-trained models

[BOAT-Swin-Tiny](https://www.dropbox.com/s/xa94uewsrvjglnn/tiny.pth?dl=0)

[BOAT-Swin-Small](https://www.dropbox.com/s/7ih1zvii3bvdcgd/small.pth?dl=0)

[BOAT-Swin-Base](https://www.dropbox.com/s/70hr7h0smcr0gr9/base.pth?dl=0)

[BOAT-CSwin-Tiny](https://www.dropbox.com/s/rsmtu6r0v2lt0y5/cswin_tiny.pth.tar?dl=0)

[BOAT-CSwin-Small](https://www.dropbox.com/s/cnl00d1faxxoi19/cswin_small.pth.tar?dl=0)

[BOAT-CSwin-Base](https://www.dropbox.com/s/92sr8r8zhng1mqg/cswin_base.pth.tar?dl=0)
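
The downloaded files are PyTorch checkpoints, and their internal layout is not documented here. The sketch below is one hedged way to pull the weights out of a loaded checkpoint dictionary before passing them to a model. It assumes the weights may be nested under a `model` or `state_dict` key and that parameter names may carry a `module.` prefix from distributed training. Both are common conventions, not confirmed properties of these files, and `extract_state_dict` is a hypothetical helper that is not part of this repository.

```python
from collections import OrderedDict
from typing import Any, Dict

# Assumed wrapper keys; the actual checkpoint layout is not confirmed by this README.
WRAPPER_KEYS = ("model", "state_dict")


def extract_state_dict(checkpoint: Dict[str, Any]) -> "OrderedDict[str, Any]":
    """Return the weight mapping from a loaded checkpoint dictionary."""
    weights = checkpoint
    # Unwrap the first assumed wrapper key that holds a nested dictionary.
    for key in WRAPPER_KEYS:
        if isinstance(checkpoint.get(key), dict):
            weights = checkpoint[key]
            break

    # Drop a leading "module." prefix, which distributed wrappers may add.
    cleaned = OrderedDict()
    for name, value in weights.items():
        if name.startswith("module."):
            name = name[len("module."):]
        cleaned[name] = value
    return cleaned
```

With PyTorch installed, a typical call would be `extract_state_dict(torch.load("tiny.pth", map_location="cpu"))`, with the result passed to `load_state_dict` on a model built from the matching variant folder. Check the keys of the loaded dictionary first, since the layout may differ from what is assumed here.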

## Acknowledgement
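This code is developed based on [CSWin Transformer](https://github.com/microsoft/CSWin-Transformer) and [Swin Transformer](https://github.com/microsoft/Swin-Transformer).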


# Citation

If you use this code for your research, please consider citing:

```bibtex
@article{BOAT,
  author    = {Tan Yu and Gangming Zhao and Ping Li and Yizhou Yu},
  title     = {{BOAT:} Bilateral Local Attention Vision Transformer},
  journal   = {CoRR},
  volume    = {abs/2201.13027},
  year      = {2022},
  url       = {https://arxiv.org/abs/2201.13027},
}
```