For more details, please check out our paper: Mesa: A Memory-saving Training Framework for Transformers.
- Install Mesa from here.
- Install dependencies of Swin from here.
- To train a Swin model:

```shell
conda activate mesa

# Make sure you have the correct path to the dataset in the config file.
python -m torch.distributed.launch --nproc_per_node [GPUs] --master_port 13235 main.py \
    --cfg [configs]

# For example, to train Swin-Ti with Mesa on 1 GPU.
python -m torch.distributed.launch --nproc_per_node 1 --master_port 13235 main.py \
    --cfg configs/swin_tiny_patch4_window7_224_mesa.yaml
```
| Model | Param (M) | FLOPs (G) | Train Memory (MB) | Top-1 (%) |
| --- | --- | --- | --- | --- |
| Swin-Ti | 29 | 4.5 | 11,812 | 81.3 |
| Swin-Ti w/ Mesa | 29 | 4.5 | 5,371 | 81.3 |
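The training-memory reduction above comes from storing activations for the backward pass in low precision. The snippet below is a minimal NumPy sketch of that idea (a symmetric 8-bit quantize/dequantize round-trip on a cached activation tensor); it is an illustration of the technique only, not Mesa's actual implementation or API.

```python
import numpy as np

def quantize_int8(x):
    """Map a float32 tensor to int8 codes plus a per-tensor scale (symmetric quantization)."""
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

# A stand-in "activation" tensor that would normally be cached for backward.
act = np.random.randn(64, 197, 768).astype(np.float32)

q, scale = quantize_int8(act)
approx = dequantize_int8(q, scale)

# int8 storage is 4x smaller than float32.
print(act.nbytes // q.nbytes)  # 4

# The round-trip error is bounded by the quantization step size.
print(np.abs(act - approx).max() <= scale)  # True
```

In this sketch the compressed codes (`q`, `scale`) would be kept in place of the full-precision activations and dequantized only when the backward pass needs them, trading a small quantization error for a large reduction in cached-activation memory.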
This repository adapts code from Swin Transformer; we thank the authors for open-sourcing their code.