MXNetSeg

This project provides a modular implementation of state-of-the-art semantic segmentation models based on the MXNet framework and the GluonCV toolkit. See MindSeg for a mirror implemented with HUAWEI MindSpore.

Bright Spots

An easy-to-use and extensible pipeline for the semantic segmentation task, covering data pre-processing, model definition, network training and evaluation.
Parallel training on GPUs.
Multiple supported models.
- Fully Convolutional Networks for Semantic Segmentation [FCN, CVPR2015, paper]
- Attention to Scale: Scale-Aware Semantic Image Segmentation [Att2Scale, CVPR2016, paper]
- Rethinking Atrous Convolution for Semantic Image Segmentation [DeepLabv3, arXiv2017, paper]
- Ladder-Style DenseNets for Semantic Segmentation of Large Natural Images [LadderDensenet, ICCVW2017, paper]
- Pyramid Scene Parsing Network [PSPNet, CVPR2017, paper]
- BiSeNet: Bilateral segmentation network for real-time semantic segmentation [BiSeNet, ECCV2018, paper]
- Encoder-decoder with atrous separable convolution for semantic image segmentation [DeepLabv3+, ECCV2018, paper]
- DenseASPP for Semantic Segmentation in Street Scenes [DenseASPP, CVPR2018, paper]
- Towards Bridging Semantic Gap to Improve Semantic Segmentation [SeENet, ICCV2019, paper]
- ACFNet: Attentional Class Feature Network for Semantic Segmentation [ACFNet, ICCV2019, paper]
- Dual Attention Network for Scene Segmentation [DANet, CVPR2019, paper]
- In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images [SwiftNet, CVPR2019, paper]
- Panoptic Feature Pyramid Networks [SemanticFPN, CVPR2019, paper]
- Gated Fully Fusion for Semantic Segmentation [GFFNet, AAAI2020, paper]
- Attention-guided Chained Context Aggregation for Semantic Segmentation [CANetv1, IMAVIS2021, paper]
- EPRNet: Efficient Pyramid Representation Network for Real-Time Street Scene Segmentation [EPRNet, TITS2021, paper]
- AttaNet: Attention-Augmented Network for Fast and Accurate Scene Parsing [AttaNet, AAAI2021, paper]
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale [ViT, ICLR2021, paper]
- Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [SETR, CVPR2021, paper]
- FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [FaPN, ICCV2021, paper]
- AlignSeg: Feature-Aligned Segmentation Networks [AlignSeg, TPAMI2021, paper]
- Compensating for Local Ambiguity with Encoder-Decoder in Urban Scene Segmentation [CANetv2, TITS2022, paper]

Benchmarks

We note that:

OS is the output stride of the backbone network.
* denotes multi-scale and flip testing; results without it use single-scale inputs.
No bells and whistles are adopted, e.g. OHEM or multi-grid.
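
To make the * columns concrete, the following is a minimal sketch of multi-scale and flip testing, not the code used in this repository. The `predict` callable, the scale set and the nearest-neighbour resize helper are all assumptions chosen for illustration; a real evaluation would normally use bilinear resizing and the network's own forward pass.

```python
import numpy as np


def resize_nearest(x, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array (illustrative helper)."""
    h, w = x.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return x[rows][:, cols]


def multi_scale_flip_predict(predict, image, scales=(0.5, 1.0, 1.5)):
    """Sum class scores over several scales and a horizontal flip.

    `predict` is a hypothetical callable that maps an (H, W, 3) image to
    (H, W, num_classes) scores; `scales` is an assumed example set.
    Returns an (H, W) array of predicted class indices.
    """
    h, w = image.shape[:2]
    total = None
    for s in scales:
        sh, sw = max(1, int(h * s)), max(1, int(w * s))
        scaled = resize_nearest(image, sh, sw)
        scores = predict(scaled)
        # Predict on the mirrored input, then mirror the scores back.
        flipped = predict(scaled[:, ::-1])[:, ::-1]
        merged = resize_nearest(scores + flipped, h, w)
        total = merged if total is None else total + merged
    return total.argmax(axis=-1)
```

Passing `scales=(1.0,)` reduces this to flip-only testing at the original resolution.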

Cityscapes

| Model | Backbone | OS | #Params | TrainSet | EvalSet | mIoU | *mIoU |
|---|---|---|---|---|---|---|---|
| BiSeNet | ResNet18 | 32 | 13.2M | train_fine | val | 71.6 | 74.7 |
| BiSeNet | ResNet18 | 32 | 13.2M | trainval_fine | test | - | 74.8 |
| FCN | ResNet18 | 32 | 12.4M | train_fine | val | 64.9 | 68.1 |
| FCN | ResNet18 | 8 | 12.4M | train_fine | val | 68.3 | 69.9 |
| FCN | ResNet50 | 8 | 28.4M | train_fine | val | 71.7 | - |
| FCN | ResNet101 | 8 | 47.5M | train_fine | val | 74.5 | - |
| PSPNet | ResNet101 | 8 | 56.4M | train_fine | val | 78.2 | 79.5 |
| DeepLabv3 | ResNet101 | 8 | 58.9M | train_fine | val | 79.3 | 80.0 |
| DenseASPP | ResNet101 | 8 | 69.4M | train_fine | val | 78.7 | 79.8 |
| DANet | ResNet101 | 8 | 66.7M | train_fine | val | 79.7 | 80.9 |

ADE20K

| Model | Backbone | OS | TrainSet | EvalSet | PA | mIoU | *PA | *mIoU |
|---|---|---|---|---|---|---|---|---|
| PSPNet | ResNet101 | 8 | train | val | 80.1 | 42.9 | 80.9 | 43.7 |

Pascal VOC 2012

| Model | Backbone | OS | TrainSet | EvalSet | PA | mIoU | *PA | *mIoU |
|---|---|---|---|---|---|---|---|---|
| FCN | ResNet101 | 8 | train_aug | val | 94.4 | 74.6 | 94.5 | 75.0 |
| Att2Scale | ResNet101 | 8 | train_aug | val | 94.8 | 77.1 | - | - |
| PSPNet | ResNet101 | 8 | train_aug | val | 95.1 | 78.1 | 95.3 | 78.5 |
| DeepLabv3 | ResNet101 | 8 | train_aug | val | 95.5 | 80.1 | 95.6 | 80.4 |
| DeepLabv3+ | ResNet101 | 8 | train_aug | val | 95.5 | 79.9 | 95.6 | 80.1 |

NYUv2

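| Model | Backbone | OS | TrainSet | EvalSet | PA | mIoU | *PA | *mIoU |
|---|---|---|---|---|---|---|---|---|
| FCN | ResNet101 | 8 | train | val | 69.2 | 39.7 | 70.2 | 41.0 |
| PSPNet | ResNet101 | 8 | train | val | 71.3 | 43.0 | 71.9 | 43.6 |
| DeepLabv3+ | ResNet101 | 8 | train | val | 73.5 | 46.0 | 74.3 | 47.2 |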

Environment

We use Python 3.6.2 and CUDA 10.1 in this project.

Prerequisites
```
pip install -r requirements.txt
```
Note that we employ wandb for logging and visualization. Refer to the wandb documentation for a QuickStart.
The Detail API is required for the Pascal Context dataset.

Usage

Training

Configure hyper-parameters in ./mxnetseg/config.yml

Run the ./mxnetseg/train.py script, for example:

python train.py --ctx 0 1 2 3 --wandb wandb-demo

During training, the program will automatically create a sub-folder ./weights/{model_name} to save model checkpoints/parameters.
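
As a rough illustration of the command line and checkpoint layout described above, the sketch below parses the same two flags with `argparse` and creates a `./weights/{model_name}` style folder. It is an assumption about how such a script could be organised, not the actual contents of train.py; the help texts, defaults, the reading of `--wandb` as a project name, and the `checkpoint_dir` helper are all hypothetical.

```python
import argparse
import os


def build_parser():
    """Illustrative parser mirroring the flags in the training command."""
    parser = argparse.ArgumentParser(description="training options (sketch)")
    # One or more GPU ids, e.g. `--ctx 0 1 2 3` for four-GPU training.
    parser.add_argument("--ctx", type=int, nargs="+", default=[0],
                        help="ids of the GPUs to train on")
    # Assumed here to be a wandb project name; the real meaning may differ.
    parser.add_argument("--wandb", type=str, default=None,
                        help="wandb project name (assumption)")
    return parser


def checkpoint_dir(model_name, root="./weights"):
    """Create and return the sub-folder used for a model's checkpoints."""
    path = os.path.join(root, model_name)
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("GPUs:", args.ctx, "wandb:", args.wandb)
```

With MXNet installed, the parsed ids would typically be turned into devices with something like `[mx.gpu(i) for i in args.ctx]`.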

Inference

Simply run ./mxnetseg/eval.py with the required arguments specified, for example:

python eval.py --model FCNResNet --backbone resnet18 --checkpoint fcn_resnet18_Cityscapes_20191900_310600_best.params --ctx 0 --data Cityscapes --crop 768 --base 2048 --mode val --ms

The --mode argument accepts:

val: to get mIoU and PA metrics on the validation set.
test: to get colored predictions on the test set.
testval: to get colored predictions on the validation set.
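
For readers unfamiliar with the two metrics reported in val mode, here is a minimal NumPy sketch of pixel accuracy (PA) and mean intersection-over-union (mIoU) computed from a confusion matrix. It follows the standard definitions rather than this repository's evaluation code, and the `ignore_index` value of -1 is an assumption.

```python
import numpy as np


def confusion_matrix(pred, label, num_classes, ignore_index=-1):
    """Return a (num_classes, num_classes) matrix; rows are ground truth."""
    pred = np.asarray(pred).ravel()
    label = np.asarray(label).ravel()
    keep = label != ignore_index  # drop pixels without a valid label
    idx = label[keep] * num_classes + pred[keep]
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """Fraction of labelled pixels whose predicted class is correct."""
    return np.diag(cm).sum() / cm.sum()


def mean_iou(cm):
    """Average per-class IoU over classes that occur in pred or label."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0
    return (tp[present] / union[present]).mean()


label = np.array([[0, 0, 1], [1, 1, 2]])
pred = np.array([[0, 1, 1], [1, 1, 2]])
cm = confusion_matrix(pred, label, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))  # PA = 5/6, mIoU = 0.75
```

In this toy example one of six pixels is misclassified, giving per-class IoUs of 0.5, 0.75 and 1.0.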

Citations

Please cite our papers if you find our code helpful in your research.

@article{tang2021attention,
  title={Attention-guided chained context aggregation for semantic segmentation},
  author={Tang, Quan and Liu, Fagui and Zhang, Tong and Jiang, Jun and Zhang, Yu},
  journal={Image and Vision Computing},
  pages={104309},
  year={2021},
  publisher={Elsevier}
}

@article{tang2021eprnet,
  title={EPRNet: Efficient Pyramid Representation Network for Real-Time Street Scene Segmentation},
  author={Tang, Quan and Liu, Fagui and Jiang, Jun and Zhang, Yu},
  journal={IEEE Transactions on Intelligent Transportation Systems},
  year={2021},
  doi={10.1109/TITS.2021.3066401},
  publisher={IEEE}
}

@article{tang2022compe,
  title={Compensating for Local Ambiguity With Encoder-Decoder in Urban Scene Segmentation}, 
  author={Tang, Quan and Liu, Fagui and Zhang, Tong and Jiang, Jun and Zhang, Yu and Zhu, Boyuan and Tang, Xuhao},
  journal={IEEE Transactions on Intelligent Transportation Systems},
  year={2022},
  doi={10.1109/TITS.2022.3157128},
  publisher={IEEE}
}
