Semantic Segmentation on the KITTI Road Dataset using EfficientNet-B0 in PyTorch
An encoder-decoder model performs semantic segmentation on the KITTI Road dataset in PyTorch. The encoder maps each input image to a low-dimensional, discriminative feature set, and the decoder projects the learned features back onto the high-dimensional pixel space. The encoder is a pre-trained EfficientNet-B0 model; the decoder uses transposed convolutional layers to upsample the feature maps back to the original image size.
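To make the upsampling step concrete, here is a minimal sketch of the size arithmetic only, not the repository's actual decoder. It uses the output-size formula from the PyTorch `ConvTranspose2d` documentation and the fact that EfficientNet-B0 reduces spatial resolution by a factor of 32. The layer settings (five layers, kernel 4, stride 2, padding 1), the 160 x 576 input size, and the helper names `conv_transpose2d_out` and `decoder_sizes` are assumptions chosen for illustration.

```python
def conv_transpose2d_out(size, kernel_size, stride=1, padding=0,
                         output_padding=0, dilation=1):
    """Output length along one spatial axis of a transposed convolution,
    following the formula in the PyTorch ConvTranspose2d documentation."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


def decoder_sizes(encoded_size, num_layers=5, kernel_size=4, stride=2, padding=1):
    """Sizes along one axis after each decoder layer (assumed settings).

    With kernel 4, stride 2, padding 1, every layer exactly doubles the size,
    so five layers undo a 32x downsampling.
    """
    sizes = [encoded_size]
    for _ in range(num_layers):
        sizes.append(conv_transpose2d_out(sizes[-1], kernel_size, stride, padding))
    return sizes


# Hypothetical input of 160 x 576 pixels (both divisible by 32).
# An encoder with overall stride 32 yields a 5 x 18 feature map.
height_path = decoder_sizes(160 // 32)
width_path = decoder_sizes(576 // 32)
print(height_path)  # [5, 10, 20, 40, 80, 160]
print(width_path)   # [18, 36, 72, 144, 288, 576]
```

Under these assumptions the input height and width should be multiples of 32; otherwise the decoder output would not line up with the original image and would need cropping or interpolation.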
- utils.py: Sets hyperparameters and file paths.
- dataloader.py: Loads the training and validation data into a PyTorch DataLoader.
- model.py: Defines the encoder-decoder model.
- train.py: Runs the training procedure.
- J. Fritsch, T. Kühnl and A. Geiger, "A new performance measure and evaluation benchmark for road detection algorithms," 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013), The Hague, 2013, pp. 1693-1700, doi: 10.1109/ITSC.2013.6728473.
- M. Tan and Q. V. Le, "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks," arXiv:1905.11946.
- https://github.com/udacity/CarND-Semantic-Segmentation/
- https://nanonets.com/blog/how-to-do-semantic-segmentation-using-deep-learning/