PyTorch's Semantic Segmentation Toolbox
- Python 3
- PyTorch >= 1.0.0
- Backbone
  - VGG16
  - MobileNet v1 (1.0)
  - MobileNet v2 (1.0)
  - ResNet 34
  - ResNet 50 (modified according to Bag of Tricks)
  - SE ResNet 34
  - SE ResNet 50
  - Modified Aligned Xception
- Model
  - FCN
  - U-Net
  - PSPNet
  - DeepLab v3+
- Loss
  - BCEWithLogitsLossWithOHEM
  - CrossEntropyLossWithOHEM
  - DiceLoss (only for binary classification)
  - SoftCrossEntropyLossWithOHEM
- Metric
  - Pixel Accuracy
  - Mean IoU
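
To make the loss and metric names above more concrete, here is a minimal sketch of online hard example mining (OHEM) applied to cross-entropy, together with pixel accuracy and mean IoU computed from a confusion matrix. It is not the toolbox's own code: the function names, the `keep_ratio` parameter, and the `ignore_index=255` convention are assumptions made for illustration, and the actual `CrossEntropyLossWithOHEM` may select hard pixels differently.

```python
import torch
import torch.nn.functional as F


def ohem_cross_entropy(logits, target, keep_ratio=0.25, ignore_index=255):
    """Cross-entropy averaged over only the hardest `keep_ratio` of valid pixels.

    logits: float tensor (N, C, H, W); target: long tensor (N, H, W).
    `keep_ratio` is a hypothetical parameter name used for this sketch.
    """
    # Per-pixel loss of shape (N, H, W); ignored pixels get a loss of 0 here.
    pixel_loss = F.cross_entropy(
        logits, target, ignore_index=ignore_index, reduction="none"
    )
    valid_loss = pixel_loss[target != ignore_index]
    if valid_loss.numel() == 0:
        return logits.sum() * 0.0  # nothing labeled: zero loss, graph kept intact
    k = max(1, int(keep_ratio * valid_loss.numel()))
    hardest, _ = torch.topk(valid_loss, k)  # the k largest per-pixel losses
    return hardest.mean()


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    pred, target = pred.reshape(-1), target.reshape(-1)
    keep = target != ignore_index
    index = target[keep] * num_classes + pred[keep]
    counts = torch.bincount(index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    """Fraction of labeled pixels whose predicted class is correct."""
    return (cm.diag().sum().float() / cm.sum().float()).item()


def mean_iou(cm):
    """Intersection over union, averaged over classes that actually occur."""
    intersection = cm.diag().float()
    union = cm.sum(dim=0).float() + cm.sum(dim=1).float() - intersection
    present = union > 0  # skip classes absent from both prediction and target
    return (intersection[present] / union[present]).mean().item()
```

For example, with `target = torch.tensor([[0, 0, 1], [1, 2, 255]])` and `pred = torch.tensor([[0, 1, 1], [1, 2, 2]])`, four of the five labeled pixels are correct, so the pixel accuracy is 0.8, and the per-class IoUs of 1/2, 2/3, and 1 give a mean IoU of about 0.72. Averaging only the top-k losses means `ohem_cross_entropy` is never smaller than the plain mean cross-entropy on the same batch.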
Any model can be paired with any backbone without modification.
Backbone \ Model | FCN | U-Net | PSPNet | DeepLab v3+ |
---|---|---|---|---|
VGG16 | √ | √ | √ | √ |
MobileNet v1 | √ | √ | √ | √ |
MobileNet v2 | √ | √ | √ | √ |
ResNet34 | √ | √ | √ | √ |
ResNet50 | √ | √ | √ | √ |
SE ResNet34 | √ | √ | √ | √ |
SE ResNet50 | √ | √ | √ | √ |
Modified Aligned Xception | √ | √ | √ | √ |
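
One way a full compatibility matrix like this can be achieved is for every backbone to expose the same small interface that every model consumes. The sketch below illustrates that idea under stated assumptions: the `out_channels` attribute, `TinyBackbone`, and `SimpleFCN` are hypothetical stand-ins invented for this example and are not classes from this toolbox.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyBackbone(nn.Module):
    """Hypothetical stand-in backbone that downsamples the input by 4x."""

    def __init__(self, out_channels):
        super().__init__()
        # Assumed contract: the backbone reports its feature width.
        self.out_channels = out_channels
        self.features = nn.Sequential(
            nn.Conv2d(3, out_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.features(x)


class SimpleFCN(nn.Module):
    """Hypothetical FCN-style model that accepts any backbone with `out_channels`."""

    def __init__(self, backbone, num_classes):
        super().__init__()
        self.backbone = backbone
        # The classifier sizes itself from the backbone, so no per-backbone edits.
        self.classifier = nn.Conv2d(backbone.out_channels, num_classes, kernel_size=1)

    def forward(self, x):
        height_width = x.shape[-2:]
        logits = self.classifier(self.backbone(x))
        # Upsample the coarse logits back to the input resolution.
        return F.interpolate(
            logits, size=height_width, mode="bilinear", align_corners=False
        )
```

With this arrangement, `SimpleFCN(TinyBackbone(16), num_classes=21)` and `SimpleFCN(TinyBackbone(64), num_classes=21)` both map a `(N, 3, H, W)` batch to `(N, 21, H, W)` logits, because the model reads the channel count from the backbone instead of hard-coding it.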
The models are pre-trained on the augmented PASCAL VOC2012 dataset, which provides 10582 images for training and 1449 images for validation.
You can download the pre-trained parameters from Google Drive.
Original Image | Target Mask | Predicted Mask |
---|---|---|
See Docs
See Changelog
- Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
- Howard, Andrew G., et al. "Mobilenets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).
- Sandler, Mark, et al. "Mobilenetv2: Inverted residuals and linear bottlenecks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
- He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
- Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
- Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
- Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015.
- Zhao, Hengshuang, et al. "Pyramid scene parsing network." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
- Chen, Liang-Chieh, et al. "Encoder-decoder with atrous separable convolution for semantic image segmentation." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
- He, Tong, et al. "Bag of tricks for image classification with convolutional neural networks." arXiv preprint arXiv:1812.01187 (2018).