MLMSNet

Lightweight Multi-Level Multi-Scale Feature Fusion Network for Semantic Segmentation.

Demo video: link

Update

  • 2021.05.13 Support 4 encoders: MobileNetV3 large/small with width multipliers 0.75 and 1.0.

  • 2021.05.13 Support MLMSNetv2, which uses depthwise separable convolutions for the ASPP and SE blocks (see the sketch after this list). With a MobileNetV3-large backbone (width multiplier 1.0) and a 713x713 input: FLOPs 14.2209G, Params 3.5739M.

  • 2021.05.30 Support the CamVid dataset. Here is the link to download CamVid for our segmentation method.
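
The depthwise separable convolution used in the MLMSNetv2 ASPP branches can be illustrated with the minimal PyTorch sketch below. It is only an illustration of the building block: the class name SeparableASPPBranch, the channel widths and the dilation rate are placeholders rather than the actual module definitions in this repository. The factorization into a depthwise 3x3 and a pointwise 1x1 convolution is what reduces FLOPs and parameters relative to a standard 3x3 convolution.

    import torch
    import torch.nn as nn

    class SeparableASPPBranch(nn.Module):
        # One atrous ASPP branch built from a depthwise separable convolution:
        # a depthwise 3x3 conv with dilation followed by a pointwise 1x1 conv.
        # Illustrative sketch only; names and sizes are not taken from this repo.
        def __init__(self, in_ch, out_ch, dilation):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=dilation,
                                       dilation=dilation, groups=in_ch, bias=False)
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
            self.bn = nn.BatchNorm2d(out_ch)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.pointwise(self.depthwise(x))))

    if __name__ == "__main__":
        x = torch.randn(1, 160, 45, 45)               # e.g. a MobileNetV3 feature map
        branch = SeparableASPPBranch(160, 128, dilation=12)
        print(branch(x).shape)                        # torch.Size([1, 128, 45, 45])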

Dataset Preparation

  1. Download Cityscapes images from the Cityscapes website. We need gtFine_trainvaltest.zip (241MB) and leftImg8bit_trainvaltest.zip (11GB). leftImg8bit_demoVideo.zip (6.6GB) is optional; it is used to generate the demo segmentation results.

  2. Download Cityscapes Scripts. Use createTrainIdLabelImgs.py to generate trainId label images from the Cityscapes JSON annotations (see the sketch after this list).

  3. Link the Cityscapes dataset to the project dataset folder:

    ln -s path/to/cityscapes  path/to/MLMSNet/dataset/

    The dataset folder structure is listed as follows:

    cityscapes -> path/to/cityscapes
    ├── cityscapes_demo.txt
    ├── cityscapes_test.txt
    ├── demoVideo
    │   ├── stuttgart_00
    │   ├── stuttgart_01
    │   └── stuttgart_02
    ├── fine_train.txt
    ├── fine_val.txt
    ├── gtFine
    │   ├── test
    │   ├── train
    │   └── val
    └── leftImg8bit
        ├── test
        ├── train
        └── val
    
  4. cityscapes_demo.txt, cityscapes_test.txt, fine_train.txt and fine_val.txt are in MLMSNet/misc.
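
For step 2, the trainId label images can also be generated programmatically. The sketch below assumes cityscapesscripts is installed (pip install cityscapesscripts) and that its json2labelImg helper keeps its current signature; the dataset path is a placeholder for your own symlinked folder.

    import glob
    import os

    from cityscapesscripts.preparation.json2labelImg import json2labelImg

    CITYSCAPES_ROOT = "dataset/cityscapes"  # placeholder: your symlinked dataset path

    # Convert every polygon annotation to a *_labelTrainIds.png image, mirroring
    # what createTrainIdLabelImgs.py does over the whole dataset.
    for json_path in glob.glob(os.path.join(
            CITYSCAPES_ROOT, "gtFine", "*", "*", "*_gtFine_polygons.json")):
        out_path = json_path.replace("_polygons.json", "_labelTrainIds.png")
        json2labelImg(json_path, out_path, "trainIds")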

Train

cd path/to/MLMSNet

export PYTHONPATH=./

python tool/train.py --config config/cityscapes_ohem_large.yaml  2>&1 | tee ohem_largetrain.log

This is the training result using the default configuration parameters, corresponding to config/cityscapes_ohem_large.yaml:

INFO:main-logger:Val result: mIoU/mAcc/allAcc 0.6695/0.7546/0.9541.
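
The mIoU/mAcc/allAcc values in these logs are the usual semantic-segmentation metrics: per-class IoU and per-class accuracy averaged over classes, plus overall pixel accuracy. A generic NumPy sketch of how they can be computed from flat prediction and label arrays is shown below; it illustrates the metric definitions and is not the exact evaluation code in this repository.

    import numpy as np

    def segmentation_metrics(pred, target, num_classes, ignore_index=255):
        # pred and target are flat integer arrays of class ids.
        mask = target != ignore_index
        pred, target = pred[mask], target[mask]

        # Confusion matrix: rows = ground-truth class, columns = predicted class.
        conf = np.bincount(num_classes * target + pred,
                           minlength=num_classes ** 2).reshape(num_classes, num_classes)
        intersection = np.diag(conf)
        union = conf.sum(1) + conf.sum(0) - intersection
        target_count = conf.sum(1)

        iou = intersection / (union + 1e-10)                  # per-class IoU
        acc = intersection / (target_count + 1e-10)           # per-class accuracy
        m_iou = iou.mean()                                    # mIoU
        m_acc = acc.mean()                                    # mAcc
        all_acc = intersection.sum() / (conf.sum() + 1e-10)   # allAcc
        return m_iou, m_acc, all_acc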

Evaluation

cd path/to/MLMSNet

export PYTHONPATH=./

python tool/test.py --config config/cityscapes_ohem_large.yaml  2>&1 | tee ohem_large_test.log

This is the evaluation result using the default configuration parameters, corresponding to config/cityscapes_ohem_large.yaml:

Eval result: mIoU/mAcc/allAcc 0.7268/0.7991/0.9540.

Pretrained Model

You can download pretrained models from Google Drive.

Cityscapes

Model      val mIoU/mAcc/allAcc    config                        link
MLMS-L     0.7268/0.7991/0.9540    cityscapes_ohem_large.yaml    MLMS_L
MLMS-S     0.7274/0.8033/0.9537    cityscapes_ohem_small.yaml    MLMS_S
MLMSv2-L   0.7164/0.7982/0.9526    mlmsv2_large.yaml             MLMSv2_L

CamVid

Model           val mIoU/mAcc/allAcc    config                    link
camvid-mlms-l   0.6814/0.7574/0.9196    camvid_ohem_large.yaml    camvid_mlms_l
camvid-mlms-s   0.6790/0.7612/0.9188    camvid_ohem_small.yaml    camvid_mlms_s
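
A typical way to restore one of the downloaded checkpoints in PyTorch is sketched below. The checkpoint layout (a bare state_dict or a {'state_dict': ...} wrapper, possibly saved from nn.DataParallel) is an assumption following the semseg codebase, and the model import and file name are placeholders for the actual definitions in this repository.

    import torch

    # Placeholder import: adjust to the actual model definition in this repo.
    # from model.mlmsnet import MLMSNet

    def load_checkpoint(model, ckpt_path="MLMS_L.pth"):
        # Handle both a bare state_dict and a {'state_dict': ...} wrapper (assumption).
        ckpt = torch.load(ckpt_path, map_location="cpu")
        state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
        # Strip a possible 'module.' prefix left over from nn.DataParallel training.
        state_dict = {(k[7:] if k.startswith("module.") else k): v
                      for k, v in state_dict.items()}
        model.load_state_dict(state_dict)
        model.eval()
        return model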

Reference

  1. The codebase is from semseg: Semantic Segmentation in PyTorch:

    @misc{semseg2019,
      author={Zhao, Hengshuang},
      title={semseg},
      howpublished={\url{https://github.com/hszhao/semseg}},
      year={2019}
    }
    @inproceedings{zhao2017pspnet,
      title={Pyramid Scene Parsing Network},
      author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},
      booktitle={CVPR},
      year={2017}
    }
    @inproceedings{zhao2018psanet,
      title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
      author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
      booktitle={ECCV},
      year={2018}
    }
    
  2. The MobileNetV3 code and pretrained models are from mobilenetv3.pytorch: 74.3% MobileNetV3-Large and 67.2% MobileNetV3-Small models on ImageNet.

  3. This project gave me a better understanding of the loss functions used in the segmentation field: SegLoss.

License

MIT License

