This project aims to provide a concise, easy-to-use, and modifiable reference implementation of semantic segmentation models in PyTorch.
- add distributed training (Note: I do not have enough devices to test distributed training; if you are interested, you are welcome to complete the testing and fix bugs.)
- add OCNet
- Python 3.x
- PyTorch 1.0
conda install pytorch torchvision -c pytorch
- Ninja
wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
sudo unzip ninja-linux.zip -d /usr/local/bin/
sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force
- Single GPU training
# for example, train fcn32_vgg16_pascal_voc:
python train.py --model fcn32s --backbone vgg16 --dataset pascal_voc --lr 0.0001 --epochs 50
- Multi-GPU training
# for example, train fcn32_vgg16_pascal_voc with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --model fcn32s --backbone vgg16 --dataset pascal_voc --lr 0.0001 --epochs 50
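The `--lr` flag above sets the base learning rate; most segmentation papers (PSPNet, DeepLab, etc.) then decay it per iteration with a "poly" schedule. A minimal sketch of that schedule (the power 0.9 is the value commonly used in those papers, not something this repo guarantees):

```python
def poly_lr(base_lr, cur_iter, max_iter, power=0.9):
    """'Poly' learning-rate decay used by many segmentation papers.

    The rate falls smoothly from base_lr at iteration 0 to 0 at max_iter.
    """
    return base_lr * (1 - cur_iter / max_iter) ** power

# Example: base lr 1e-4 over 50 epochs of 100 iterations each.
schedule = [poly_lr(1e-4, it, 5000) for it in range(5000)]
```

In practice the result is fed to the optimizer each step, e.g. by updating `param_group['lr']`.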
Note: The loss functions of EncNet and ICNet are special: in train.py, MixSoftmaxCrossEntropyLoss needs to be replaced by EncNetLoss and ICNetLoss, respectively.
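The swap amounts to picking the loss class from the model name. A hedged sketch of that dispatch (the class names come from the note above; the placeholder classes and the factory function itself are illustrative, not the repo's actual API):

```python
# Placeholder classes standing in for the real loss implementations
# in the repo (illustrative only).
class MixSoftmaxCrossEntropyLoss: ...
class EncNetLoss: ...
class ICNetLoss: ...

def get_criterion(model_name):
    """Pick the loss matching the model, per the note above."""
    if model_name == 'encnet':
        return EncNetLoss()
    if model_name == 'icnet':
        return ICNetLoss()
    # every other model uses the default criterion
    return MixSoftmaxCrossEntropyLoss()
```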
- Single GPU evaluating
# for example, evaluate fcn32_vgg16_pascal_voc:
python eval.py --model fcn32s --backbone vgg16 --dataset pascal_voc
- Multi-GPU evaluating
# for example, evaluate fcn32_vgg16_pascal_voc with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS eval.py --model fcn32s --backbone vgg16 --dataset pascal_voc
cd ./scripts
python demo.py --model fcn32s_vgg16_voc --input-pic ./datasets/test.jpg
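Internally, the demo turns the model's predicted class-index mask into a color image for display. A minimal, self-contained sketch of that colorization step (the 3-entry palette below is made up for illustration; the real VOC palette has 21 classes):

```python
# Map each pixel's class index to an RGB color (illustrative palette).
PALETTE = {0: (0, 0, 0),      # background
           1: (128, 0, 0),    # class 1
           2: (0, 128, 0)}    # class 2

def colorize(mask):
    """mask: 2D list of class indices -> 2D list of RGB tuples."""
    return [[PALETTE[c] for c in row] for row in mask]

print(colorize([[0, 1], [2, 0]])[0][1])  # -> (128, 0, 0)
```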
.{SEG_ROOT}
├── scripts
│ ├── demo.py
│ ├── eval.py
│ └── train.py
See DETAILS for supported models & backbones.
.{SEG_ROOT}
├── core
│ ├── models
│ │ ├── bisenet.py
│ │ ├── danet.py
│ │ ├── deeplabv3.py
│ │ ├── denseaspp.py
│ │ ├── dunet.py
│ │ ├── encnet.py
│ │ ├── fcn.py
│ │ ├── pspnet.py
│ │ ├── icnet.py
│ │ ├── enet.py
│ │ ├── ocnet.py
│ │ ├── ......
You can run a script to download a dataset, for example:
cd ./core/data/downloader
python ade20k.py --download-dir ../datasets/ade
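Each downloader script essentially fetches an archive into `--download-dir` and unpacks it there. A rough sketch of that flow using only the standard library (the helper names are illustrative, not the repo's actual functions):

```python
import os
import zipfile
import urllib.request

def target_path(url, root):
    """Where a downloaded archive lands: <root>/<filename taken from the URL>."""
    return os.path.join(root, url.rstrip('/').split('/')[-1])

def download_and_extract(url, root):
    """Fetch the archive at url into root (skipping if cached), then unzip it."""
    os.makedirs(root, exist_ok=True)
    archive = target_path(url, root)
    if not os.path.exists(archive):
        urllib.request.urlretrieve(url, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(root)
```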
Dataset | training set | validation set | testing set |
---|---|---|---|
VOC2012 | 1464 | 1449 | ✘ |
VOCAug | 11355 | 2857 | ✘ |
ADE20K | 20210 | 2000 | ✘ |
Cityscapes | 2975 | 500 | ✘ |
COCO | |||
SBU-shadow | 4085 | 638 | ✘ |
LIP (Look Into Person) | 30462 | 10000 | 10000 |
.{SEG_ROOT}
├── core
│ ├── data
│ │ ├── dataloader
│ │ │ ├── ade.py
│ │ │ ├── cityscapes.py
│ │ │ ├── mscoco.py
│ │ │ ├── pascal_aug.py
│ │ │ ├── pascal_voc.py
│ │ │ ├── sbu_shadow.py
│ │ └── downloader
│ │ ├── ade20k.py
│ │ ├── cityscapes.py
│ │ ├── mscoco.py
│ │ ├── pascal_voc.py
│ │ └── sbu_shadow.py
- PASCAL VOC 2012
Methods | Backbone | TrainSet | EvalSet | crops_size | epochs | Mean IoU | pixAcc |
---|---|---|---|---|---|---|---|
FCN32s | vgg16 | train | val | 480 | 60 | 47.50% | 85.39% |
FCN16s | vgg16 | train | val | 480 | 60 | 49.16% | 85.98% |
FCN8s | vgg16 | train | val | 480 | 60 | 48.87% | 85.02% |
PSPNet | resnet50 | train | val | 480 | 60 | 63.44% | 89.78% |
DeepLabv3 | resnet50 | train | val | 480 | 60 | 60.15% | 88.36% |
Note: The parameter settings of each method differ (crop_size, learning rate, epochs, etc.); for the specific parameters, please refer to the original papers.
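The Mean IoU and pixAcc columns above follow the standard confusion-matrix definitions. A small self-contained sketch of those two metrics (pure Python over flattened label arrays; the repo computes them with tensors, but the math is the same):

```python
def confusion(pred, gt, num_classes):
    """num_classes x num_classes counts: rows = ground truth, cols = prediction."""
    m = [[0] * num_classes for _ in range(num_classes)]
    for p, g in zip(pred, gt):
        m[g][p] += 1
    return m

def pix_acc(m):
    """Fraction of pixels whose predicted class matches the ground truth."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total

def mean_iou(m):
    """Average of per-class intersection / union, over classes that appear."""
    ious = []
    for i in range(len(m)):
        inter = m[i][i]
        union = sum(m[i]) + sum(row[i] for row in m) - inter
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy example: 4 pixels, 2 classes.
m = confusion(pred=[0, 1, 1, 0], gt=[0, 1, 0, 0], num_classes=2)
```

Here `pix_acc(m)` is 0.75 (3 of 4 pixels correct) and `mean_iou(m)` averages IoU 2/3 for class 0 with IoU 1/2 for class 1.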
See TEST for details.
.{SEG_ROOT}
├── tests
│ └── test_model.py
- fix downloader
- optimize loss functions
- test distributed training
- add distributed training (How DIST?)
- add LIP dataset
- add more models (in progress)
- train and evaluate
- fix SyncBN (Why SyncBN?)