SketchyScene-pytorch

This repository is the official PyTorch implementation of semantic segmentation (adapted DeepLab-v2) and instance segmentation (adapted Mask R-CNN) on the SketchyScene dataset (ECCV 2018).

Tensorflow code | Paper | Project Page | Dataset

Outline

Semantic Segmentation
Instance Segmentation
Citation

Semantic Segmentation

See the code under the Semantic_Segmentation directory.

Requirements

Python 3.5
PyTorch 0.4.1
torchvision 0.2.1
pydensecrf
(Optional) Tensorflow (>= 1.4.0)

Preparations

Download the whole SketchyScene dataset and place it under the data directory following its instructions.
Download the ImageNet pre-trained "ResNet-101" PyTorch model here (for initial training) and place it under the resnet_pretrained_model directory.
We also provide code for converting our trained TensorFlow model to a PyTorch model; see this section.

Train

For training based on the ImageNet pre-trained "ResNet-101" model:

python3 semantic_main.py --mode='train' --init_with='resnet' --log_info=1 --ignore_class_bg=1

Set --init_with to one of ['resnet', 'last', 'none']. Use 'none' to train from scratch. With 'last', your most recently trained model is found automatically.
--log_info=1 means log information is summarized so that you can inspect it with TensorBoard. Note that this requires a TensorFlow and TensorBoard environment. This feature relies on logger.py.
--ignore_class_bg=1 enables our proposed background-ignoring strategy. Otherwise, set it to 0.
Other training parameters can be modified in configs.py.
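
The effect of --ignore_class_bg=1 can be pictured with a small example. The sketch below assumes that the background-ignoring strategy means pixels labeled as background contribute nothing to the cross-entropy loss. The function name masked_cross_entropy, the use of label 0 for background, and the toy inputs are illustrative assumptions, not the repository's code, which operates on PyTorch tensors.

```python
import math

BACKGROUND = 0  # assumed label id for background pixels


def masked_cross_entropy(probs, labels, ignore_bg=True):
    """Mean cross-entropy over pixels, optionally skipping background.

    probs:  list of per-pixel class-probability lists (each sums to 1)
    labels: list of ground-truth class ids, one per pixel
    """
    losses = []
    for pixel_probs, label in zip(probs, labels):
        if ignore_bg and label == BACKGROUND:
            continue  # background pixels do not contribute to the loss
        losses.append(-math.log(pixel_probs[label]))
    # With no contributing pixels the loss is defined as 0 here.
    return sum(losses) / len(losses) if losses else 0.0


# Three pixels, three classes (0 = background).
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.25, 0.25, 0.5]]
labels = [0, 1, 2]

loss_ignoring_bg = masked_cross_entropy(probs, labels, ignore_bg=True)
loss_with_bg = masked_cross_entropy(probs, labels, ignore_bg=False)
```

With ignore_bg=True only the two non-background pixels are averaged, so the large number of empty pixels in a sketch cannot dominate the loss.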

Evaluation

Make sure that your trained PyTorch model is under the directory Semantic_Segmentation/outputs/snapshot. DenseCRF can be used as a post-processing step to improve the segmentation performance.

For evaluation on the val/test dataset without/with DenseCRF, run:

python3 semantic_main.py --mode='val' --dcrf=0
python3 semantic_main.py --mode='test' --dcrf=1

You can convert the TensorFlow-trained model following this section, or download it directly here.

Inference

For inference on the 2nd image in the val dataset with DenseCRF and a white background, run:

python3 semantic_main.py --mode='inference' --infer_dataset='val' --image_id=2 --dcrf=1 --black_bg=0

Set --infer_dataset='test' for inference on the test dataset.
Set --image_id to another number to use a different image.
Set --black_bg=1 to render the result on a black background. Otherwise, the background is white.
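
What --black_bg changes can be illustrated with a toy label map. This is a minimal sketch, assuming label 0 marks background; the palette, the colorize function, and the nested-list image format are hypothetical and do not reflect the repository's visualization code.

```python
# Hypothetical palette: class id -> RGB color. The repository's actual
# colors are not specified here.
PALETTE = {1: (255, 0, 0), 2: (0, 128, 0)}

WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def colorize(label_map, black_bg=False):
    """Turn a 2D grid of class ids into a 2D grid of RGB tuples.

    Label 0 is assumed to be background and takes the background color.
    """
    bg_color = BLACK if black_bg else WHITE
    return [
        [bg_color if label == 0 else PALETTE[label] for label in row]
        for row in label_map
    ]


label_map = [[0, 1], [2, 0]]
on_white = colorize(label_map, black_bg=False)
on_black = colorize(label_map, black_bg=True)
```

Only the background pixels differ between the two outputs; the class colors stay the same.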

Also, you can try our converted pytorch model.

Model Conversion

We provide code for converting our trained TensorFlow model to a PyTorch model.

Run convert_tf2pth.py under the tools folder, for example:

python3 convert_tf2pth.py --ignore_class_bg=1 --tf_model_dir=dir/to/tfmodel --display=1 --dataset='val' --image_id=2
python3 convert_tf2pth.py --ignore_class_bg=1 --tf_model_dir=dir/to/tfmodel --display=0

Set --ignore_class_bg=1 because our TensorFlow implementation uses this strategy.
Set --tf_model_dir to the directory containing the TensorFlow model and checkpoint.
--display=1 means a sample scene sketch is tested and the semantic results from both the TensorFlow model and the converted PyTorch model are displayed. Remember to set --dataset and --image_id.

We evaluated the converted pytorch model and got the results in the following table:

Model	OVAcc (val)	OVAcc (test)	MeanAcc (val)	MeanAcc (test)	MIoU (val)	MIoU (test)	FWIoU (val)	FWIoU (test)
Official TensorFlow model	92.94	88.38	84.95	75.92	73.49	63.10	87.10	79.76
Converted PyTorch model	92.71	88.09	84.23	75.50	71.98	62.27	86.73	79.34

The results differ slightly, mainly due to precision loss between the two frameworks.
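
For readers unfamiliar with the four column names, the sketch below computes them from a confusion matrix. It assumes OVAcc, MeanAcc, MIoU and FWIoU follow the standard definitions of overall pixel accuracy, mean per-class accuracy, mean IoU and frequency-weighted IoU; the repository's evaluation code may treat absent classes or background differently. The function name and the toy matrix are illustrative.

```python
def segmentation_metrics(conf):
    """Compute OVAcc, MeanAcc, MIoU and FWIoU from a confusion matrix.

    conf[i][j] is the number of pixels of true class i predicted as class j.
    Classes that never occur in the ground truth are skipped in the means.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(n))

    accs, ious, fw_iou = [], [], 0.0
    for i in range(n):
        true_i = sum(conf[i])                       # pixels whose label is i
        pred_i = sum(conf[j][i] for j in range(n))  # pixels predicted as i
        if true_i == 0:
            continue
        iou = conf[i][i] / (true_i + pred_i - conf[i][i])
        accs.append(conf[i][i] / true_i)
        ious.append(iou)
        fw_iou += (true_i / total) * iou

    return {
        "OVAcc": correct / total,
        "MeanAcc": sum(accs) / len(accs),
        "MIoU": sum(ious) / len(ious),
        "FWIoU": fw_iou,
    }


# Toy two-class example: rows are ground truth, columns are predictions.
metrics = segmentation_metrics([[3, 1], [1, 5]])
```

The function returns fractions in [0, 1], whereas the table reports percentages.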

Instance Segmentation

See the code under the Instance_Segmentation directory.

Requirements

Python 3.5
PyTorch 0.4.1
torchvision 0.2.1
(Optional) Tensorflow (>= 1.4.0)

Preparations

Download the whole SketchyScene dataset and place it under the data directory following its instructions.
Download the COCO/ImageNet pre-trained model following the instructions under Instance_Segmentation/pretrained_model.
We also provide code for converting our trained Keras (TensorFlow) model to a PyTorch model; see this section.

Set up the Non-Maximum Suppression (from ruotianluo/pytorch-faster-rcnn) and RoIAlign (from longcw/RoIAlign.pytorch) environment as follows:

cd libs/nms/src/cuda/
nvcc -c -o nms_kernel.cu.o nms_kernel.cu -x cu -Xcompiler -fPIC -arch=[arch]
cd ../../
python build.py
cd ../

cd roialign/roi_align/src/cuda/
nvcc -c -o crop_and_resize_kernel.cu.o crop_and_resize_kernel.cu -x cu -Xcompiler -fPIC -arch=[arch]
cd ../../
python build.py

Choose the value of -arch according to your GPU:

GPU	arch
TitanX	sm_52
GTX 960M	sm_50
GTX 1070	sm_61
GTX 1080 (Ti)	sm_61
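
The lookup can also be scripted. The sketch below builds the nvcc command shown earlier from a GPU name, using only the values in the table; the GPU_ARCH dictionary and the nvcc_command helper are hypothetical names and are not part of the repository.

```python
# GPU-to-architecture values copied from the table above.
GPU_ARCH = {
    "TitanX": "sm_52",
    "GTX 960M": "sm_50",
    "GTX 1070": "sm_61",
    "GTX 1080 (Ti)": "sm_61",
}


def nvcc_command(kernel, gpu):
    """Build the nvcc argument list for one CUDA kernel source file."""
    arch = GPU_ARCH[gpu]  # raises KeyError for GPUs not in the table
    return [
        "nvcc", "-c", "-o", kernel + ".o", kernel,
        "-x", "cu", "-Xcompiler", "-fPIC", "-arch=" + arch,
    ]


cmd = nvcc_command("nms_kernel.cu", "GTX 1070")
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run from inside the corresponding src/cuda directory. For a GPU that is not listed, look up its compute capability before adding an entry.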

Train

After the preparations, run:

python3 segment_train.py

python3 segment_train.py --init_model='coco' --log_info=1

Choose the initial pre-trained model from ['coco', 'imagenet', 'last'] with --init_model. If it is not specified, training starts from scratch. 'last' denotes your most recently trained model.
--log_info=1 means log information is summarized so that you can inspect it with TensorBoard. Note that this requires a TensorFlow and TensorBoard environment. This feature relies on logger.py.
Other settings can be modified in SketchTrainConfig in this file.
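
The flag semantics described above can be expressed as a small argparse sketch. This is an illustration only: how segment_train.py actually parses its arguments is not shown here, and the default of 0 for --log_info is an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the two documented training flags."""
    parser = argparse.ArgumentParser(
        description="Instance segmentation training (sketch)"
    )
    parser.add_argument(
        "--init_model",
        choices=["coco", "imagenet", "last"],
        default=None,  # None stands for training from scratch
        help="pre-trained weights to start from",
    )
    parser.add_argument(
        "--log_info",
        type=int,
        choices=[0, 1],
        default=0,  # assumed default; the real default is not documented here
        help="1 to write summaries for TensorBoard",
    )
    return parser


parser = build_parser()
args_default = parser.parse_args([])
args_coco = parser.parse_args(["--init_model=coco", "--log_info=1"])
```

Leaving --init_model unset yields None, which corresponds to the "not specified" case, while any value outside the three choices is rejected by argparse.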

Evaluation

Make sure that your trained model is under the directory Instance_Segmentation/outputs/snapshot.

For evaluation on the val/test dataset, run:

python3 segment_evaluate.py --dataset='test' --epochs='0100' --use_edgelist=0
python3 segment_evaluate.py --dataset='val' --epochs='0100' --use_edgelist=1

Set --epochs to the last four digits of the name of your trained model.
Edgelist is used when --use_edgelist=1. Note that if you want to use edgelist as post-processing, make sure you have generated the edgelist labels following the instructions under Instance_Segmentation/libs/edgelist_utils_matlab.
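
Deriving the --epochs value from a checkpoint name can be automated. The sketch below assumes the model file name ends in four digits before its extension; the file name used here is hypothetical, and epoch_tag is not a function from the repository.

```python
import re
from pathlib import Path


def epoch_tag(checkpoint_path):
    """Return the last four digits of a checkpoint's file name (without extension)."""
    stem = Path(checkpoint_path).stem
    match = re.search(r"(\d{4})$", stem)
    if match is None:
        raise ValueError("no four-digit suffix in " + repr(stem))
    return match.group(1)


# Hypothetical checkpoint name, used only for illustration.
tag = epoch_tag("outputs/snapshot/mask_rcnn_sketchyscene_0100.pth")
print("--epochs='" + tag + "'")
```

The tag is kept as a string so that leading zeros, as in '0100', are preserved.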

You can convert the Keras (TensorFlow) trained model following this section, or download it directly here.

Inference

For inference on the 2nd image in the val dataset without edgelist, run:

python3 segment_inference.py --dataset='val' --image_id=2 --epochs='0100' --use_edgelist=0

Set --dataset='test' for inference on the test dataset.
Set --image_id to another number to try a different image.
Set --epochs to the last four digits of the name of your trained model.
Edgelist is used when --use_edgelist=1. Make sure the edgelist labels have been generated.

Also, you can try the converted PyTorch model.

Model Conversion

We provide code for converting our trained Keras (TensorFlow) model to a PyTorch model.

Run convert_from_keras.py under the tools folder like this:

python3 convert_from_keras.py --keras_model=path/to/keras-model --pytorch_model=path/of/converted-model
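
One step that any such conversion has to handle is the layout of convolution kernels: Keras/TensorFlow stores them as (height, width, in_channels, out_channels), while PyTorch expects (out_channels, in_channels, height, width). The sketch below shows that reordering on nested lists. It is not taken from convert_from_keras.py, which may do this differently and handles more than convolution weights; hwio_to_oihw is a hypothetical helper.

```python
def hwio_to_oihw(kernel):
    """Reorder a conv kernel from (H, W, in, out) to (out, in, H, W).

    kernel is a nested list indexed as kernel[h][w][i][o].
    """
    height = len(kernel)
    width = len(kernel[0])
    in_ch = len(kernel[0][0])
    out_ch = len(kernel[0][0][0])
    return [
        [
            [[kernel[h][w][i][o] for w in range(width)] for h in range(height)]
            for i in range(in_ch)
        ]
        for o in range(out_ch)
    ]


# A 1x2 kernel with 3 input channels and 2 output channels.
# Each entry encodes its own (h, w, i, o) index so the reordering is visible.
keras_kernel = [
    [[[(h, w, i, o) for o in range(2)] for i in range(3)] for w in range(2)]
    for h in range(1)
]
torch_kernel = hwio_to_oihw(keras_kernel)
```

With NumPy arrays the same reordering is a single call, np.transpose(kernel, (3, 2, 0, 1)).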

We evaluated the converted pytorch model and got the results in the following table:

Model	AP (val)	AP@0.5 (val)	AP@0.75 (val)	AP (test)	AP@0.5 (test)	AP@0.75 (test)
Official Keras model	63.01	79.97	68.18	62.32	77.15	66.76
Official Keras model + edgelist	63.78	80.19	68.88	63.17	77.45	67.60
Converted PyTorch model	62.99	80.02	68.94	62.17	77.05	66.81
Converted PyTorch model + edgelist	63.92	80.28	69.26	63.11	77.39	67.74

The results differ slightly, mainly due to precision loss between the two frameworks.
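
The thresholds in AP@0.5 and AP@0.75 refer to the mask IoU between a predicted instance and a ground-truth instance. The sketch below shows only that criterion on toy binary masks; it assumes a prediction matches when its IoU reaches the threshold, and it leaves out the confidence ranking and precision-recall integration that a full AP computation needs. The mask_iou helper is illustrative and not the repository's evaluation code.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks given as 2D lists of 0/1."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 0.0


predicted = [[1, 1, 0],
             [1, 1, 0]]
ground_truth = [[0, 1, 1],
                [1, 1, 0]]

iou = mask_iou(predicted, ground_truth)  # 3 shared pixels out of 5 covered
counts_at_050 = iou >= 0.5   # would count as a match for AP@0.5
counts_at_075 = iou >= 0.75  # would count as a match for AP@0.75
```

Here the same prediction is a match at the 0.5 threshold but not at 0.75, which is why AP@0.75 is the stricter column in the table.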

Citation

Please cite the corresponding paper if you find our dataset or code useful:

@inproceedings{Zou18SketchyScene,
  author    = {Changqing Zou and
                Qian Yu and
                Ruofei Du and
                Haoran Mo and
                Yi-Zhe Song and
                Tao Xiang and
                Chengying Gao and
                Baoquan Chen and
                Hao Zhang},
  title     = {SketchyScene: Richly-Annotated Scene Sketches},
  booktitle = {ECCV},
  year      = {2018},
  publisher = {Springer International Publishing},
  pages     = {438--454},
  doi       = {10.1007/978-3-030-01267-0_26},
  url       = {https://github.com/SketchyScene/SketchyScene}
}

Credits

The ResNet-101 pytorch model was converted from caffe model by ruotianluo.
The code for the pytorch DeepLab model is partly borrowed from chenxi116.
The code for the pytorch Mask R-CNN model is modified from matterport/Mask_RCNN, multimodallearning/pytorch-mask-rcnn, and jytime/Mask_RCNN_Pytorch.
The code for tensorboard visualization is from yunjey.
