ascust / wsscoseg

Weakly Supervised Semantic Segmentation Based on Web Image Co-segmentation

Authors: Tong Shen, Guosheng Lin, Lingqiao Liu, Chunhua Shen, Ian Reid

Abstract

Training a Fully Convolutional Network (FCN) for semantic segmentation requires a large number of masks with pixel-level labelling, which involves a large amount of human labour and time. In contrast, web images and their image-level labels are much easier and cheaper to obtain. In this work, we propose a novel method for weakly supervised semantic segmentation with only image-level labels. The method uses the internet to retrieve a large number of images and a large-scale co-segmentation framework to generate masks for the retrieved images. We first retrieve images from search engines, e.g. Flickr and Google, using semantic class names as queries, e.g. the class names in the PASCAL VOC 2012 dataset. We then use the high-quality masks produced by co-segmentation on the retrieved images, together with the target dataset images and their image-level labels, to train segmentation networks. We obtain an IoU score of 56.9 on the test set of PASCAL VOC 2012, achieving state-of-the-art performance.

Citing the paper

Please consider citing our paper if you find this code useful:

    @inproceedings{Shen:2017:wss,
      author    = {Tong Shen and
                   Guosheng Lin and
                   Lingqiao Liu and
                   Chunhua Shen and
                   Ian Reid},
      title     = {Weakly Supervised Semantic Segmentation Based on Web Image Co-segmentation},
      booktitle = {BMVC},
      year      = {2017}
    }

Dependencies

The code is implemented in MXNet. Please go to the official website (HERE) for installation instructions, and make sure MXNet is compiled with OpenCV support.

The other Python dependencies are listed in "dependencies.txt" and can be installed with:

pip install -r dependencies.txt

Dataset

Web data

The web data can be downloaded here. Since the co-segmentation code is not included (see the original GitHub), you can either run that code to generate the masks yourself or use the provided masks, which are already processed. To use the provided masks, extract the files and put all the images in "dataset/web_images" and all the masks in "dataset/web_labels". No subfolders should be used.
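
A small sanity check along these lines can catch layout mistakes early. The script below is ours, not part of the repo, and it assumes an image and its mask share the same base filename:

    import os

    img_dir, lbl_dir = "dataset/web_images", "dataset/web_labels"
    for d in (img_dir, lbl_dir):
        # the loader expects flat folders, so fail on any subfolder
        assert not any(os.path.isdir(os.path.join(d, f)) for f in os.listdir(d)), \
            "%s must be flat (no subfolders)" % d

    # compare base filenames, assuming each image/mask pair shares a name
    imgs = {os.path.splitext(f)[0] for f in os.listdir(img_dir)}
    lbls = {os.path.splitext(f)[0] for f in os.listdir(lbl_dir)}
    print("images without masks:", sorted(imgs - lbls)[:10])
    print("masks without images:", sorted(lbls - imgs)[:10])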

PASCAL VOC data

For PASCAL VOC data, please download PASCAL VOC12 (HERE) and SBD (HERE). Then extract the files into the "dataset" folder and run:

python create_dataset.py
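
Before running the script, it may help to confirm that the archives extracted where the code can find them. A quick pre-flight check (ours, not part of the repo; it assumes the standard archive layouts, where VOC12 extracts to VOCdevkit/VOC2012 and SBD to benchmark_RELEASE; adjust the paths if your archives differ):

    import os

    for path in ("dataset/VOCdevkit/VOC2012/JPEGImages",
                 "dataset/VOCdevkit/VOC2012/SegmentationClass",
                 "dataset/benchmark_RELEASE/dataset"):
        print(("OK       " if os.path.isdir(path) else "MISSING  ") + path)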

Training

First, download the ResNet-50 model pretrained on ImageNet (HERE) and put it in the "models" folder.

Training the Initial Mask Generator

To train the "Initial Mask Generator", simply run:

python train_seg_model.py --model init --gpus 0,1,2,3

To evaluate a certain snapshot (for example epoch X), run:

python eval_seg_model.py --model init --gpu 0 --epoch X

To evaluate all the snapshots, run:

python eval_loop.py --model init --gpu 0

Each evaluated snapshot gets a corresponding folder in "outputs". eval_loop.py checks for any snapshots that have not been evaluated yet and evaluates them.
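
Conceptually, the loop amounts to scanning the "snapshots" folder and skipping epochs that already have an output folder. A minimal sketch of that idea (the checkpoint naming pattern and the output-folder layout below are our assumptions; the actual script may differ):

    import os, re, subprocess

    model, gpu = "init", "0"
    for fname in sorted(os.listdir("snapshots")):
        # assumes MXNet-style checkpoint names such as "init-0012.params"
        m = re.match(r"%s-(\d+)\.params$" % re.escape(model), fname)
        if m is None:
            continue
        epoch = int(m.group(1))
        out_dir = os.path.join("outputs", "%s-%d" % (model, epoch))  # assumed layout
        if not os.path.isdir(out_dir):
            subprocess.check_call(["python", "eval_seg_model.py",
                                   "--model", model, "--gpu", gpu,
                                   "--epoch", str(epoch)])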

To further improve the score, finetune a snapshot (for example, epoch X) with a smaller learning rate:

python train_seg_model.py --model init --gpus 0,1,2,3 --epoch X --lr 16e-5

Training the Final Model

Check the evaluation log in "log/eval_model.log" and find the best snapshot for the mask generator (or download a trained one HERE). For example, if the best epoch is "X", run:

python est_voc_train_masks.py --gpu 0 --epoch X
python train_seg_model.py --model final --gpus 0,1,2,3
python eval_loop.py --model final --gpu 0

The above commands estimate the masks for the VOC training images, train the final model, and evaluate its snapshots.

Evaluation

The snapshots are saved in the "snapshots" folder. To evaluate a snapshot (for example, epoch X), run:

python eval_seg_model.py --model final --gpu 0 --epoch X

There are other flags:

--ms                use multi-scale inference (a schematic sketch follows this list)
--savemask          save the output masks
--crf               apply CRF post-processing
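
For reference, multi-scale inference typically averages the class probabilities predicted on several rescaled copies of the input. A schematic sketch of that idea (the predict_probs helper and the scale set are hypothetical, not the repo's actual implementation of --ms):

    import numpy as np
    import cv2

    def multi_scale_predict(image, predict_probs, scales=(0.75, 1.0, 1.25)):
        """Average class probabilities over rescaled copies of the input."""
        h, w = image.shape[:2]
        acc = None
        for s in scales:
            resized = cv2.resize(image, (int(w * s), int(h * s)))
            probs = predict_probs(resized)  # (h', w', C) class probabilities
            # resize each channel back to the original resolution
            probs = np.stack([cv2.resize(probs[:, :, c], (w, h))
                              for c in range(probs.shape[2])], axis=2)
            acc = probs if acc is None else acc + probs
        return np.argmax(acc / len(scales), axis=2)  # per-pixel class labels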

There is a trained model that can be downloaded HERE.

Download the model, put it in the "snapshots" folder, and run:

python eval_seg_model.py --model final --gpu 0 --epoch 23 --crf --ms

This gives an IoU of 56.4, as reported in the paper.

Demo

Demo code is provided in the "demo" folder. Download the final model (HERE), put it in the "snapshots" folder, and open "Demo.ipynb" in Jupyter.
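
For a rough idea of what the demo does, loading an MXNet checkpoint and running a single forward pass looks roughly like the sketch below. The checkpoint prefix "snapshots/final", the input name "data", the input size, and the preprocessing are our assumptions; "Demo.ipynb" is authoritative:

    import mxnet as mx
    import numpy as np

    # load the trained checkpoint (assumed prefix and epoch)
    sym, arg_params, aux_params = mx.model.load_checkpoint("snapshots/final", 23)
    mod = mx.mod.Module(symbol=sym, context=mx.gpu(0),
                        data_names=["data"], label_names=None)

    # img: preprocessed image as a (1, 3, H, W) float array; the exact
    # preprocessing (resizing, mean subtraction) must match the demo's
    img = np.zeros((1, 3, 320, 320), dtype=np.float32)
    mod.bind(for_training=False, data_shapes=[("data", img.shape)])
    mod.set_params(arg_params, aux_params, allow_missing=True)
    mod.forward(mx.io.DataBatch(data=[mx.nd.array(img)]), is_train=False)
    scores = mod.get_outputs()[0].asnumpy()  # (1, C, h, w) class scores
    mask = scores[0].argmax(axis=0)          # per-pixel predicted labels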

Examples

There are some example results here:

(figure: example segmentation results)

License

MIT License

