vasgaowei / D-MIL.pytorch

Codes for: D-MIL: Discrepant multiple instance learning for weakly supervised object detection

Home Page: https://www.sciencedirect.com/science/article/abs/pii/S0031320321004143


D-MIL: Discrepant Multiple Instance Learning for Weakly Supervised Object Detection

This is the official PyTorch implementation of our paper Discrepant Multiple Instance Learning for Weakly Supervised Object Detection, accepted by Pattern Recognition.

This implementation is based on jwyang's pytorch-faster-rcnn and ppengtang's pcl.pytorch.

Please go to the other branches if you want to train D-MIL on the COCO dataset or to use ResNet as the backbone. For retraining on the PASCAL VOC 2007 and 2012 datasets with Fast R-CNN, also see the corresponding branches.

With a VGG16 backbone, the trained model achieves a detection mAP of 53.5 on PASCAL VOC 2007 and 49.6 on PASCAL VOC 2012.

Performances

1). On PASCAL VOC 2007 dataset

| model  | #GPUs | batch size | lr   | lr_decay | max_epoch | time/epoch | mAP  | CorLoc |
|--------|-------|------------|------|----------|-----------|------------|------|--------|
| VGG-16 | 1     | 2          | 5e-4 | 10       | 18        | 2 hr       | 53.5 | 68.7   |

2). On PASCAL VOC 2012 dataset

| model  | #GPUs | batch size | lr   | lr_decay | max_epoch | time/epoch | mAP  | CorLoc |
|--------|-------|------------|------|----------|-----------|------------|------|--------|
| VGG-16 | 1     | 2          | 5e-4 | 10       | 18        | -          | 49.6 | 70.1   |

Prerequisites

  • Nvidia Tesla V100 GPU
  • Ubuntu 16.04 LTS
  • Python 3.6
  • PyTorch 1.0 – 1.4
  • TensorFlow, TensorBoard and tensorboardX for visualizing training and validation curves

Installation

  1. Clone the repository
git clone https://github.com/vasgaowei/D-MIL.pytorch.git
  2. Compile the modules (nms, roi_pooling, roi_ring_pooling and roi_align)
cd D-MIL.pytorch/lib
bash make_cuda.sh

Setup the data

  1. Download the training, validation, test data and the VOCdevkit
cd D-MIL.pytorch/
mkdir data
cd data/
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
  3. Create symlinks for the PASCAL VOC dataset, or simply rename VOCdevkit to VOCdevkit2007
cd D-MIL.pytorch/data
ln -s VOCdevkit VOCdevkit2007
  4. It should have this basic structure
$VOCdevkit2007/                     # development kit
$VOCdevkit2007/VOC2007/             # image sets, annotations, etc.
$VOCdevkit2007/VOCcode/             # VOC utility code
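The symlink route can be tried in a scratch directory first. This is a dry run with empty folders only, no VOC data involved; `voc_layout_demo` is just a throwaway name:

```shell
# Dry run: recreate the expected layout with empty directories only.
mkdir -p voc_layout_demo/data/VOCdevkit/VOC2007 \
         voc_layout_demo/data/VOCdevkit/VOCcode
# Relative symlink, as in step 3 -- VOCdevkit2007 points at VOCdevkit.
ln -sfn VOCdevkit voc_layout_demo/data/VOCdevkit2007
ls voc_layout_demo/data/VOCdevkit2007/   # lists VOC2007 and VOCcode
```

A relative symlink keeps working even if the data directory is moved wholesale, which is why it is preferred here over renaming.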

For PASCAL VOC 2010 and PASCAL VOC 2012, just follow the same steps with the corresponding tars.

Download the pre-trained ImageNet models

VGG16: download from Dropbox or VT Server, put it in data/pretrained_model, and rename it vgg16_caffe.pth. The folder should have the following form.

$ data/pretrained_model/vgg16_caffe.pth

Download the Selective Search proposals for PASCAL VOC 2007

Download it from https://dl.dropboxusercontent.com/s/orrt7o6bp6ae0tc/selective_search_data.tgz and extract it; the final folder should have the following form

$ data/selective_search_data/voc_2007_test.mat
$ data/selective_search_data/voc_2007_trainval.mat
$ data/selective_search_data/voc_2012_test.mat
$ data/selective_search_data/voc_2012_trainval.mat
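Before training, it can save time to confirm that every file the scripts expect is in place. A minimal sanity-check sketch; the `ImageSets/Main/trainval.txt` path is the standard VOC devkit layout, so adjust the list if your setup differs:

```shell
# Print every expected data file that is still missing; return 0 when all are present.
check_data() {
  status=0
  for f in data/pretrained_model/vgg16_caffe.pth \
           data/selective_search_data/voc_2007_trainval.mat \
           data/selective_search_data/voc_2007_test.mat \
           data/VOCdevkit2007/VOC2007/ImageSets/Main/trainval.txt; do
    [ -e "$f" ] || { echo "missing: $f"; status=1; }
  done
  return $status
}
check_data || echo "fix the data layout before training"
```

Running this once from the repository root catches a misnamed checkpoint or a missing symlink before a multi-hour training run fails midway.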

Train your own model

For the VGG16 backbone, train and evaluate the model with the following command

bash both_2007.sh $prefix $GPU_ID

To evaluate detection mAP, use the following command

bash test_test_2007.sh $prefix

To evaluate CorLoc, use the following command

bash test_corloc_2007.sh $prefix

Retrain using Fast R-CNN

First, run the following command to generate the pseudo ground-truths

bash retrain_VOC.sh $prefix

This produces pseudo ground-truth annotations for retraining Fast R-CNN, located in the following folder:

$VOCdevkit2007/VOC2007/retrain_annotation_score_top1             # pseudo ground-truth annotations

For retraining Fast R-CNN on PASCAL VOC 2012, change lines 8, 9, 18 and 19 in retrain_VOC.sh to switch the dataset from VOC 2007 to VOC 2012. The code for retraining Fast R-CNN is in the branches fast-rcnn-retrain-07 and fast-rcnn-retrain-12; please go to the corresponding branch for the relevant configurations.
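The year swap can be scripted with `sed`. Below is a sketch on a stand-in file, since the exact contents of lines 8, 9, 18 and 19 depend on your checkout; the `voc_2007_trainval`-style variable names are assumptions, so adapt the pattern to what the real script contains:

```shell
# Stand-in for the dataset lines in retrain_VOC.sh (names are illustrative).
printf 'TRAIN_SET=voc_2007_trainval\nTEST_SET=voc_2007_test\n' > retrain_demo.sh
# Swap every VOC 2007 dataset name for its 2012 counterpart, keeping a backup.
sed -i.bak 's/voc_2007/voc_2012/g' retrain_demo.sh
cat retrain_demo.sh   # TRAIN_SET=voc_2012_trainval, TEST_SET=voc_2012_test
```

Keeping the `.bak` backup makes it easy to diff the change before committing to a retraining run.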

Training and testing on COCO 2014 dataset

The code for training and testing on the COCO dataset is in branch D-MIL-COCO. Please go to the corresponding branch for the relevant settings.

Training on ResNet

As mentioned in the DRN paper, it is not trivial to train a WSOD model on a non-plain backbone (e.g., ResNet, DenseNet). To evaluate the effectiveness of D-MIL on ResNet, we implement our model based on DRN. Check the corresponding branch D-MIL-ResNet for more details.

Citation

If you find this repository useful and use this code in a paper, please cite:

@article{gao2021discrepant,
  title={Discrepant Multiple Instance Learning for Weakly Supervised Object Detection},
  author={Gao, Wei and Wan, Fang and Yue, Jun and Xu, Songcen and Ye, Qixiang},
  journal={Pattern Recognition},
  pages={108233},
  year={2021},
  publisher={Elsevier}
}

About

License: MIT License
