This repository contains the code (in Caffe) for the paper:
Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection
Yongcheng Liu, Lu Sheng, Jing Shao*, Junjie Yan, Shiming Xiang and Chunhong Pan
ACM Multimedia 2018
Project Page: https://yochengliu.github.io/MLIC-KD-WSD/
- We use WSDDN [1] as the detection model, i.e., the teacher model.
- Because the released code of WSDDN is implemented in Matlab (based on MatConvNet), we first reproduced it in Caffe.
[1]. Hakan Bilen, Andrea Vedaldi, "Weakly Supervised Deep Detection Networks". In: IEEE Computer Vision and Pattern Recognition, 2016.
image_path one_hot_label_vector(e.g., 0 1 1 ...) proposal_info(e.g., x_min y_min x_max y_max score x_min y_min x_max y_max score ...)
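As a rough illustration of the datalist format above, the following Python sketch splits one line into its three parts. It is not part of the released code: the function name `parse_datalist_line`, the `Proposal` tuple, and the `num_classes` argument are hypothetical, and it assumes that fields are whitespace-separated, that the image path contains no spaces, and that the caller knows how many label entries each line carries.

```python
from typing import List, NamedTuple, Tuple


class Proposal(NamedTuple):
    """One region proposal: box corners plus a proposal score (hypothetical helper)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float


def parse_datalist_line(
    line: str, num_classes: int
) -> Tuple[str, List[int], List[Proposal]]:
    """Split a datalist line into (image_path, label_vector, proposals).

    Assumes whitespace-separated fields and a known number of classes.
    """
    fields = line.split()
    if len(fields) < 1 + num_classes:
        raise ValueError("line is too short for the given number of classes")

    image_path = fields[0]
    # The label vector is a 0/1 entry per class (several entries may be 1).
    labels = [int(v) for v in fields[1:1 + num_classes]]

    # Everything after the labels is proposal info in groups of five values.
    rest = [float(v) for v in fields[1 + num_classes:]]
    if len(rest) % 5 != 0:
        raise ValueError("proposal info must come in groups of 5 values")
    proposals = [Proposal(*rest[i:i + 5]) for i in range(0, len(rest), 5)]
    return image_path, labels, proposals


# Usage with a made-up line (3 classes, 2 proposals).
example = "img/0001.jpg 0 1 1 10 20 110 220 0.9 5 5 50 60 0.4"
path, labels, proposals = parse_datalist_line(example, num_classes=3)
print(path, labels, len(proposals))
```

A check like this can be run over a whole datalist before training to catch lines with a wrong number of fields.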
./wsddn/wsddn_train(deploy).prototxt
- VGG16 is used as the backbone model.
- For training, we did not use the spatial regularizer. See the paper for more details.
- For testing, you can use Pycaffe or Matcaffe.
- The MLIC model in our framework, i.e., the student model, is kept very compact for efficiency.
- It consists of a popular CNN model (VGG16, as the backbone model) followed by a fully connected layer (as the classifier).
- The backbone model of the student can differ from the teacher's.
./kd/train_stage1.prototxt
./kd/train_stage2.prototxt
Datalist preparation is the same as described for WSD. See our paper for more details.
Please follow the installation instructions of Caffe.
./caffe
include
...
src
caffe
utils
interp.cpp/cu // bilinear interpolation
cross_entropy_loss_layer.cpp // cross entropy loss for WSDDN
data_transformer.cpp // data augmentation
human_att_data_layer.cpp // data layer
interp_layer.cpp // bilinear interpolation
roi_pooling_layer.cpp/cu // add score
wsd_roigen_layer.cpp // prepare rois for roi pooling
wsd_roigen_single_scale_layer.cpp // convert rois' coordinates according to the given scale
proto
caffe.proto // add some LayerParameters
Note: You should add the above code to Caffe and compile it successfully.
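To give a feel for what `wsd_roigen_single_scale_layer.cpp` is described as doing (converting rois' coordinates according to the given scale), here is a minimal Python sketch. It is only an illustration under stated assumptions, not the layer's actual implementation: the function name `scale_rois` is hypothetical, the scale is assumed to be the ratio between the resized and the original image, and clipping to the image bounds is an added assumption.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def scale_rois(rois: List[Box], scale: float, width: int, height: int) -> List[Box]:
    """Multiply roi coordinates by `scale` and clip them to a width x height image.

    Hypothetical helper; the real layer may round or offset coordinates differently.
    """
    max_x = float(width - 1)
    max_y = float(height - 1)
    scaled = []
    for x_min, y_min, x_max, y_max in rois:
        scaled.append((
            min(max(x_min * scale, 0.0), max_x),
            min(max(y_min * scale, 0.0), max_y),
            min(max(x_max * scale, 0.0), max_x),
            min(max(y_max * scale, 0.0), max_y),
        ))
    return scaled


# Usage: a box from the original image mapped onto an image resized by a factor of 2.
print(scale_rois([(10, 20, 110, 220)], scale=2.0, width=200, height=400))
```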
The code has been tested successfully on Ubuntu 14.04 with CUDA 8.0.
If our paper [arXiv] is helpful for your research, please consider citing:
@inproceedings{liu2018mlickdwsd,
author = {Yongcheng Liu and
Lu Sheng and
Jing Shao and
Junjie Yan and
Shiming Xiang and
Chunhong Pan},
title = {Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection},
booktitle = {ACM International Conference on Multimedia},
pages = {1--9},
year = {2018}
}
If you have any ideas or questions about our research, please contact yongcheng.liu@nlpr.ia.ac.cn