Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection

This repository contains the code (in Caffe) for the paper:

Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection
Yongcheng Liu, Lu Sheng, Jing Shao*, Junjie Yan, Shiming Xiang and Chunhong Pan
ACM Multimedia 2018

Project Page: https://yochengliu.github.io/MLIC-KD-WSD/

Weakly Supervised Detection (WSD)

We use WSDDN [1] as the detection model, i.e., the teacher model.
Because the released WSDDN code is implemented in Matlab (based on MatConvNet), we first reimplemented it in Caffe.

[1]. Hakan Bilen, Andrea Vedaldi, "Weakly Supervised Deep Detection Networks". In: IEEE Computer Vision and Pattern Recognition, 2016.
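
For orientation, the two-stream scoring described in the WSDDN paper [1] can be sketched in plain Python. This is a minimal illustration, not the Caffe implementation in this repository: the function names `softmax` and `wsddn_scores` and the list-of-lists layout are assumptions made for the example. The classification stream applies a softmax over classes for each region, the detection stream applies a softmax over regions for each class, and the image-level score is the sum over regions of their element-wise product.

```python
import math
from typing import List, Tuple


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a flat list."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def wsddn_scores(
    cls_logits: List[List[float]],
    det_logits: List[List[float]],
) -> Tuple[List[List[float]], List[float]]:
    """Combine the two streams (hypothetical helper, not part of this repo).

    Both inputs have shape [num_regions][num_classes].
    Returns (region_scores, image_scores).
    """
    num_regions = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification stream: softmax over classes, one row per region.
    cls_prob = [softmax(row) for row in cls_logits]
    # Detection stream: softmax over regions, one column per class.
    det_prob = [
        softmax([det_logits[r][c] for r in range(num_regions)])
        for c in range(num_classes)
    ]
    # Region-level score: element-wise product of the two streams.
    region_scores = [
        [cls_prob[r][c] * det_prob[c][r] for c in range(num_classes)]
        for r in range(num_regions)
    ]
    # Image-level score: sum of region scores per class (lies in [0, 1]).
    image_scores = [
        sum(region_scores[r][c] for r in range(num_regions))
        for c in range(num_classes)
    ]
    return region_scores, image_scores


# Two regions, two classes, all-zero logits: every stream is uniform.
region_scores, image_scores = wsddn_scores([[0.0, 0.0], [0.0, 0.0]],
                                           [[0.0, 0.0], [0.0, 0.0]])
```

The image-level scores are what a multi-label loss is applied to during weakly supervised training, since only image-level labels are available.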

Datalist Preparation

Each line of the datalist describes one image, with space-separated fields:

    image_path one_hot_label_vector(e.g., 0 1 1 ...) proposal_info(e.g., x_min y_min x_max y_max score x_min y_min x_max y_max score ...)
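
As a minimal sketch of the line format above, the helpers below write and read one datalist entry. The function names, the example image path, and the three-class label vector are hypothetical, and the sketch assumes that fields are separated by single spaces and that the image path contains no spaces. The number of classes has to be known when parsing, because the line itself does not mark where the label vector ends.

```python
from typing import List, Tuple

# One proposal: x_min, y_min, x_max, y_max, score
Box = Tuple[float, float, float, float, float]


def format_datalist_line(image_path: str, labels: List[int], proposals: List[Box]) -> str:
    """Serialize one sample: path, one-hot labels, then five numbers per proposal."""
    fields = [image_path]
    fields += [str(int(v)) for v in labels]
    for box in proposals:
        fields += [f"{v:g}" for v in box]
    return " ".join(fields)


def parse_datalist_line(line: str, num_classes: int) -> Tuple[str, List[int], List[Box]]:
    """Inverse of format_datalist_line; num_classes must be supplied by the caller."""
    tokens = line.split()
    image_path = tokens[0]
    labels = [int(t) for t in tokens[1:1 + num_classes]]
    rest = [float(t) for t in tokens[1 + num_classes:]]
    if len(rest) % 5 != 0:
        raise ValueError("proposal fields must come in groups of five")
    proposals = [tuple(rest[i:i + 5]) for i in range(0, len(rest), 5)]
    return image_path, labels, proposals


line = format_datalist_line(
    "images/000001.jpg",  # hypothetical path
    [0, 1, 1],            # three classes, for brevity
    [(10, 20, 110, 220, 0.9), (5, 5, 50, 60, 0.4)],
)
print(line)
```

A check such as the group-of-five test in `parse_datalist_line` is a cheap way to catch a wrong `num_classes` or a truncated line before training starts.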

Training & Test

    ./wsddn/wsddn_train(deploy).prototxt

VGG16 is used as the backbone model.
For training, we did not use the spatial regularizer. See the paper for more details.
For testing, you can use Pycaffe or Matcaffe.

Multi-Label Image Classification (MLIC)

The MLIC model in our framework, i.e., the student model, is very compact for efficiency.
It consists of a popular CNN model (VGG16, as the backbone) followed by a fully connected layer (as the classifier).
The backbone model of the student could be different from the teacher's.
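
The classifier head of such a student can be sketched as follows. This is an illustration only: the feature vector stands in for the backbone output, the weights are made-up numbers, and the per-class sigmoid is an assumption (a common choice for multi-label outputs) rather than something read from the prototxt files.

```python
import math
from typing import List


def fc_classifier(features: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """One fully connected layer: logits[c] = dot(weights[c], features) + bias[c]."""
    return [sum(w * f for w, f in zip(row, features)) + b for row, b in zip(weights, bias)]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


features = [0.5, -1.0, 2.0]                    # stand-in for the backbone output
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]   # 2 classes x 3 features (made up)
bias = [0.0, -1.0]

logits = fc_classifier(features, weights, bias)
probs = [sigmoid(z) for z in logits]           # independent per-class probabilities
```

Because each class gets its own sigmoid, several labels can be active for the same image, which is what distinguishes the multi-label setting from single-label softmax classification.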

Cross-Task Knowledge Distillation

Stage 1: Feature-Level Knowledge Transfer

    ./kd/train_stage1.prototxt

Stage 2: Prediction-Level Knowledge Transfer

    ./kd/train_stage2.prototxt

Datalist preparation is the same as described in the WSD section. See our paper for more details.
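
To make the two stages more concrete, here is a generic sketch of the kind of objective each stage suggests. It is not the loss defined in the paper or in the prototxt files: the mean squared error for the feature-level stage, the binary cross-entropy terms for the prediction-level stage, the weight `lam`, and all function names are assumptions chosen for illustration.

```python
import math
from typing import List


def feature_l2_loss(student_feat: List[float], teacher_feat: List[float]) -> float:
    """Stage 1 (sketch): mean squared error between student and teacher features."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(student_feat)


def binary_cross_entropy(pred: List[float], target: List[float], eps: float = 1e-7) -> float:
    """Mean binary cross-entropy; targets may be hard (0/1) or soft (in [0, 1])."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def prediction_level_loss(
    student_prob: List[float],
    labels: List[float],
    teacher_prob: List[float],
    lam: float = 1.0,  # hypothetical weight of the distillation term
) -> float:
    """Stage 2 (sketch): ground-truth term plus a term that pulls the student
    towards the teacher's image-level soft predictions."""
    hard_term = binary_cross_entropy(student_prob, labels)
    soft_term = binary_cross_entropy(student_prob, teacher_prob)
    return hard_term + lam * soft_term


stage1 = feature_l2_loss([0.0, 0.0], [1.0, 1.0])
stage2 = prediction_level_loss([0.5, 0.5], [1.0, 0.0], [0.5, 0.5], lam=1.0)
```

In this sketch the stages are trained one after the other, matching the two separate prototxt files: the student first aligns its features with the teacher, then its predictions.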

Caffe

Installation

Please follow the official Caffe installation instructions.

Implementation

    ./caffe
        include
            ...
        src
            caffe
                utils
                    interp.cpp/cu                   // bilinear interpolation
                cross_entropy_loss_layer.cpp        // cross entropy loss for WSDDN
                data_transformer.cpp                // data augmentation
                human_att_data_layer.cpp            // data layer
                interp_layer.cpp                    // bilinear interpolation
                roi_pooling_layer.cpp/cu            // add score
                wsd_roigen_layer.cpp                // prepare rois for roi pooling
                wsd_roigen_single_scale_layer.cpp   // convert rois' coordinates according to the given scale
            proto
                caffe.proto                         // add some LayerParameters

Note: You should add the above code to your Caffe source tree and make sure it compiles successfully.

The code has been tested successfully on Ubuntu 14.04 with CUDA 8.0.

Citation

If our paper [arXiv] is helpful for your research, please consider citing:

    @inproceedings{liu2018mlickdwsd,   
      author = {Yongcheng Liu and    
                Lu Sheng and    
                Jing Shao and   
                Junjie Yan and   
                Shiming Xiang and   
                Chunhong Pan},   
      title = {Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection},   
      booktitle = {ACM International Conference on Multimedia},    
      pages = {1--9},  
      year = {2018}   
    }

Contact

If you have any ideas or questions about our research, please contact yongcheng.liu@nlpr.ia.ac.cn
