ms-coco multi-label-classification pascal-voc

MCAR.pytorch

This repository is a PyTorch implementation of Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition. The paper was accepted by IEEE Trans. Image Processing (TIP 2021). This repo was created by Bin-Bin Gao.

MCAR Framework

Requirements

Please install the following packages:

numpy
torch-0.4.1
torchnet
torchvision-0.2.0
tqdm

Options

topN: number of local regions
threshold: threshold of localization
ps: global pooling style, e.g., 'avg', 'max', 'gwp'
lr: learning rate
lrp: factor for learning rate of pretrained layers. The learning rate of the pretrained layers is lr * lrp
batch-size: number of images per batch
image-size: size of the image
epochs: number of training epochs
evaluate: evaluate model on validation set
resume: path to checkpoint
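
The options above can be pictured as a command-line interface. The following is a minimal sketch, not the repository's actual parser: the flag spellings, types, and every default value are assumptions made for illustration, and `build_parser` and `layer_learning_rates` are hypothetical helper names. It also shows how `lr` and `lrp` combine, assuming that layers which are not pretrained train at the plain `lr`.

```python
import argparse


def build_parser():
    """Build a parser mirroring the listed options (all defaults are placeholders)."""
    p = argparse.ArgumentParser(description="MCAR options (illustrative sketch)")
    p.add_argument("--topN", type=int, default=4, help="number of local regions")
    p.add_argument("--threshold", type=float, default=0.5, help="threshold of localization")
    p.add_argument("--ps", choices=["avg", "max", "gwp"], default="avg",
                   help="global pooling style")
    p.add_argument("--lr", type=float, default=0.01, help="learning rate")
    p.add_argument("--lrp", type=float, default=0.1,
                   help="factor for learning rate of pretrained layers")
    p.add_argument("--batch-size", type=int, default=16, help="number of images per batch")
    p.add_argument("--image-size", type=int, default=448, help="size of the image")
    p.add_argument("--epochs", type=int, default=60, help="number of training epochs")
    p.add_argument("--evaluate", action="store_true", help="evaluate model on validation set")
    p.add_argument("--resume", type=str, default="", help="path to checkpoint")
    return p


def layer_learning_rates(lr, lrp):
    """Return the effective learning rate per layer group.

    Pretrained layers use lr * lrp, as the option list states; treating all
    other layers as using lr unchanged is an assumption of this sketch.
    """
    return {"pretrained": lr * lrp, "new": lr}


if __name__ == "__main__":
    # Parse an explicit argument list so the example runs without a real command line.
    args = build_parser().parse_args(["--lr", "0.01", "--lrp", "0.1", "--image-size", "256"])
    print(layer_learning_rates(args.lr, args.lrp))
```

A small `lrp` keeps the pretrained backbone close to its initial weights while the remaining layers adapt at the full rate. Check the training script and run.sh for the values actually used.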

MCAR Training and Evaluation

bash run.sh

Model	Input-Size	VOC-2007	VOC-2012	COCO-2014
MobileNet-v2	256 x 256	88.1 model	-	69.8 model
ResNet-50	256 x 256	92.3 model	-	78.0 model
ResNet-101	256 x 256	93.0 model	-	79.4 model
MobileNet-v2	448 x 448	91.3 model	91.0	75.0 model
ResNet-50	448 x 448	94.1 model	93.5	82.1 model
ResNet-101	448 x 448	94.8 model	94.3	83.8 model

MCAR Demo

bash run_demo.sh

Citing this repository

If you find this code useful in your research, please consider citing us:

@ARTICLE{MCAR_TIP_2021,
         author = {Bin-Bin Gao and Hong-Yu Zhou},
         title = {{Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition}},
         journal = {IEEE Transactions on Image Processing (TIP)},
         year={2021},
         volume={30},
         pages={5920-5932},
}

Reference

This project is based on the following implementations:

Tips

If you have any questions about our work, please do not hesitate to contact us by email.
