gaobb / MCAR

[TIP] Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition

MCAR.pytorch

This repository is a PyTorch implementation of Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition. The paper has been accepted at IEEE Transactions on Image Processing (TIP 2021). This repo was created by Bin-Bin Gao.

MCAR Framework

Requirements

Please install the following packages:

  • numpy
  • torch-0.4.1
  • torchnet
  • torchvision-0.2.0
  • tqdm
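
Before training, it can help to confirm that the packages above are importable. The snippet below is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of this repository, and it only checks that each module can be found, not that the pinned versions (torch 0.4.1, torchvision 0.2.0) are installed.

```python
import importlib.util

# Import names of the packages listed above.
REQUIRED = ["numpy", "torch", "torchnet", "torchvision", "tqdm"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report which required packages still need to be installed (may be empty).
print("missing packages:", missing_modules(REQUIRED))
```

An empty list means every package was found; version mismatches still have to be checked by hand.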

Options

  • topN: number of local regions
  • threshold: threshold of localization
  • ps: global pooling style, e.g., 'avg', 'max', 'gwp'
  • lr: learning rate
  • lrp: factor for learning rate of pretrained layers. The learning rate of the pretrained layers is lr * lrp
  • batch-size: number of images per batch
  • image-size: size of the image
  • epochs: number of training epochs
  • evaluate: evaluate model on validation set
  • resume: path to checkpoint
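
As a rough illustration of how these options fit together, the sketch below parses them with `argparse` and derives the two learning rates implied by `lr` and `lrp`. The flag names follow the list above, but every default value, the `build_parser` and `param_group_lrs` helpers, and the assumption that non-pretrained layers train at plain `lr` are illustrative choices, not values or code taken from this repository.

```python
import argparse


def build_parser():
    # Flag names mirror the option list; all defaults are placeholders
    # chosen for illustration, not the values used in run.sh.
    p = argparse.ArgumentParser(description="MCAR options (illustrative sketch)")
    p.add_argument("--topN", type=int, default=4, help="number of local regions")
    p.add_argument("--threshold", type=float, default=0.5, help="threshold of localization")
    p.add_argument("--ps", choices=["avg", "max", "gwp"], default="avg",
                   help="global pooling style")
    p.add_argument("--lr", type=float, default=0.01, help="learning rate")
    p.add_argument("--lrp", type=float, default=0.1,
                   help="factor for learning rate of pretrained layers")
    p.add_argument("--batch-size", type=int, default=16, help="images per batch")
    p.add_argument("--image-size", type=int, default=448, help="size of the image")
    p.add_argument("--epochs", type=int, default=60, help="number of training epochs")
    p.add_argument("--evaluate", action="store_true",
                   help="evaluate model on validation set")
    p.add_argument("--resume", type=str, default="", help="path to checkpoint")
    return p


def param_group_lrs(lr, lrp):
    # Pretrained layers use lr * lrp, as stated in the option list.
    # Assumption: the remaining (newly added) layers use lr unchanged.
    return {"pretrained": lr * lrp, "new": lr}


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(["--lr", "0.01", "--lrp", "0.1", "--image-size", "256"])
rates = param_group_lrs(args.lr, args.lrp)
print(args.image_size, rates)
```

With `lr = 0.01` and `lrp = 0.1`, the pretrained layers are updated roughly ten times more slowly than the rest, which is the usual motivation for a separate factor when fine-tuning a pretrained backbone.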

MCAR Training and Evaluation

bash run.sh

Model         Input-Size  VOC-2007    VOC-2012  COCO-2014
MobileNet-v2  256 x 256   88.1 model  -         69.8 model
ResNet-50     256 x 256   92.3 model  -         78.0 model
ResNet-101    256 x 256   93.0 model  -         79.4 model
MobileNet-v2  448 x 448   91.3 model  91.0      75.0 model
ResNet-50     448 x 448   94.1 model  93.5      82.1 model
ResNet-101    448 x 448   94.8 model  94.3      83.8 model

MCAR Demo

bash run_demo.sh

Citing this repository

If you find this code useful in your research, please consider citing us:

@ARTICLE{MCAR_TIP_2021,
         author = {Bin-Bin Gao and Hong-Yu Zhou},
         title = {{Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition}},
         journal = {IEEE Transactions on Image Processing (TIP)},
         year = {2021},
         volume = {30},
         pages = {5920--5932},
}

Reference

This project is based on the following implementations:

Tips

If you have any questions about our work, please do not hesitate to contact us by email.
