kfengc27 / GLMC

[CVPR2023] Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[CVPR2023] Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions(GLMC)

by Fei Du, Peng Yang, Qi Jia, Fengtao Nan, Xiaoting Chen, Yun Yang

This is the official implementation of Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions

Update

Apologies for the oversight in our paper regarding the incorrect upload of the results for CIFAR-10. We have updated our GitHub repository and reported the final results for CIFAR-10-LT. Compared to the latest state-of-the-art work by BCL[1], our results are still 3% higher. We have also uploaded the latest paper on arXiv, and you can find it at the following link: Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions

The experimental setup was as follows:

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 4

CIFAR-10-LT

Method IF Model Top-1 Acc(%)
GLMC 100 ResNet-32 87.75%
GLMC 50 ResNet-32 90.18%
GLMC 10 ResNet-32 94.04%
GLMC + MaxNorm 100 ResNet-32 87.57%
GLMC + MaxNorm 50 ResNet-32 90.22%
GLMC + MaxNorm 10 ResNet-32 94.03%

[1] Jianggang Zhu, ZhengWang, Jingjing Chen, Yi-Ping Phoebe Chen, and Yu-Gang Jiang. Balanced contrastive learning for long-tailed visual recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6908–6917, 2022. 2, 3, 5, 6

💥Meanwhile, We supplemented the experiment on iNaturelist2018 and achieved the state-of-the-art.

Method Model Many Med Few All model
GLMC ResNeXt-50 64.60 73.16 73.01 72.21 Download

Overview

An overview of our GLMC: two types of mixed-label augmented images are processed by an encoder network and a projection head to obtain the representation $h_g$ and $h_l$. Then a prediction head transforms the two representations to output $u_g$ and $u_l$. We minimize their negative cosine similarity as an auxiliary loss in the supervised loss. $sg(*)$ denotes stop gradient operation.

image

We propose an efficient one-stage training strategy for long-tailed visual recognition called Global and Local Mixture Consistency cumulative learning (GLMC). Our core ideas are twofold: (1) a global and local mixture consistency loss improves the robustness of the feature extractor. Specifically, we generate two augmented batches by the global MixUp and local CutMix from the same batch data, respectively, and then use cosine similarity to minimize the difference. (2) A cumulative head-tail soft label reweighted loss mitigates the head class bias problem. We use empirical class frequencies to reweight the mixed label of the head-tail class for long-tailed data and then balance the conventional loss and the rebalanced loss with a coefficient accumulated by epochs.

Getting Started

Requirements

All codes are written by Python 3.9 with

  • PyTorch = 1.10.0

  • torchvision = 0.11.1

  • numpy = 1.22.0

Preparing Datasets

Download the datasets CIFAR-10, CIFAR-100, ImageNet, and iNaturalist18 to GLMC-2023/data. The directory should look like

GLMC-2023/data
├── CIFAR-100-python
├── CIFAR-10-batches-py
├── ImageNet
|   └── train
|   └── val
├── train_val2018
└── data_txt
    └── ImageNet_LT_val.txt
    └── ImageNet_LT_train.txt
    └── iNaturalist18_train.txt
    └── iNaturalist18_val.txt
    

Training

for CIFAR-10-LT

python main.py --dataset cifar10 -a resnet32 --num_classes 10 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3 --resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 4

for CIFAR-100-LT

python main.py --dataset cifar100 -a resnet32 --num_classes 100 --imbanlance_rate 0.01 --beta 0.5 --lr 0.01 --epochs 200 -b 64 --momentum 0.9 --weight_decay 5e-3
--resample_weighting 0.0 --label_weighting 1.2 --contrast_weight 10

for ImageNet-LT

python main.py --dataset ImageNet-LT -a resnext50_32x4d --num_classes 1000 --beta 0.5 --lr 0.1 --epochs 135 -b 120 --momentum 0.9 --weight_decay 2e-4 --resample_weighting 0.2 --label_weighting 1.0 --contrast_weight 10

for iNaturelist2018

python main.py --dataset iNaturelist2018 -a resnext50_32x4d --num_classes 8142 --beta 0.5 --lr 0.1 --epochs 120 -b 128 --momentum 0.9 --weight_decay 1e-4 --resample_weighting 0.2 --label_weighting 1.0 --contrast_weight 10

Testing

python test.py --dataset ImageNet-LT -a resnext50_32x4d --num_classes 1000 --resume model_path

Result and Pretrained models

CIFAR-10-LT

Method IF Model Top-1 Acc(%)
GLMC 100 ResNet-32 87.75%
GLMC 50 ResNet-32 90.18%
GLMC 10 ResNet-32 94.04%
GLMC + MaxNorm 100 ResNet-32 87.57%
GLMC + MaxNorm 50 ResNet-32 90.22%
GLMC + MaxNorm 10 ResNet-32 94.03%

CIFAR-100-LT

Method IF Model Top-1 Acc(%)
GLMC 100 ResNet-32 55.88
GLMC 50 ResNet-32 61.08
GLMC 10 ResNet-32 70.74
GLMC + MaxNorm 100 ResNet-32 57.11
GLMC + MaxNorm 50 ResNet-32 62.32
GLMC + MaxNorm 10 ResNet-32 72.33

ImageNet-LT

Method Model Many Med Few All model
GLMC ResNeXt-50 70.1 52.4 30.4 56.3 Download
GLMC + BS ResNeXt-50 64.76 55.67 42.19 57.21 Download

iNaturelist2018

Method Model Many Med Few All model
GLMC ResNeXt-50 64.60 73.16 73.01 72.21 Download

Citation

If you find this code useful for your research, please consider citing our paper

@inproceedings{
du2023global,
title={Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions},
author={Fei Du, Peng Yang, Qi Jia, Fengtao Nan, Xiaoting Chen, Yun Yang},
booktitle={Conference on Computer Vision and Pattern Recognition 2023},
year={2023},
url={https://openreview.net/forum?id=gLkKCqK3WOD}
}

About

[CVPR2023] Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions


Languages

Language:Python 100.0%