wuyuebupt / LargeScaleIncrementalLearning

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

LargeScaleIncrementalLearning

This is the implementation of CVPR 2019 paper "Large Scale Incremental Learning". If the paper and code helps you, we would appreciate your kindly citations of our paper.

@inproceedings{wu2019large,
  title={Large Scale Incremental Learning},
  author={Wu, Yue and Chen, Yinpeng and Wang, Lijuan and Ye, Yuancheng and Liu, Zicheng and Guo, Yandong and Fu, Yun},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={374--382},
  year={2019}
}

Abstract

In this paper, we proposed a new method to address the imbalance issue in incremental learning, which is critical when the number of classes becomes large. Firstly, we validated our hypothesis that the classifier layer (the last fully connected layer) has a strong bias towards the new classes, which has substantially more training data than the old classes. Secondly, we found that this bias can be effectively corrected by applying a linear model with a small validation set. Our method has excellent results on two large datasets with 1,000+ classes (ImageNet ILSVRC 2012 and MS-Celeb-1M), outperforming the state-of-the-art by a large margin (11.1% on ImageNet ILSVRC 2012 and 13.2% on MS-Celeb-1M).

Environment Setup

Words before the code: most codes and experiments are finished in late 2017 and earlier 2018. It is hard to retrieve exact the same environment for experiments, which I remember that the system was in Ubuntu 14. CUDA and tensorflow are all with earlier versions. I re-installed the system several times last year (2018) because some conficts in setting up environment for pytorch, which was original fit for caffe and tensorflow. And also, I upgraded the system form Ubuntu 14 to Ubuntu 16.

The resnet implementation is the official tensorflow official models at:

https://github.com/tensorflow/models

In the latest repo, the most similar implementaion is:

https://github.com/tensorflow/models/blob/master/official/r1/resnet/imagenet_main.py

Unfortunately, we were not able to run our code with the latest tensorflow-2.0 or tensorflow-1.14.

We understand that how important it is to reproduce the results of published papers. I find a compatible tensorflow-1.5 version that is able to run the code with my current CUDA settings. To not mess up with my current software enviroment for pytorch mostly, I use the virtualenv to set up a seperate environment for experiments. The dependency of the code is lite and should be able work in most cases if you get the tensorflow version set up correctly. I summarize my current environment for reference.

System Information
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial

CUDA version: 9.0
CUDNN version: 7.0.5
GPU: TITAN X (Pascal) 12GB

Python version: 3.5.2
Environemnt is setup using virtualenv.

virtualenv venv
pip install scipy
pip install tensorflow-gpu==1.5

To activate the environment:

source venv/bin/activate

To run the code

Dataset:
We mainly clean the code for ImageNet dataset and leave the other two datasets (CIFAR and MS-Celeb-1M) in future work.

Data preparation:
We have put essential files in the repo so that what you need to do is to download the ImageNet-1000 images. For the images, we used the ILSVRC2016_CLS-LOC.tar.gz file, which should be kept unchanged since 2012.
Some links to handle the 50K validation images to split validation images in class folders.

https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md#download-the-imagenet-dataset

After downloading and processing, the structure in the data forlder (dataImageNet100 and dataImageNet1000) should be like:

|--train
|  |--n01440764
|  |--n01443537
|  |.......
|--train.txt
|--val
|  |--n01440764
|  |--n01443537
|  |.......
|--val.txt

Experiments run command:

CUDA_VISIBLE_DEVICES=0 python imagenet_main.py 1>log 2>&1

Results

We run the code once and report the result here. Top-5 accuracy is reported on ImageNet dataset. We observe similar results on ImageNet-100 form this repo with new environment and what we reported in paper. On ImageNet-1000, we notice some performance drop in several first increments but the final results are even better than what we reported in our paper, which might be caused by different operation systems and software environments.

Training on ImageNet-100 takes around 15 hours. Results are:

10 20 30 40 50 60 70 80 90 100
In paper (%) 98.40 96.20 94.00 92.90 91.10 89.40 88.10 86.50 85.40 84.40
This Repo (%) 98.79 96.10 95.06 93.50 90.96 89.73 89.02 87.22 85.97 84.24
Before Bias Correction (%) - 95.70 93.06 90.75 86.60 86.06 83.60 81.25 78.28 76.32
\beta 1.0 0.5900 0.5140 0.4742 0.4839 0.4648 0.4389 0.4297 0.4335 0.3941
\gamma 0.0 -0.4416 -0.5401 -0.4701 -0.5323 -0.4672 -0.4830 -0.5349 -0.5804 -0.4609
Training Samples 12800 14417 14600 14600 14600 14441 14600 14620 14331 14566
Val Samples 200 400 300 240 250 240 210 160 180 200
Test Samples 500 1000 1500 2000 2500 3000 3500 4000 4500 5000

Training on ImageNet-100 takes around 100 hours. Results are:

100 200 300 400 500 600 700 800 900 1000
In paper (%) 94.10 92.50 89.60 89.10 85.70 83.20 80.20 77.50 75.00 73.20
This Repo (%) 93.72 91.46 88.70 86.63 84.64 83.08 81.37 79.82 78.22 76.76
Before Bias Correction (%) - 89.64 84.50 80.84 78.07 74.89 72.66 70.38 67.93 63.34
\beta 1.0 0.7873 0.7382 0.7053 0.6884 0.6704 0.6609 0.6515 0.6239 0.6334
\gamma 0.0 -0.5586 -0.5759 -0.5015 -0.5505 -0.5137 -0.4677 -0.4471 -0.4118 -0.4064
Training Samples 126856 144159 144505 144301 143776 145238 143391 144418 143277 143046
Val Samples 2000 4000 3000 2400 2500 2400 2100 1600 1800 2000
Test Samples 5000 10000 15000 20000 25000 30000 35000 40000 45000 50000

Results are from one run of the model on ImageNet-100 and ImageNet-1000. Log files are located at

./logs/log-ImageNet100
./logs/log-ImageNet1000

Implementation notes

Class Order:
To keep the same order with iCaRL (https://github.com/srebuffi/iCaRL), we use the same random seed (1993) from numpy to generate the order.

Distilling Loss:
We store the previous network for distilling loss.

Bias Correction:
After learning the Bias Correction parameters (\beta and \gamma), classifier after correction is used for the distilling loss in the next incremental training.

Validation Samples from exemplars:
10% selection is limited on exemplars (old classes). Samples from new classes will match the same number of validation samples.

Useful links

Awesome-Incremental-Learning: https://github.com/xialeiliu/Awesome-Incremental-Learning

Contact

If you found any issue of the code, please contact Yue Wu (@wuyuebupt, Email: yuewu@ece.neu.edu or wuyuebupt@gmail.com)

About

License:MIT License


Languages

Language:Python 100.0%