This is the official submission repository from team mmasana for the Continual Learning Challenge held in the 4th CLVision Workshop @ CVPR 2023.
The proposed strategy to tackle the challenge scenarios is Horde, which has been developed together as a team by Benedikt Tscheschner, Eduardo Veas and Marc Masana.
The main idea of Horde is to tackle a few different issues from the challenge. Here is how it works:
- we have some heuristics that decide at each experience if we want to learn a feature extractor for the current classes, or instead just try to learn them from the existing representations.
- when learning a new feature extractor, we train it with two heads that are later discarded. The first one is trained with a typical CE-loss, while the other is trained with a contrastive loss that encourages the class representations to lie inside an n-dimensional ball. We allow a maximum of 10 feature extractors, although none of the proposed scenarios need to reach that limit. The wrapper for the different feature extractors does not seem to be larger than 1Gb (the competition allows up to 4Gb). The CE-loss and the contrastive loss are balanced with an adaptive strategy (`alpha` argument), which promotes that both losses have a similar energy when backpropagated.
- regardless of whether a feature extractor is trained and added to the ensemble for the current experience, we always train the unified head that takes all the representations of each feature extractor and learns all seen classes.
- to balance that not all classes are seen at any given time, and since rehearsal is not allowed, we keep track of mean and std of each class for each feature extractor. Since all feature extractors are concatenated at their output, we consider each mean and std per class to be the representations that we store (hitting the competition maximum of 200, (mean+std)*100 classes).
- the method has some similarities to FeTrIL (WACV, 2023), but allows learning multiple feature extractors instead of using a pre-trained backbone (not allowed in this competition), and also adds the usage of the std and the contrastive loss to improve the learned shape of the representations, thus avoiding the issues with pseudo-feature generation. The proposed pseudo-feature generation strategy allows us to apply our strategy to the competition scenarios which contain class repetition.
- we have noticed that we can train each scenario in less than 150min on our machines (competition restrictions are at 500min). Our proposed strategy runs with `--num_epochs 20`, but we noticed that running more epochs was usually beneficial. Therefore, if the organization sees fit, we would suggest running our submission both with `20` and with more epochs, if the time restriction allows (e.g. `--num_epochs 50`).
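
To make the idea of storing a mean and std per class more concrete, here is a minimal sketch in plain Python. It is not the code from `strategies/horde.py`: the function names `class_statistics` and `sample_pseudo_features` are hypothetical, and drawing pseudo-features from an axis-aligned Gaussian is an assumption made for illustration, since the description above does not state the exact sampling rule.

```python
import random
from statistics import mean, pstdev


def class_statistics(features):
    """Per-dimension mean and std of one class (hypothetical helper).

    `features` is a list of equal-length feature vectors.
    """
    columns = list(zip(*features))
    mu = [mean(col) for col in columns]
    sigma = [pstdev(col) for col in columns]
    return mu, sigma


def sample_pseudo_features(mu, sigma, n_samples, rng):
    """Draw pseudo-features from N(mu, diag(sigma^2)).

    Assumption: an axis-aligned Gaussian; the actual strategy may differ.
    """
    return [[rng.gauss(m, s) for m, s in zip(mu, sigma)]
            for _ in range(n_samples)]


# Toy 2-D "features" for one class, standing in for the concatenated
# outputs of the feature extractors.
rng = random.Random(0)
real = [[rng.gauss(1.0, 0.5), rng.gauss(-2.0, 0.1)] for _ in range(500)]

# Only the two statistics vectors are kept, not the samples themselves.
mu, sigma = class_statistics(real)

# Later, classes absent from the current experience can be represented
# by pseudo-features when training the unified head.
pseudo = sample_pseudo_features(mu, sigma, n_samples=500, rng=rng)
```

Under this assumption, only `mu` and `sigma` need to be stored for each class, which matches the two stored representations per class described above.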
This repository implements the proposed Horde method by extending the provided official DevKit.
To make the evaluation of the solution easier, here is a brief description of the main changes:
- Model: we remove the original linear head, and add some functions that allow freezing different parts of the model accordingly (i.e. backbone, BN layers). The changes can be seen by checking the difference between the original `models/resnet_18.py` and our proposed `models/resnet_18_horde.py`.
- Data augmentation: we follow an established data augmentation technique for this dataset type (AutoAugment, CVPR 2019). The transformations used are defined in `strategies/data_augmentation.py`, and for simplicity, we replace the default transformations directly in `benchmarks/cir_benchmark.py`.
- Horde: our proposed strategy, which contains both the feature extractors' model and the two-phase strategy for training the feature extractors and the single unified head. The extended model, the training of the method, and the definition of the contrastive loss pairs and losses are implemented in `strategies/horde.py`.
- Logging: a modification of the logging and verbosity of the training process is implemented in `utils/facil_logger.py`. The main difference is just an adaptation of the reported metrics to our preferences.
- Train/Main: following the structure of the provided DevKit, the main file is `train.py`, which has just been modified to call the above-mentioned model, strategies and logger.
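
As a rough illustration of what freezing the backbone and the BN layers can look like in PyTorch, here is a minimal sketch. The helper names `freeze_backbone` and `freeze_bn` are hypothetical and are not taken from `models/resnet_18_horde.py`; the toy model only stands in for a real backbone.

```python
import torch.nn as nn

BN_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)


def freeze_backbone(model: nn.Module) -> None:
    """Stop gradient updates for every parameter (hypothetical helper)."""
    for param in model.parameters():
        param.requires_grad = False


def freeze_bn(model: nn.Module) -> None:
    """Freeze BN affine parameters and running statistics (hypothetical helper)."""
    for module in model.modules():
        if isinstance(module, BN_TYPES):
            # eval() stops the running mean/var from being updated.
            module.eval()
            for param in module.parameters():
                param.requires_grad = False


# Toy stand-in for a backbone: conv -> BN -> ReLU.
toy = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
freeze_bn(toy)
```

Note that calling `model.train()` afterwards switches the BN layers back to training mode, so in a sketch like this `freeze_bn` would need to be re-applied after each such call.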
We have tried to make the code mostly self-explanatory and properly commented. However, in case we have forgotten anything, please feel free to ask.