arthurdouillard / dytox

Dynamic Token Expansion with Continual Transformers, accepted at CVPR 2022


The avg accuracy on CIFAR100 50steps

jmin0530 opened this issue

Hello, thank you for your code.
I used the DyTox setting for 50 steps, but I got different results from those in your paper.

I ran the CLI command below:

bash 0,1 \
    --options options/data/cifar100_2-2.yaml options/data/cifar100_order1.yaml options/model/cifar_dytox.yaml \
    --name dytox \
    --data-path MY_PATH_TO_DATASET \
    --output-basedir PATH_TO_SAVE_CHECKPOINTS \
    --memory-size 1000
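
For reference, the step count implied by this command can be checked with a short sketch. It assumes that the file name cifar100_2-2.yaml means 2 initial classes followed by 2 new classes per step, which is an interpretation of the name and is not confirmed in this thread. The function name num_steps is hypothetical.

```python
def num_steps(total_classes: int, initial: int, increment: int) -> int:
    """Count the incremental steps, including the initial one.

    Assumption: the first step learns `initial` classes and every
    later step adds `increment` new classes.
    """
    remaining = total_classes - initial
    if remaining % increment != 0:
        raise ValueError("classes do not divide evenly into steps")
    return 1 + remaining // increment


# CIFAR-100 with 2 initial classes and 2 new classes per step.
print(num_steps(100, 2, 2))  # 50
```

Under that reading, the command matches the 50-step benchmark discussed in this issue.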

According to your paper, the result on CIFAR-100 with 50 steps is "Avg acc: 64.82, Last acc: 45.61".
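
As a reminder of what these two numbers usually mean in class-incremental learning, here is a minimal sketch. It assumes "Avg acc" is the mean of the accuracies measured after each step and "Last acc" is the accuracy after the final step; whether the DyTox logs compute them in exactly this way is not confirmed here. The function name summarize and the sample values are hypothetical and are not results from this thread.

```python
from typing import List, Tuple


def summarize(step_accuracies: List[float]) -> Tuple[float, float]:
    """Return (average accuracy over steps, accuracy after the last step)."""
    if not step_accuracies:
        raise ValueError("need at least one step accuracy")
    avg_acc = sum(step_accuracies) / len(step_accuracies)
    last_acc = step_accuracies[-1]
    return avg_acc, last_acc


# Illustrative placeholder values only, one accuracy per incremental step.
avg_acc, last_acc = summarize([90.0, 80.0, 70.0])
print(avg_acc, last_acc)  # 80.0 70.0
```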
Here are the reproduction results for the three CIFAR-100 orders:

Here is also the DyTox configuration I used:

# DyTox, for CIFAR100

# Model definition
model: convit
embed_dim: 384
depth: 6
num_heads: 12
patch_size: 4
input_size: 32
local_up_to_layer: 5
class_attention: true

# Training setting
no_amp: true
eval_every: 50

# Base hyperparameter
weight_decay: 0.000001
batch_size: 128
incremental_batch_size: 256
incremental_lr: 0.0005
rehearsal: icarl_all

# Knowledge Distillation
auto_kd: true

finetuning: balanced
finetuning_epochs: 20
ft_no_sampling: true

# Dytox model
dytox: true
freeze_task: [old_task_tokens, old_heads]
freeze_ft: [sab]

# Divergence head to get diversity
head_div: 0.1
head_div_mode: tr

# Independent Classifiers
ind_clf: 1-1
bce_loss: true

# Advanced Augmentations, here disabled

reprob: 0.0
remode: pixel
recount: 1
resplit: false

# MixUp & CutMix
mixup: 0.0
cutmix: 0.0

I can't understand why my reproduction results differ from the results you wrote in your paper.
Thank you.


You probably want to use global memory with a memory size of 2k.

If you use distributed memory with 1k, your effective memory size is rather low (much lower than 2k).
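
To make the gap between 1k and 2k concrete, here is a rough sketch of the per-class rehearsal budget. It assumes the memory is split evenly among the classes seen so far, a common convention in class-incremental learning that is not confirmed here as DyTox's exact allocation. It also does not model how a distributed memory is divided across GPUs. The function name exemplars_per_class is hypothetical.

```python
def exemplars_per_class(memory_size: int, seen_classes: int) -> int:
    """Rehearsal samples available per class under an even split (assumption)."""
    if seen_classes <= 0:
        raise ValueError("seen_classes must be positive")
    return memory_size // seen_classes


# Budget per class once all 100 CIFAR-100 classes have been seen.
for memory_size in (1000, 2000):
    print(memory_size, exemplars_per_class(memory_size, 100))
# 1000 10
# 2000 20
```

Under this assumption, a 1k memory leaves half as many rehearsal samples per class as a 2k memory by the final step.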