Low accuracy for sessions greater than 1
fransiskusyoga opened this issue
I tried to reproduce your work using the provided Docker image, training and evaluating on CIFAR. Because I only have 1 GPU, I edited the config to use samples_per_gpu=512 instead of 64, then ran this command:
bash tools/dist_train.sh configs/cifar/resnet12_etf_bs512_200e_cifar.py 1 --work-dir /opt/logger/cifar_etf --seed 0 --deterministic && bash tools/run_fscil.sh configs/cifar/resnet12_etf_bs512_200e_cifar_eval.py /opt/logger/cifar_etf /opt/logger/cifar_etf/best.pth 1 --seed 0 --deterministic
The result is as follows:
2023-04-14 21:32:58,114 - mmcls - INFO - loss1 5.088033676147461 ; loss2 5.087942123413086
2023-04-14 21:32:58,119 - mmcls - INFO - [198/200] Training session : 9 ; lr : 0.00025 ; loss : 5.035999774932861 ; acc@1 : 0.0
2023-04-14 21:32:58,119 - mmcls - INFO - loss1 5.036115646362305 ; loss2 5.035883903503418
2023-04-14 21:32:58,124 - mmcls - INFO - [199/200] Training session : 9 ; lr : 0.00025 ; loss : 5.110080718994141 ; acc@1 : 0.0
2023-04-14 21:32:58,127 - mmcls - INFO - [200/200] Training session : 9 ; lr : 0.00025 ; loss : 5.047082901000977 ; acc@1 : 12.5
2023-04-14 21:32:58,127 - mmcls - INFO - Evaluating session 9, from 0 to 100.
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 10000/10000, 17685.8 task/s, elapsed: 1s, ETA: 0s
2023-04-14 21:32:58,816 - mmcls - INFO - [09]Evaluation results : acc : 1.00 ; acc_base : 0.00 ; acc_inc : 2.50
2023-04-14 21:32:58,816 - mmcls - INFO - [09]Evaluation results : acc_incremental_old : 2.86 ; acc_incremental_new : 0.00
2023-04-14 21:32:58,888 - mmcls - INFO - 82.73 57.63 1.40 1.33 1.25 1.18 1.11 1.05 1.00
The evaluation accuracy after session 0 is very low.
Hi @fransiskusyoga ,
Thanks for your interest in our work.
Could you please provide the full log for me to check the details?
Best,
Haobo Yuan
Based on the current information, my guess is that you set the batch_size for
configs/cifar/resnet12_etf_bs512_200e_cifar.py
only, but did not set it for
configs/cifar/resnet12_etf_bs512_200e_cifar_eval.py
since the base session looks fine but the incremental sessions do not.
I edited neither of them. I just edited this:
configs/_base_/datasets/cifar_fscil.py
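Roughly, the only change was the per-GPU batch size; a minimal sketch of that edit (assuming the usual mmcls-style data dict, with unrelated keys left untouched):

# configs/_base_/datasets/cifar_fscil.py -- sketch of my edit, other keys omitted
data = dict(
    samples_per_gpu=512,  # was 64; 1 GPU x 512 matches (as I understand it) the original 8 GPUs x 64
    # workers_per_gpu, train, val, test, ... left as-is
)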
mylog.log
This is the evaluation log.
@fransiskusyoga I am afraid the log seems to be empty.
How about this one?
mylog.log
It seems the config is right, but the loss cannot decrease during incremental training. Considering that there will be subtle differences between using 1 GPU and multiple GPUs, you may need to adjust the incremental hyperparameters, or try again with 8 GPUs.
You may need to set the value at
Line 554 in f89a4ef
to 64 to match the incremental batch size if you insist on using 1 GPU for incremental training.
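The idea is just that the incremental sessions should keep their original batch size rather than inherit the enlarged 512; a purely illustrative sketch (the identifier at line 554 may be named differently):

# Illustrative only -- not the actual code at line 554.
# On 1 GPU, keep the incremental-session batch size at 64 instead of the
# enlarged 512 used for base-session training.
inc_samples_per_gpu = 64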
I am not sure whether there are any other bugs when using 1 GPU for incremental training, so please let me know if you have any other questions.
Hi @fransiskusyoga ,
I would like to close this issue for now; feel free to re-open it or raise a new one if you have any other questions.
Thanks again for your interest.
Regards,
Haobo Yuan