Weird results of icarl on Cifar-100
GengDavid opened this issue
Hi, @arthurdouillard
Thanks for your great work! I am trying to use your code to reproduce the iCaRL method, but the results do not match those reported in your paper.
I run the following script:
python3 -minclearn --options options/icarl/icarl_cifar100.yaml options/data/cifar100_3orders.yaml \
--initial-increment 50 --increment 1 --fixed-memory \
--device 0 --label icarl_cnn_cifar100_50steps \
--data-path data
I obtain 44.96, 42.55, and 27.76 with three seeds, giving avg = 38.43 +/- 9.32. Am I missing something needed to reproduce the results? Thanks.
I post the log here for reference.
2021-10-19:23:12:46 [train.py]: Eval on 0->100.
2021-10-19:23:12:49 [train.py]: icarl_cnn_cifar100_50steps
2021-10-19:23:12:49 [train.py]: Avg inc acc: 0.2775882352941177.
2021-10-19:23:12:49 [train.py]: Current acc: {'total': 0.186, '00-09': 0.202, '10-19': 0.184, '20-29': 0.222, '30-39': 0.178, '40-49': 0.219, '50-59': 0.139, '60-69': 0.21, '70-79': 0.162, '80-89': 0.163, '90-99': 0.18}.
2021-10-19:23:12:49 [train.py]: Avg inc acc top5: 0.5672156862745097.
2021-10-19:23:12:49 [train.py]: Current acc top5: {'total': 0.437}.
2021-10-19:23:12:49 [train.py]: Forgetting: 0.47154545454545455.
2021-10-19:23:12:49 [train.py]: Cord metric: 0.26.
2021-10-19:23:12:49 [train.py]: Old accuracy: 0.18, mean: 0.26.
2021-10-19:23:12:49 [train.py]: New accuracy: 0.51, mean: 0.66.
2021-10-19:23:12:49 [train.py]: Average Incremental Accuracy: 0.2775882352941177.
2021-10-19:23:12:49 [train.py]: Training finished in 4317s.
2021-10-19:23:12:49 [train.py]: Label was: icarl_cnn_cifar100_50steps
2021-10-19:23:12:49 [train.py]: Results done on 3 seeds: avg: 38.43 +/- 9.32, last: 28.0 +/- 8.14, forgetting: 41.62 +/- 5.17
2021-10-19:23:12:49 [train.py]: Individual results avg: [44.96, 42.55, 27.76]
2021-10-19:23:12:49 [train.py]: Individual results last: [32.9, 32.5, 18.6]
2021-10-19:23:12:49 [train.py]: Individual results forget: [40.79, 36.92, 47.15]
Hum, your third class order has weirdly low results, while the first two seem to match my paper's results.
I'm going to run it on my side.
Ok, that's very weird. If I run all three orders one after the other, as you did, I get your results.
But if I launch only the third order (just edit the cifar100_3orders.yaml
file to keep only the third order and third seed), I get avg: 45.77, last: 32.8, forgetting: 41.85,
so a "normal" result.
I don't really know the cause yet, but for now, try running each class order in a separate process. I hope that helps!
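To make the "one process per order" workaround concrete, here is a sketch of how it could be scripted. It assumes you split cifar100_3orders.yaml by hand into one file per order; the per-order file names (cifar100_order1.yaml, etc.) and the per-order labels are hypothetical, not files shipped with the repo.

```shell
# Hypothetical per-order yaml files, produced by manually splitting
# cifar100_3orders.yaml into one class order + seed per file.
for order in 1 2 3; do
    python3 -minclearn --options options/icarl/icarl_cifar100.yaml \
        "options/data/cifar100_order${order}.yaml" \
        --initial-increment 50 --increment 1 --fixed-memory \
        --device 0 --label "icarl_cnn_cifar100_50steps_order${order}" \
        --data-path data
done
```

Each iteration starts a fresh Python process, so no state can leak from one class order into the next.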
Thanks for your feedback! I'll run them separately. If I find the reason, I will come back to let you know.
Hi, @arthurdouillard
I have another question about your iCaRL implementation. In the function build_examplars, it seems that you use all data to reconstruct the exemplar set. For continual learning, previous data are supposed to be unavailable, right? I am a little confused about this function.
I'm extracting all features because it's simpler that way, but I'm not actually doing any new herding on old data; the line
means that only new data will be sampled. However, old data is still reduced.
And note that we only use the selected features, not all features, to compute the exemplar mean:
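To illustrate the distinction, here is a minimal NumPy sketch of the iCaRL-style herding idea, not inclearn's actual code; the function name and arguments are illustrative. Herding is run only on the new classes' features; because it orders exemplars by importance, reducing an old class's memory just means truncating its previously selected list, and the class mean used at test time is computed from the selected exemplars only.

```python
import numpy as np

def icarl_herding(features, nb_exemplars):
    """Greedy herding selection in the spirit of iCaRL: repeatedly pick
    the sample whose addition keeps the running exemplar mean closest to
    the true class mean. `features` is an (n_samples, d) array."""
    class_mean = features.mean(axis=0)
    selected = []
    running_sum = np.zeros_like(class_mean)
    for k in range(min(nb_exemplars, len(features))):
        # Mean we would obtain by adding each candidate to the current set.
        candidate_means = (running_sum + features) / (k + 1)
        distances = np.linalg.norm(class_mean - candidate_means, axis=1)
        # Mask already-selected indices so each exemplar is picked once.
        distances[selected] = np.inf
        best = int(np.argmin(distances))
        selected.append(best)
        running_sum += features[best]
    # The list is ordered by importance: shrinking memory later just keeps
    # the first m indices, which is how old classes are "reduced" without
    # re-running herding on their (now unavailable) full data.
    return selected
```

The key point mirrored from the discussion above: only new-class features go through this selection, and the exemplar mean is taken over `features[selected]`, never over all features.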
Is that clearer?
I think I missed the if
condition. Now I see the difference.
Thanks for your kind explanation!
My pleasure :)