GMvandeVen / continual-learning

PyTorch implementation of various methods for continual learning (XdG, EWC, SI, LwF, FROMP, DGR, BI-R, ER, A-GEM, iCaRL, Generative Classifier) in three different scenarios.


Performance

Johswald opened this issue

hey again!

when I execute
./main.py --ewc --online --lambda=5000 --gamma=1 --scenario task

this should be close to 99% accuracy, no?

For EWC and SI I get much worse performance with the default values.
What am I doing wrong? Thank you!

It is indeed the case that with those hyperparameter values, the performance of Online EWC on the split MNIST task protocol is rather bad. This confused me for quite a while as well. It turns out that on the split MNIST protocol, the hyperparameter values recommended by the methods' developers (the defaults here) don't work very well for SI, and especially not for EWC. For EWC and Online EWC, lambda even needs to be set several orders of magnitude larger. See also Appendix D and the footnote on page 7 of our paper: https://arxiv.org/pdf/1904.07734.pdf.
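For intuition, here is a minimal PyTorch sketch of how lambda and gamma enter the (Online) EWC loss. This is not the code from this repo; the names ewc_penalty, theta_star, and new_fisher are hypothetical, and the Fisher values are stand-ins:

import torch
import torch.nn as nn

def ewc_penalty(model, fisher, theta_star):
    # Quadratic EWC penalty: sum_i F_i * (theta_i - theta_i*)^2
    return sum((fisher[n] * (p - theta_star[n]) ** 2).sum()
               for n, p in model.named_parameters())

model = nn.Linear(784, 10)  # toy stand-in for the split MNIST network
theta_star = {n: p.detach().clone() for n, p in model.named_parameters()}  # params after previous task
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}      # dummy Fisher estimate

ewc_lambda, gamma = 1e8, 0.8   # the kind of values selected in the calls below
task_loss = torch.tensor(0.)   # stand-in for the current-task loss
loss = task_loss + (ewc_lambda / 2) * ewc_penalty(model, fisher, theta_star)

# Online EWC keeps one running Fisher estimate; after each task it is decayed
# by gamma before the newly estimated Fisher is added:
new_fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}  # stand-in
fisher = {n: gamma * fisher[n] + new_fisher[n] for n in fisher}

Because the penalty is an unnormalized sum over all parameters and the estimated Fisher values can be very small, the useful scale of lambda is protocol-dependent, which is one way to see why values as large as 1e8 appear in the calls further down.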

Thanks for this prompt response. OK, I had thought the default values were set to the ones that produce your reported accuracies. Would it be possible to share the calls? It's a bit hard to read the best values off your hyperparameter-search plots. Thanks again for this repo - it's really helpful!

Ah yes, sorry. It would indeed have been good to report those hyperparameter values somewhere. Here are all the calls with the values we selected:

For split MNIST:

./main.py --scenario=task --xdg=0.95
./main.py --scenario=task --ewc --lambda=10000000
./main.py --scenario=task --ewc --online --lambda=100000000 --gamma=0.8
./main.py --scenario=task --si --c=50

./main.py --scenario=domain --ewc --lambda=1000000
./main.py --scenario=domain --ewc --online --lambda=100000000 --gamma=0.7
./main.py --scenario=domain --si --c=500

./main.py --scenario=class --ewc --lambda=100000000
./main.py --scenario=class --ewc --online --lambda=1000000000 --gamma=0.8
./main.py --scenario=class --si --c=0.5

For permuted MNIST:

./main.py --experiment=permMNIST --tasks=10 --scenario=task --xdg=0.55
./main.py --experiment=permMNIST --tasks=10 --scenario=task --ewc --lambda=500
./main.py --experiment=permMNIST --tasks=10 --scenario=task --ewc --online --lambda=500 --gamma=0.8
./main.py --experiment=permMNIST --tasks=10 --scenario=task --si --c=5

./main.py --experiment=permMNIST --tasks=10 --scenario=domain --ewc --lambda=500
./main.py --experiment=permMNIST --tasks=10 --scenario=domain --ewc --online --lambda=1000 --gamma=0.9
./main.py --experiment=permMNIST --tasks=10 --scenario=domain --si --c=5

./main.py --experiment=permMNIST --tasks=10 --scenario=class --ewc --lambda=1
./main.py --experiment=permMNIST --tasks=10 --scenario=class --ewc --online --lambda=5 --gamma=1
./main.py --experiment=permMNIST --tasks=10 --scenario=class --si --c=0.1

Thank you again!

Hello there @GMvandeVen. I am trying to run the EWC and SI experiments with your hyperparameters, but when I use the following commands, the average precisions are poor.

./main.py --scenario=class --ewc --lambda=100000000
./main.py --scenario=class --si --c=0.5
./main.py --experiment=permMNIST --tasks=10 --scenario=class --ewc --lambda=1
./main.py --experiment=permMNIST --tasks=10 --scenario=class --si --c=0.1

However, the commands for the task scenario work well.

./main.py --scenario=task --ewc --lambda=10000000
./main.py --scenario=task --si --c=50
./main.py --experiment=permMNIST --tasks=10 --scenario=task --ewc --lambda=500
./main.py --experiment=permMNIST --tasks=10 --scenario=task --si --c=5

Any suggestion?

Hi @YeeCY, thanks for your interest in my code. The observation you describe is correct: EWC and SI do not work well with class-incremental learning (--scenario=class), even with their best hyperparameters, while they do work reasonably well with task-incremental learning (--scenario=task). See for example this paper (https://arxiv.org/abs/1904.07734) for more details on the difference between these scenarios. Hope this helps!
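For readers unfamiliar with the distinction, here is a minimal sketch (with illustrative values, not this repo's actual evaluation code) of what differs at test time in split MNIST:

import torch

logits = torch.randn(1, 10)       # toy network output for split MNIST (10 classes)
classes_per_task, task_id = 2, 3  # suppose the test image comes from task 3

# Task-incremental: task identity is given at test time, so only that task's
# two output units compete -- effectively a binary decision.
active = torch.arange(classes_per_task * (task_id - 1), classes_per_task * task_id)
task_il_pred = active[logits[0, active].argmax()].item()

# Class-incremental: no task identity, so all 10 output units compete and the
# network must implicitly infer the task as well -- this harder setting is
# where regularization methods such as EWC and SI break down.
class_il_pred = logits.argmax(dim=1).item()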

OK, that's a good summary; I will try running with task-incremental learning. By the way, would you mind providing the best hyperparameters for other algorithms, like A-GEM?