Larger memory consumption and slower training speed
EBGU opened this issue
Hi!
First, I really appreciate your project, and it has been very helpful in my own work. However, I was a bit confused when I compared a vanilla resnet18 with e2resnet18: the e2resnet18 consumed about 4 times more GPU memory and took about 10 times the training time. I wonder whether I did something wrong, or whether this is by design. Thank you very much!
Best,
Harold
Hi @EBGU
Are you referring to this model? e2wrn.py
This is the case if fixparams=True.
Because equivariance induces stronger weight sharing, an equivariant model usually has fewer parameters than an equivalent conventional model.
For this reason, it is common to compare against a scaled-up version of the equivariant model which has the same number of parameters as the conventional model. This results in a wider model (wider roughly by a factor of sqrt(N), where N is the group size).
You can build a non-scaled-up model by setting fixparams=False.
This results in a model with more or less the same size as the conventional architecture.
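To make the scaling concrete, here is a minimal sketch of the idea (not e2cnn's actual API; the helper name and exact rounding are assumptions): a conventional conv layer has roughly C_in * C_out * k^2 parameters, and equivariance reduces the number of independent parameters by roughly the group size N, so matching the baseline's parameter count means widening each layer by about sqrt(N).

```python
import math

def scaled_width(width: int, group_size: int, fixparams: bool) -> int:
    # Hypothetical helper, not part of e2cnn: when fixparams is True,
    # widen the layer by sqrt(N) so that the equivariant model ends up
    # with roughly the same parameter count as the conventional baseline.
    if fixparams:
        width = int(round(width * math.sqrt(group_size)))
    return width

# Example: a C8-equivariant layer (N = 8) starting from baseline width 64.
wide = scaled_width(64, 8, fixparams=True)    # wider, parameter-matched model
same = scaled_width(64, 8, fixparams=False)   # same width as the baseline
print(wide, same)
```

The wider (fixparams=True) model therefore computes and stores larger feature maps, which is consistent with the higher memory use and slower training you observed.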
Note that in our paper we compare the conventional model with both equivariant models above (fixparams=False and fixparams=True). The README reports a summary of the results here.
I hope this answers your question
Best,
Gabriele Cesa