LayerNorm eps value
guglielmocamporese opened this issue · comments
Guglielmo Camporese commented
Hi!
Thanks for this little piece of juicy code!
Just out of curiosity: I noticed that your implementation uses nn.LayerNorm with the default denominator constant eps=1e-5, whereas other implementations (DINO [here] and the ViT in timm [here]) explicitly set this parameter to eps=1e-6.
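To make the question concrete, here is a minimal sketch in plain Python (not the repo's code, and not PyTorch itself) of the normalization step, (x - mean) / sqrt(var + eps), without the learnable affine part. The helper names `layer_norm` and `max_abs_diff` and the two sample inputs are made up for illustration. It only shows where the two eps values could diverge numerically; it says nothing about the effect on final model accuracy.

```python
import math

def layer_norm(x, eps):
    """Normalize a list of floats as (x - mean) / sqrt(var + eps), no affine."""
    mean = sum(x) / len(x)
    # Biased variance (divide by N), which is what layer normalization uses.
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def max_abs_diff(x):
    """Largest element-wise gap between the eps=1e-5 and eps=1e-6 outputs."""
    a = layer_norm(x, eps=1e-5)
    b = layer_norm(x, eps=1e-6)
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical inputs: same shape, very different scale.
wide = [-2.0, -1.0, 0.0, 1.0, 2.0]    # variance 2, far above either eps
narrow = [v * 1e-3 for v in wide]     # variance 2e-6, comparable to eps

print("wide input:  ", max_abs_diff(wide))    # tiny gap
print("narrow input:", max_abs_diff(narrow))  # large gap
```

So, in this toy setting, the choice of eps only matters when the variance of the activations is on the order of eps itself; for activations with larger variance the two settings are nearly indistinguishable.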
I know it is a small detail, but details are sometimes super important for getting better models.
Do you think the model is sensitive to this kind of parameter change? Have you ever tried/noticed it?
Thanks!