muellerzr / Practical-Deep-Learning-for-Coders-2.0

Notebooks for the "A walk with fastai2" Study Group and Lecture Series

[Question] kaiming initialization when pretrained=False

austinmw opened this issue

Hi, thanks for all these tutorials! In your notebook 05_EfficientNet, at the very bottom, I noticed it looks like when you set pretrained=False, you only initialize the head of the model? Am I interpreting correctly how this initialization is applied? And if that's the case, is that the correct way to do it, or should I change model[1] to model? Thanks!

Yes, but we are also using pretrained weights there, so it doesn't matter in the long run (notice we load the old weights in). We don't train with the uninitialized body; instead we use the body from our other model.

I'm confused about the very last code cell in the notebook, but maybe I'm just overtired:

body = create_timm_body('efficientnet_b3a', pretrained=False)
head = create_head(3072, dls.c)
model = nn.Sequential(body, head)
apply_init(model[1], nn.init.kaiming_normal_)
learn = Learner(dls, model, loss_func=LabelSmoothingCrossEntropy(), 
                splitter=default_split, metrics=accuracy)
learn.freeze()
learn.fit_one_cycle(5, 3e-3)

I would think here since the net is being loaded with pretrained=False, that you would use apply_init(model, nn.init.kaiming_normal_) and not freeze the network. I could be missing something though, just trying to check my understanding.

Aha! Totally my fault, my bad :) Yes, you are right. We should probably be initializing the whole thing there, not just the head (along with not freezing). I can try to get to it in the next few days, but a PR would be more than welcome 😊
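For anyone reading along, here is a minimal sketch of the difference between initializing only model[1] and initializing the whole model. It uses plain PyTorch with toy stand-ins for the body and head, not the notebook's create_timm_body / create_head, and kaiming_init is a hypothetical helper written for this illustration rather than fastai's apply_init. It skips BatchNorm layers because kaiming_normal_ needs weights with at least two dimensions.

```python
import torch
from torch import nn


def kaiming_init(module: nn.Module) -> None:
    """Hypothetical helper: Kaiming-normal init for conv/linear weights, zero biases."""
    for layer in module.modules():
        if isinstance(layer, (nn.Conv2d, nn.Linear)):
            nn.init.kaiming_normal_(layer.weight)
            if layer.bias is not None:
                nn.init.zeros_(layer.bias)


def build_model() -> nn.Sequential:
    # Toy stand-ins for the notebook's body and head (assumption, not the real EfficientNet).
    body = nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.BatchNorm2d(8),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
    )
    head = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
    return nn.Sequential(body, head)


torch.manual_seed(0)
model = build_model()
body_before = model[0][0].weight.clone()
head_before = model[1][0].weight.clone()

# Head only, as in the original cell: the body keeps whatever weights it was created with.
kaiming_init(model[1])
body_unchanged = torch.equal(model[0][0].weight, body_before)
head_changed = not torch.equal(model[1][0].weight, head_before)

# Whole model, as suggested for pretrained=False: body and head are both re-initialized.
kaiming_init(model)
body_changed = not torch.equal(model[0][0].weight, body_before)

# No freezing: every parameter stays trainable.
all_trainable = all(p.requires_grad for p in model.parameters())
print(body_unchanged, head_changed, body_changed, all_trainable)
```

In the notebook cell, the equivalent change would presumably be to pass model instead of model[1] to apply_init and to drop the learn.freeze() call, but I haven't run that against the notebook.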

No problem, it took me a while to realize while playing with a very non-imagenet-like dataset :) Just glad I was understanding correctly! Will try to make a PR tomorrow.