yahoo / open_nsfw

Not Suitable for Work (NSFW) classification using deep neural network Caffe models.

Msra vs Xavier

ryanjay0 opened this issue


I've noticed that the only difference between the default resnet50_1by2 and your implementation (besides the number of classes) is the change of weight_filler from msra to xavier, and of bias_filler from constant to xavier, in the InnerProduct layer.

Was there a reason for that change? Maybe the small number of classes? Did it make a big difference?

I am assuming the default resnet50_1by2 is the one mentioned here. Initialization does not make much difference when finetuning, since only the params of the last layer (FC_nsfw) are initialized and the rest are loaded from the pretrained model. The effect of initialization is more significant when training on ImageNet, and you can refer to the corresponding papers for more details.
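
For anyone comparing the two fillers, here is a rough sketch in plain Python of what each one draws for a freshly initialized InnerProduct layer. It assumes Caffe's default fan-in normalization (xavier: uniform in [-sqrt(3/fan_in), sqrt(3/fan_in)]; msra: zero-mean Gaussian with std sqrt(2/fan_in)). The layer shape (1024 inputs, 2 outputs) and the helper names are hypothetical and chosen only for illustration; this is not code from the repo.

```python
import math
import random


def xavier_fill(fan_in, fan_out, rng):
    """Uniform in [-s, s] with s = sqrt(3 / fan_in), i.e. variance 1 / fan_in.

    Assumes fan-in normalization (believed to be Caffe's default).
    """
    scale = math.sqrt(3.0 / fan_in)
    return [[rng.uniform(-scale, scale) for _ in range(fan_in)]
            for _ in range(fan_out)]


def msra_fill(fan_in, fan_out, rng):
    """Zero-mean Gaussian with std = sqrt(2 / fan_in), i.e. variance 2 / fan_in.

    Assumes fan-in normalization (believed to be Caffe's default).
    """
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)]
            for _ in range(fan_out)]


def variance(weights):
    """Population variance over all entries of a 2-D weight list."""
    flat = [w for row in weights for w in row]
    mean = sum(flat) / len(flat)
    return sum((w - mean) ** 2 for w in flat) / len(flat)


# Hypothetical layer shape: 1024 input features, 2 output classes.
FAN_IN, FAN_OUT = 1024, 2
rng = random.Random(0)  # fixed seed so the comparison is repeatable

xavier_weights = xavier_fill(FAN_IN, FAN_OUT, rng)
msra_weights = msra_fill(FAN_IN, FAN_OUT, rng)

xavier_var = variance(xavier_weights)
msra_var = variance(msra_weights)

print("xavier variance:", xavier_var)
print("msra variance:  ", msra_var)
print("msra / xavier:  ", msra_var / xavier_var)
```

Under these assumptions the msra weights have about twice the variance of the xavier ones. For a single small layer on top of pretrained features that is a modest difference in starting scale, which fits the point above that the choice matters little when finetuning.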