Model Initialization
authman opened this issue
If possible, could you please share the exact initialization parameters passed into ReXNetV1 in order to create ReXNetV1_2.0? The options (and defaults) are:
- input_ch=16
- final_ch=180
- width_mult=1.0
- depth_mult=1.0
- classes=1000
- use_se=True
- se_ratio=12
- dropout_ratio=0.2
- bn_momentum=0.9
My understanding is that width_mult should be set to 2. However, doing so and then attempting to load the provided weights for the 2.0 model results in many saved weights that do not line up with the declared model weights. The paper doesn't give straightforward guidance on this either, but I think it could be resolved easily the way ResNet and EfficientNet do it, with convenience methods that build each version of the network, e.g.: https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py#L232
Actually, this is my mistake. I just realized my input_ch was altered. For now, I'll leave the issue open, though, in case it's decided to create the convenience methods. I think they can still serve a purpose :-).
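For anyone hitting a similar mismatch between saved and declared weights, a small shape comparison can narrow down which layers disagree. The following is only a minimal sketch: the helper name diff_shapes and the layer names in the example are hypothetical, and it works on plain name-to-shape mappings rather than on a real checkpoint.

```python
from typing import Dict, List, Mapping, Tuple

Shape = Tuple[int, ...]


def diff_shapes(saved: Mapping[str, Shape],
                declared: Mapping[str, Shape]) -> Dict[str, List]:
    """Compare two name -> shape mappings and report where they disagree."""
    common = set(saved) & set(declared)
    return {
        # Names found in the checkpoint but not declared by the model.
        "unexpected": sorted(set(saved) - set(declared)),
        # Names the model declares but the checkpoint does not contain.
        "missing": sorted(set(declared) - set(saved)),
        # Names present on both sides whose shapes differ.
        "mismatched": sorted(
            (name, tuple(saved[name]), tuple(declared[name]))
            for name in common
            if tuple(saved[name]) != tuple(declared[name])
        ),
    }


# Hypothetical layer names and shapes, for illustration only.
saved = {"stem.weight": (64, 3, 3, 3), "stem.bias": (64,)}
declared = {"stem.weight": (32, 3, 3, 3), "stem.bias": (64,)}
for name, got, want in diff_shapes(saved, declared)["mismatched"]:
    print(f"{name}: checkpoint {got} vs model {want}")
```

With PyTorch, the two mappings could be built as {k: tuple(v.shape) for k, v in state_dict.items()}, once from the loaded checkpoint and once from model.state_dict(). If the very first layers already disagree, an altered input_ch is a likely suspect.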
@authman Sorry for the inconvenience. width_mult and depth_mult are the only tunable hyper-parameters, and in this version we provide pretrained models that vary in width_mult.
Providing convenience methods like those in ResNet would make our model easier to use. Thanks for the suggestion.
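As a rough illustration of what such convenience methods could look like, here is a sketch in the style of the torchvision ResNet builders. It is not the repository's actual API: the builder names rexnetv1_1_0 and rexnetv1_2_0 are hypothetical, and the ReXNetV1 class below is a stand-in that only records its arguments, with field names and defaults copied from the option list in this issue. Pinning input_ch and final_ch is an assumption made here to guard against the accidental change described above.

```python
from dataclasses import dataclass


@dataclass
class ReXNetV1:
    """Stand-in that only records constructor arguments (NOT the real model)."""
    input_ch: int = 16
    final_ch: int = 180
    width_mult: float = 1.0
    depth_mult: float = 1.0
    classes: int = 1000
    use_se: bool = True
    se_ratio: int = 12
    dropout_ratio: float = 0.2
    bn_momentum: float = 0.9


def _rexnetv1(width_mult: float, **kwargs) -> ReXNetV1:
    # Assumption: the channel settings must stay at their defaults for the
    # pretrained weights to load, so overriding them is rejected early.
    for pinned in ("input_ch", "final_ch"):
        if pinned in kwargs:
            raise TypeError(f"{pinned} is fixed for the pretrained variants")
    return ReXNetV1(width_mult=width_mult, **kwargs)


def rexnetv1_1_0(**kwargs) -> ReXNetV1:
    """Hypothetical builder for the width_mult=1.0 variant."""
    return _rexnetv1(1.0, **kwargs)


def rexnetv1_2_0(**kwargs) -> ReXNetV1:
    """Hypothetical builder for the width_mult=2.0 variant (ReXNetV1_2.0)."""
    return _rexnetv1(2.0, **kwargs)


model = rexnetv1_2_0()
print(model.width_mult, model.input_ch)
```

In a real implementation the stand-in would be replaced by the repository's ReXNetV1 class, and each builder could optionally load the matching pretrained weights.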