Model Initialization

Question

authman opened this issue 4 years ago

If possible, could you please share the exact initialization parameters passed into ReXNetV1 in order to create ReXNetV1_2.0? The options (and defaults) are:

input_ch=16
final_ch=180
width_mult=1.0
depth_mult=1.0
classes=1000
use_se=True
se_ratio=12
dropout_ratio=0.2
bn_momentum=0.9

My understanding is that width_mult should be set to 2.0. However, doing so and then attempting to load the provided weights for the 2.0 model results in many mismatches between the saved weights and the declared model weights. The paper doesn't give clear guidance on this either. I think this could be resolved easily with convenience methods like the ones ResNet and EfficientNet provide to build each version of their network, e.g.: https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py#L232
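For anyone hitting a similar mismatch, a minimal sketch of how to see which parameters disagree is shown here. It only compares two name-to-shape mappings, so it makes no assumptions about ReXNetV1 itself; the helper name `find_shape_mismatches` and the parameter names in the example are hypothetical, not taken from the repository.

```python
def find_shape_mismatches(saved_shapes, model_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns (mismatched, missing, unexpected):
      mismatched: names present in both mappings but with different shapes
      missing:    names the model declares but the checkpoint lacks
      unexpected: names the checkpoint has but the model does not declare
    """
    shared = saved_shapes.keys() & model_shapes.keys()
    mismatched = {
        name: (tuple(saved_shapes[name]), tuple(model_shapes[name]))
        for name in shared
        if tuple(saved_shapes[name]) != tuple(model_shapes[name])
    }
    missing = sorted(model_shapes.keys() - saved_shapes.keys())
    unexpected = sorted(saved_shapes.keys() - model_shapes.keys())
    return mismatched, missing, unexpected


# With PyTorch, the mappings could be built from state dicts, for example:
#   {k: tuple(v.shape) for k, v in model.state_dict().items()}
# The names and shapes below are made up purely for illustration.
saved = {"stem.weight": (32, 3, 3, 3), "fc.bias": (1000,)}
declared = {"stem.weight": (16, 3, 3, 3), "fc.bias": (1000,)}

mismatched, missing, unexpected = find_shape_mismatches(saved, declared)
for name, (saved_shape, model_shape) in sorted(mismatched.items()):
    print(f"{name}: checkpoint {saved_shape} vs model {model_shape}")
```

If the first mismatch shows up in the earliest layers, a changed constructor argument such as input_ch is a likely suspect.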

authman · Answer 1 · Tue Jul 14 2020 14:59:47 GMT+0800 (China Standard Time)

Actually, this is my mistake. I just realized my input_ch had been altered. For now I'll leave the issue open, though, in case it's decided to create the convenience methods. I think they can still serve a purpose :-).

Dongyoon Han · Answer 2 · Wed Jul 22 2020 17:49:46 GMT+0800 (China Standard Time)

@authman Sorry for the inconvenience. width_mult and depth_mult are the only tunable hyper-parameters, and in this version we provide pretrained models for different values of width_mult.

Providing convenience methods like those in ResNet would make our model easier to use. Thanks for the suggestion.
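As a rough illustration of what such convenience methods might look like (this is not code from the repository), the sketch below fixes width_mult per variant and fills in the defaults listed in the question. The names `REXNET_DEFAULTS` and `rexnet_builder` are hypothetical, and the model class is passed in as an argument so the sketch does not assume anything about how ReXNetV1 is imported.

```python
# Defaults copied from the option list in the question; width_mult is
# deliberately left out because each builder fixes it.
REXNET_DEFAULTS = dict(
    input_ch=16,
    final_ch=180,
    depth_mult=1.0,
    classes=1000,
    use_se=True,
    se_ratio=12,
    dropout_ratio=0.2,
    bn_momentum=0.9,
)


def rexnet_builder(model_cls, width_mult):
    """Return a function that builds model_cls with a fixed width_mult.

    model_cls is expected to accept the keyword arguments listed in
    REXNET_DEFAULTS plus width_mult (as ReXNetV1 does per the question).
    """
    def build(**overrides):
        # Refuse to change the one setting that defines the variant.
        if "width_mult" in overrides:
            raise ValueError("width_mult is fixed for this variant")
        kwargs = {**REXNET_DEFAULTS, **overrides, "width_mult": width_mult}
        return model_cls(**kwargs)
    return build
```

Assuming ReXNetV1 is importable from the repository, a 2.0 variant could then be declared once with `rexnetv1_2_0 = rexnet_builder(ReXNetV1, 2.0)` and built with `rexnetv1_2_0()`, which avoids accidentally changing input_ch before loading the pretrained weights.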