aywi / mprotonet

MProtoNet: A Case-Based Interpretable Model for Brain Tumor Classification with 3D Multi-parametric Magnetic Resonance Imaging

Home Page: https://openreview.net/forum?id=6Wbj3QCo4U4

About the differences on the training strategies when "_pt" in model name or not.

xiaovhua opened this issue · comments

Thank you so much for your excellent work on MProtoNet, and sincere thanks for making the code available on GitHub as well.

Recently, we have had some questions about the use of (a) "_pt" in the model name and (b) best_grid["fixed"] in tumor_cls.py.

In the commands you supply in the repository, the model names all end with "_pmX". Therefore, according to the code, MProtoNet alternates between the "joint" and "last_layer" (every 10 epochs) update modes during training. However, we notice that when "_pt" is in the model name and best_grid["fixed"]=False, there is another branch in which net.features is not updated at the beginning, but only starts to be optimized once the current epoch >= wu_e. We also notice that the vanilla ProtoPNet trains in the latter way (net.features is fixed at the beginning).

Will there be a significant difference between the two training strategies? We look forward to your reply!
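For clarity, this is how we currently understand the two schedules, as a minimal sketch (the helper name stage_for_epoch is ours, and the wu_e warm-up and the every-10-epochs cadence are our reading of tumor_cls.py, not the exact code):

```python
def stage_for_epoch(epoch, model_name, wu_e=10):
    """Return which parameter groups train at a given epoch (hypothetical sketch)."""
    if '_pt' in model_name:
        # ProtoPNet-style warm-up: net.features stays frozen until epoch >= wu_e.
        return 'warm_up' if epoch < wu_e else 'joint'
    # The "_pmX" schedule as we read it: joint training, with a
    # last-layer-only stage every 10 epochs.
    return 'last_layer' if epoch % 10 == 9 else 'joint'
```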

It is partly our fault for not cleaning up this code, since these options were defined for models that were removed from the final paper.

  1. "_pt" means "pre-trained" models that we tested earlier: mixed 2D & 3D models whose 2D layers are pre-trained on ImageNet-1k. Since we test only pure 3D models in the final paper, these models were removed for fair comparison. You can also see that in the final experiments we choose feature layers all ending with "_ri", which means "randomly initialized" (the pre-trained versions are ignored, since we only use these 2D backbones from torchvision as templates to build the 3D feature layers):

    mprotonet/src/models.py

    Lines 27 to 47 in 4565e22

    def features_imagenet1k(features):
        if features == 'resnet18':
            return build_resnet_features(vision_models.resnet18(weights='IMAGENET1K_V1'))
        elif features == 'resnet18_ri':
            return build_resnet_features(vision_models.resnet18())
        elif features == 'resnet34':
            return build_resnet_features(vision_models.resnet34(weights='IMAGENET1K_V1'))
        elif features == 'resnet34_ri':
            return build_resnet_features(vision_models.resnet34())
        elif features == 'resnet50':
            return build_resnet_features(vision_models.resnet50(weights='IMAGENET1K_V2'))
        elif features == 'resnet50_ri':
            return build_resnet_features(vision_models.resnet50())
        elif features == 'resnet101':
            return build_resnet_features(vision_models.resnet101(weights='IMAGENET1K_V2'))
        elif features == 'resnet101_ri':
            return build_resnet_features(vision_models.resnet101())
        elif features == 'resnet152':
            return build_resnet_features(vision_models.resnet152(weights='IMAGENET1K_V2'))
        elif features == 'resnet152_ri':
            return build_resnet_features(vision_models.resnet152())
  2. "fixed" is a much older option from when I tested the pre-trained models without a fixed training period at the beginning (yes, the correct name should be "not fixed").
  3. Vanilla ProtoPNet has a fixed training period because it is a 2D model pre-trained on ImageNet-1k. Since we test pure 3D models that are randomly initialized in the final paper, this period becomes useless and only wastes training time.

So, in a word, just ignore them. I will add a commit later to remove this code.
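For anyone curious, the warm-up behavior that the "_pt" branch implemented can be sketched in a few lines of PyTorch. This is a minimal illustration only: TinyNet and set_features_trainable are hypothetical names for this sketch, not code from the repository.

```python
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for a model with a `features` backbone."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(1, 4, 3), nn.ReLU())
        self.classifier = nn.Linear(4, 2)

def set_features_trainable(net, trainable):
    # Freeze or unfreeze only the backbone; the other layers keep training.
    for p in net.features.parameters():
        p.requires_grad = trainable

net = TinyNet()
wu_e = 5  # warm-up epochs during which net.features stays fixed
for epoch in range(10):
    set_features_trainable(net, epoch >= wu_e)
    # ... forward pass, loss, and optimizer.step() would go here ...
```

With a randomly initialized 3D backbone there are no pre-trained weights to protect during warm-up, which is why this period only wastes training time, as noted in point 3.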


I get it! Thank you so much!