aosokin / os2d

OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features

TransformNet checkpoint load

JHevia23 opened this issue · comments

Hi! First of all, I really appreciate your work and the high quality of your repository. I've been taking a look at it for some days now and I found it very well organized.

I saw that you made available the checkpoints for Rocco's TransformNet, however, I couldn't find the place where they're loaded when building the OS2D model. Could you point me at a specific folder to look for that checkpoint load?

By the way, I'm currently trying out other backbone architectures for feature extraction, have you experimented with other models besides ResNet?

Thanks!

Hi, thank you for your kind words!

I saw that you made available the checkpoints for Rocco's TransformNet, however, I couldn't find the place where they're loaded when building the OS2D model. Could you point me at a specific folder to look for that checkpoint load?

When initializing the V2 model we loaded the original weights of Rocco et al. with this function:

os2d/os2d/modeling/model.py

Lines 389 to 426 in 96c488b

def init_from_weakalign_model(src_state_dict, feature_extractor=None, affine_regressor=None, tps_regressor=None):
    # init feature extractor - has three blocks of ResNet101
    layer_prefix_map = {}  # layer_prefix_map[target prefix] = source prefix
    layer_prefix_map["conv1."] = "FeatureExtraction.model.0."
    layer_prefix_map["bn1."] = "FeatureExtraction.model.1."
    for idx in range(3):
        layer_prefix_map["layer1." + str(idx)] = "FeatureExtraction.model.4." + str(idx)
    for idx in range(4):
        layer_prefix_map["layer2." + str(idx)] = "FeatureExtraction.model.5." + str(idx)
    for idx in range(23):
        layer_prefix_map["layer3." + str(idx)] = "FeatureExtraction.model.6." + str(idx)
    if feature_extractor is not None:
        for k, v in feature_extractor.state_dict().items():
            found_init = False
            for k_map in layer_prefix_map:
                if k.startswith(k_map):
                    found_init = True
                    break
            if found_init:
                k_target = k.replace(k_map, layer_prefix_map[k_map])
                if k.endswith("num_batches_tracked"):
                    continue
                # print("Copying from {0} to {1}, size {2}".format(k_target, k, v.size()))
                v.copy_(src_state_dict[k_target])
    for regressor, prefix in zip([affine_regressor, tps_regressor], ["FeatureRegression.", "FeatureRegression2."]):
        if regressor is not None:
            for k, v in regressor.state_dict().items():
                k_target = prefix + k
                if k.endswith("num_batches_tracked"):
                    continue
                # print("Copying from {0} to {1}, size {2}".format(k_target, k, v.size()))
                if k != "linear.weight":
                    v.copy_(src_state_dict[k_target])
                else:
                    # HACK to substitute linear layer with convolution
                    v.copy_(src_state_dict[k_target].view(-1, 64, 5, 5))
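A note on the "HACK" at the end of the snippet: a fully connected layer applied to a flattened 64x5x5 feature patch computes the same thing as a 5x5 convolution with 64 input channels, so the pretrained linear weight only needs a reshape, not retraining. A minimal sketch of the shape bookkeeping (the helper function and the example output size are illustrative, not from the repository):

```python
def linear_to_conv_shape(linear_weight_shape, in_channels=64, kernel=5):
    """Map a (out_features, in_features) linear weight shape to a conv weight shape.

    Assumes in_features == in_channels * kernel * kernel, as in the
    regressor heads handled above.
    """
    out_features, in_features = linear_weight_shape
    assert in_features == in_channels * kernel * kernel
    return (out_features, in_channels, kernel, kernel)

# e.g. an affine regressor producing 6 transformation parameters
print(linear_to_conv_shape((6, 64 * 5 * 5)))  # (6, 64, 5, 5)
```

This is exactly what `view(-1, 64, 5, 5)` does in the snippet: the `-1` lets PyTorch infer the output-feature dimension.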

When loading our checkpoints, the transformer parameters are just regular parameters of a PyTorch model. By the way, our V2-init checkpoint contains the weights of Rocco et al. converted to our format; it might be worth comparing the two if you are interested in the details.
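If you do compare the checkpoints, a simple key-by-key diff of the two state dicts is usually enough to spot conversion issues. A minimal sketch (the helper name is mine, and the dicts below are hypothetical stand-ins for state dicts you would get from torch.load):

```python
def diff_state_dicts(a, b):
    """Report keys unique to each dict and shared keys whose values differ."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    # For real checkpoints, compare tensors with torch.equal(a[k], b[k]) instead.
    mismatched = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, mismatched

# Hypothetical stand-ins for two converted checkpoints
ckpt_v2_init = {"conv1.weight": [1, 2], "bn1.weight": [3]}
ckpt_converted = {"conv1.weight": [1, 2], "bn1.bias": [4]}

print(diff_state_dicts(ckpt_v2_init, ckpt_converted))
# (['bn1.weight'], ['bn1.bias'], [])
```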

By the way, I'm currently trying out other backbone architectures for feature extraction, have you experimented with other models besides ResNet?

We have not really tried that, partly because one might need to rerun the code of Rocco et al. (their pretraining on synthetic data) to retrain TransformNet to be compatible with another backbone. But I see no real reason why that shouldn't work.

Best,
Anton