aosokin / os2d

OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features

TransformNet checkpoint load

JHevia23 opened this issue · comments

Hi! First of all, I really appreciate your work and the high quality of your repository. I've been taking a look at it for some days now and I found it very well organized.

I saw that you made available the checkpoints for Rocco's TransformNet, however, I couldn't find the place where they're loaded when building the OS2D model. Could you point me at a specific folder to look for that checkpoint load?

By the way, I'm currently trying out other backbone architectures for feature extraction, have you experimented with other models besides ResNet?

Thanks!

Hi, thank you for your kind words!

I saw that you made available the checkpoints for Rocco's TransformNet, however, I couldn't find the place where they're loaded when building the OS2D model. Could you point me at a specific folder to look for that checkpoint load?

When initializing the V2 model we loaded the original weights of Rocco et al. with this function:

os2d/os2d/modeling/model.py

Lines 389 to 426 in 96c488b

def init_from_weakalign_model(src_state_dict, feature_extractor=None, affine_regressor=None, tps_regressor=None):
    # init feature extractor - has three blocks of ResNet101
    layer_prefix_map = {}  # layer_prefix_map[target prefix] = source prefix
    layer_prefix_map["conv1."] = "FeatureExtraction.model.0."
    layer_prefix_map["bn1."] = "FeatureExtraction.model.1."
    for idx in range(3):
        layer_prefix_map["layer1." + str(idx)] = "FeatureExtraction.model.4." + str(idx)
    for idx in range(4):
        layer_prefix_map["layer2." + str(idx)] = "FeatureExtraction.model.5." + str(idx)
    for idx in range(23):
        layer_prefix_map["layer3." + str(idx)] = "FeatureExtraction.model.6." + str(idx)
    if feature_extractor is not None:
        for k, v in feature_extractor.state_dict().items():
            found_init = False
            for k_map in layer_prefix_map:
                if k.startswith(k_map):
                    found_init = True
                    break
            if found_init:
                k_target = k.replace(k_map, layer_prefix_map[k_map])
                if k.endswith("num_batches_tracked"):
                    continue
                # print("Copying from {0} to {1}, size {2}".format(k_target, k, v.size()))
                v.copy_(src_state_dict[k_target])
    for regressor, prefix in zip([affine_regressor, tps_regressor], ["FeatureRegression.", "FeatureRegression2."]):
        if regressor is not None:
            for k, v in regressor.state_dict().items():
                k_target = prefix + k
                if k.endswith("num_batches_tracked"):
                    continue
                # print("Copying from {0} to {1}, size {2}".format(k_target, k, v.size()))
                if k != "linear.weight":
                    v.copy_(src_state_dict[k_target])
                else:
                    # HACK to substitute linear layer with convolution
                    v.copy_(src_state_dict[k_target].view(-1, 64, 5, 5))
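A note on the "HACK" at the end of the snippet: a fully connected layer applied to a flattened 64x5x5 feature patch computes the same thing as a 5x5 convolution with 64 input channels, so the pretrained linear weight only needs a reshape, not retraining. A minimal sketch of the shape bookkeeping (the helper function and the example output size are illustrative, not from the repository):

```python
def linear_to_conv_shape(linear_weight_shape, in_channels=64, kernel=5):
    """Map a (out_features, in_features) linear weight shape to a conv weight shape.

    Assumes in_features == in_channels * kernel * kernel, as in the
    regressor heads handled above.
    """
    out_features, in_features = linear_weight_shape
    assert in_features == in_channels * kernel * kernel
    return (out_features, in_channels, kernel, kernel)

# e.g. an affine regressor producing 6 transformation parameters
print(linear_to_conv_shape((6, 64 * 5 * 5)))  # (6, 64, 5, 5)
```

This is exactly what `view(-1, 64, 5, 5)` does in the snippet: the `-1` lets PyTorch infer the output-feature dimension.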

When loading our checkpoints, the transformer parameters are just regular parameters of a PyTorch model. By the way, our V2-init checkpoint contains the weights of Rocco et al. converted to our format; it might be worth comparing the two if you are interested in the details.
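If you do compare the checkpoints, a simple key-by-key diff of the two state dicts is usually enough to spot conversion issues. A minimal sketch (the helper name is mine, and the dicts below are hypothetical stand-ins for state dicts you would get from torch.load):

```python
def diff_state_dicts(a, b):
    """Report keys unique to each dict and shared keys whose values differ."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    # For real checkpoints, compare tensors with torch.equal(a[k], b[k]) instead.
    mismatched = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, mismatched

# Hypothetical stand-ins for two converted checkpoints
ckpt_v2_init = {"conv1.weight": [1, 2], "bn1.weight": [3]}
ckpt_converted = {"conv1.weight": [1, 2], "bn1.bias": [4]}

print(diff_state_dicts(ckpt_v2_init, ckpt_converted))
# (['bn1.weight'], ['bn1.bias'], [])
```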

By the way, I'm currently trying out other backbone architectures for feature extraction, have you experimented with other models besides ResNet?

We have not really tried that, partly because one might need to rerun the code of Rocco et al. (their pretraining on synthetic data) to retrain TransformNet to be compatible with another backbone. But I see no real reason why that shouldn't work.

Best,
Anton