fudan-zvg / SETR

[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers


Question about optimizer config.

EricKani opened this issue · comments

"paramwise_cfg=dict(custom_keys={'head': dict(lr_mult=10.)}"

Hi, thank you for open-sourcing your code. I have a question about the optimizer configuration.
I found that the model defines "decode_head", not the "head" used in 'custom_keys'. Will 'lr_mult=10' take effect while training the model?
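For context, the surrounding optimizer config presumably looks something like this mmseg-style block (the SGD hyperparameters here are illustrative assumptions; only the paramwise_cfg part is quoted from the repo):

optimizer = dict(
    type='SGD',
    lr=0.01,              # illustrative value, not necessarily the repo's
    momentum=0.9,         # illustrative value
    weight_decay=0.0005,  # illustrative value
    paramwise_cfg=dict(custom_keys={'head': dict(lr_mult=10.)}))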

Thanks~

Got it. I found the corresponding note in the DefaultOptimizerConstructor class. Because 'head' is a substring of both 'decode_head' and 'auxiliary_head', the parameters under those modules will be set according to 'lr_mult=10' (see the sketch after the quoted docstring):

  • custom_keys (dict): Specified parameters-wise settings by keys. If
    one of the keys in custom_keys is a substring of the name of one
    parameter, then the setting of the parameter will be specified by
    custom_keys[key] and other setting like bias_lr_mult etc. will
    be ignored. It should be noted that the aforementioned key is the
    longest key that is a substring of the name of the parameter. If there
    are multiple matched keys with the same length, then the key with lower
    alphabet order will be chosen.
    custom_keys[key] should be a dict and may contain fields lr_mult
    and decay_mult. See Example 2 below.
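
To make that rule concrete, here is a minimal, self-contained sketch of the documented matching logic (my own simplified re-implementation for illustration, not mmcv's actual code):

def match_custom_key(param_name, custom_keys):
    """Return the custom_keys entry governing a parameter: longest
    matching substring wins, ties broken by alphabetical order."""
    matched = [k for k in custom_keys if k in param_name]
    if not matched:
        return None
    # Longest key first; among equal lengths, lower alphabetical order wins.
    best = sorted(matched, key=lambda k: (-len(k), k))[0]
    return custom_keys[best]

custom_keys = {'head': dict(lr_mult=10.)}
# 'head' is a substring of 'decode_head...', so lr_mult=10 applies:
print(match_custom_key('decode_head.conv_seg.weight', custom_keys))
# -> {'lr_mult': 10.0}
print(match_custom_key('backbone.layers.0.attn.qkv.weight', custom_keys))
# -> None (falls back to the base lr)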

Thanks~

But I found that this prints nothing when 'recurse=False' (code in DefaultOptimizerConstructor.add_params):

for name, param in module.named_parameters(recurse=False):
    print(name)

I also printed the contents of the built optimizer. There are 363 param_groups in it, and the lr is indeed modified by 'lr_mult=10'. But I don't understand why 'lr_mult=10' takes effect even though nothing is printed by "for name, param in module.named_parameters(recurse=False):".
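
The resolution: add_params calls itself recursively on every child module, so named_parameters(recurse=False) returning nothing at the top level only means the root module owns no parameters directly; the leaf modules do, and each leaf parameter ends up in its own param_group (hence the 363 groups). A simplified sketch of that recursion, paraphrased from memory of the mmcv source rather than copied verbatim (tie-breaking, bias_lr_mult handling, etc. are omitted):

def add_params(self, params, module, prefix=''):
    # Only parameters defined directly on this module; this is empty
    # for container modules like the top-level segmentor.
    for name, param in module.named_parameters(recurse=False):
        param_group = {'params': [param]}
        full_name = f'{prefix}.{name}' if prefix else name
        for key, cfg in self.paramwise_cfg.get('custom_keys', {}).items():
            if key in full_name:
                param_group['lr'] = self.base_lr * cfg.get('lr_mult', 1.)
        params.append(param_group)
    # Recurse into children, so every leaf parameter is visited exactly once.
    for child_name, child_mod in module.named_children():
        child_prefix = f'{prefix}.{child_name}' if prefix else child_name
        self.add_params(params, child_mod, prefix=child_prefix)

This is why a single call to named_parameters(recurse=False) at the top level prints nothing, yet the final optimizer still contains per-parameter groups with the scaled lr.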