exiawsh / StreamPETR

[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

[EVA02] LearningRateDecayOptimizerConstructor 'decay_type': 'vit_wise'

billbliss3 opened this issue · comments

There seems to be no 'vit_wise' decay type in the LearningRateDecayOptimizerConstructor implementations.

@billbliss3 We didn't release the code for LearningRateDecayOptimizerConstructor. You will need to implement it yourself.
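
For anyone looking for a starting point, below is a minimal, untested sketch of such a constructor built on mmcv's DefaultOptimizerConstructor (mmcv 1.x API). The layer-id parsing, the get_layer_id_for_vit helper, and the treatment of head vs. backbone parameters are assumptions about what 'vit_wise' might mean, not the authors' released code.

    from mmcv.runner import OPTIMIZER_BUILDERS, DefaultOptimizerConstructor


    def get_layer_id_for_vit(name, num_layers):
        """Map a parameter name to a layer id (hypothetical naming scheme)."""
        if any(k in name for k in
               ('img_backbone.patch_embed', 'img_backbone.pos_embed',
                'img_backbone.cls_token')):
            return 0
        if 'img_backbone.blocks.' in name:
            block_id = int(name.split('img_backbone.blocks.')[1].split('.')[0])
            return block_id + 1
        # everything else is treated as the last "layer"
        return num_layers + 1


    @OPTIMIZER_BUILDERS.register_module(force=True)
    class LearningRateDecayOptimizerConstructor(DefaultOptimizerConstructor):
        """Layer-wise lr decay for the ViT backbone plus an lr boost for the head."""

        def add_params(self, params, module, **kwargs):
            decay_rate = self.paramwise_cfg.get('decay_rate', 0.9)
            head_decay_rate = self.paramwise_cfg.get('head_decay_rate', 1.0)
            num_layers = self.paramwise_cfg.get('num_layers', 24)

            for name, param in module.named_parameters():
                if not param.requires_grad:
                    continue
                # no weight decay for biases and 1-d (norm/embedding) parameters
                if param.ndim == 1 or name.endswith('.bias'):
                    this_wd = 0.
                else:
                    this_wd = self.base_wd if self.base_wd is not None else 0.

                layer_id = get_layer_id_for_vit(name, num_layers)
                if 'img_backbone' in name:
                    # backbone: decay the lr towards the early layers
                    scale = decay_rate ** (num_layers + 1 - layer_id)
                else:
                    # neck / detection head (assumed): boost the base lr
                    scale = head_decay_rate

                params.append({
                    'params': [param],
                    'lr': self.base_lr * scale,
                    'weight_decay': this_wd,
                })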

@exiawsh
Hi, I am confused about the parameter weight_decay=1e-7 in the vit_wise mode.

Why do the two versions of the config differ so much in weight_decay (1e-2 vs. 1e-7)?

optimizer = dict(
    type='AdamW', 
    lr=4e-4, # bs 8: 2e-4 || bs 16: 4e-4
    paramwise_cfg=dict(
        custom_keys={
            'img_backbone': dict(lr_mult=0.1), 
        }),
    weight_decay=0.01)



optimizer = dict(constructor='LearningRateDecayOptimizerConstructor',     
    type='AdamW', 
    lr=1e-4, betas=(0.9, 0.999), weight_decay=1e-7,
    paramwise_cfg={'decay_rate': 0.9,
                'head_decay_rate': 4.0,
                'decay_type': 'vit_wise',
                'num_layers': 24,
                })
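
For context, a hedged usage sketch: the first config lets mmcv's DefaultOptimizerConstructor simply multiply the backbone lr by lr_mult, while the second delegates per-parameter scaling to the custom constructor named by the constructor key. Either dict can be passed to mmcv's build_optimizer; the 'model' and 'optimizer' names below are placeholders.

    from mmcv.runner import build_optimizer

    # 'model' is the built detector; 'optimizer' is either config dict above.
    # First config: the backbone lr becomes 4e-4 * 0.1 = 4e-5 via lr_mult.
    # Second config: per-group lrs come from the custom constructor's decay schedule.
    optim = build_optimizer(model, optimizer)
    for group in optim.param_groups[:5]:
        print(group['lr'], group['weight_decay'])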

@billbliss3 They give similar results; the weight decay value is not very important here.

Hi, could you explain the meaning of the head_decay_rate parameter? I am a little confused.
@exiawsh

'head_decay_rate': 4.0

@billbliss3 The learning rate of the detection head (4e-4) is 4x the base learning rate (1e-4).
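
To make that concrete, here is the implied per-group arithmetic under the assumed layer-wise scheme (illustrative only; the exact grouping depends on the unreleased implementation):

    base_lr = 1e-4
    head_lr = base_lr * 4.0               # head_decay_rate -> 4e-4 for the detection head
    last_block_lr = base_lr * 0.9 ** 1    # deepest ViT block: 9e-5
    first_block_lr = base_lr * 0.9 ** 24  # shallowest ViT block: ~8e-6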