chenxin061 / pdarts

Code for our paper "Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation"

In the architecture search process, why use SGD for the operation weights and Adam for arch_params?

JarveeLee opened this issue

  1. Adam adapts the learning rate for each parameter.
  2. We follow previous NAS work in using the Adam optimizer to tune arch_params.
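
To illustrate the first point, here is a minimal pure-Python sketch of a single scalar update under each rule. It is not the repository's code: the helper names (`sgd_step`, `adam_step`, `first_step_sizes`) and all hyperparameter values are assumptions chosen for illustration, and the SGD rule is shown without momentum or weight decay to keep it short.

```python
import math

def sgd_step(param, grad, lr=0.025):
    """Plain SGD: the step is proportional to the raw gradient."""
    return param - lr * grad

def adam_step(param, grad, state, lr=6e-4, beta1=0.5, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter, with bias correction."""
    state["t"] += 1
    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias-corrected estimates.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    # Dividing by sqrt(v_hat) normalizes the step by the gradient's scale.
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)

def first_step_sizes(grad):
    """Return (SGD step size, Adam step size) for the first update from 0.0."""
    sgd_delta = abs(sgd_step(0.0, grad))
    adam_delta = abs(adam_step(0.0, grad, {"t": 0, "m": 0.0, "v": 0.0}))
    return sgd_delta, adam_delta

if __name__ == "__main__":
    for g in (0.01, 100.0):
        print(g, first_step_sizes(g))
```

In this sketch the SGD step grows in proportion to the gradient, while the first Adam step stays close to `lr` whatever the gradient's scale. That is one way to read "adaptive learning rate" here: arch_params whose gradients differ in magnitude from those of the operation weights still receive comparably sized updates.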