google-research / tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

Clarification is needed in the chapter "How should Adam’s hyperparameters be tuned?"

21kc-caracol opened this issue

[Screenshot of the playbook section "How should Adam's hyperparameters be tuned?"]

Please clarify: for a budget of 10-25 trials, should I

  1. First tune the learning rate, then tune beta1, or
  2. Create a search space for both parameters and run experiments to find the best combination?

An example would be appreciated.

I understood it as option 1: first tune for the best learning rate, then fix that value and start tuning beta1.

It’s the second option. If you only have a limited number of trials, you can focus on just tuning the learning rate, but if you have more compute/time, you can optimize beta1 and LR in conjunction.

If you change beta1 (or beta2, etc.), you'll need to retune the learning rate.
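
For concreteness, here is a minimal sketch of option 2 in Python: a joint random search over the learning rate and beta1 within a fixed trial budget (plain random search stands in for the quasi-random search the playbook discusses). The search ranges and the `train_and_evaluate` callable are illustrative assumptions, not part of the playbook.

```python
import math
import random


def sample_trial(rng):
    """Draw one (learning_rate, beta1) pair from a joint search space.

    The ranges are illustrative assumptions; adjust them to your workload.
    """
    # Search the learning rate on a log scale.
    learning_rate = math.exp(rng.uniform(math.log(1e-4), math.log(1e-1)))
    # Search 1 - beta1 on a log scale, so beta1 lands roughly in [0.8, 0.999].
    beta1 = 1.0 - math.exp(rng.uniform(math.log(1e-3), math.log(2e-1)))
    return {"learning_rate": learning_rate, "beta1": beta1}


def run_study(train_and_evaluate, num_trials=20, seed=0):
    """Joint search within a fixed trial budget.

    `train_and_evaluate` is supplied by the caller (hypothetical here) and
    should return a validation metric where higher is better.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(num_trials):
        hparams = sample_trial(rng)
        results.append((train_and_evaluate(**hparams), hparams))
    # The best (learning_rate, beta1) combination is chosen jointly, rather
    # than fixing the learning rate first and tuning beta1 afterwards.
    return max(results, key=lambda r: r[0])
```

With a budget of only ~10 trials, you could shrink this to a 1-D search over the learning rate alone; the joint version above is what "optimize beta1 and LR in conjunction" looks like when you have the extra trials to spend.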