khanrc / swad

Official Implementation of SWAD (NeurIPS 2021)

Should we repeat HP Search for each trial?

BierOne opened this issue

Hi @khanrc!

Thanks for this outstanding work. I noticed your previous answer in #3, which is very helpful!

But I am still unclear about the HP search process. Since we run many trials for each algorithm, should the HP search be repeated for each trial? Could you kindly clarify this?

Also, I would like to confirm: you are using validation performance as the selection indicator for the HP search, right?

Thank you so much.

Thank you for your interest in our research.

We conduct the HP search only once, for the first trial (trial_seed=0), and reuse the selected HPs for all subsequent trials to save computation. And yes, we use validation performance as the indicator for the HP search.
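For concreteness, here is a minimal sketch of this protocol (not the actual script from this repo; `train_and_evaluate` and the search space below are hypothetical placeholders that only illustrate the flow):

```python
import random

def train_and_evaluate(hparams: dict, trial_seed: int) -> float:
    """Placeholder for a full training run; returns a dummy validation accuracy."""
    random.seed(hash((tuple(sorted(hparams.items())), trial_seed)))
    return random.random()

def select_hparams(search_space: list) -> dict:
    """HP search is done once, on trial_seed=0 only, using validation accuracy."""
    return max(search_space, key=lambda hp: train_and_evaluate(hp, trial_seed=0))

search_space = [{"lr": 1e-5, "dropout": 0.0}, {"lr": 3e-5, "dropout": 0.1}]
best_hp = select_hparams(search_space)

# All subsequent trials reuse the same HPs; only the trial seed changes.
results = [train_and_evaluate(best_hp, trial_seed=seed) for seed in range(3)]
```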

That makes sense! By the way, did your re-implementation of ERM use an early-stopping strategy during training? I noticed that SWAD is based on early stopping, which makes me curious about its effect :)

Following the DomainBed protocol, ERM selects the checkpoint with the best validation performance during training (without early stopping). So it does not use early stopping itself, but this selection has a similar effect from the performance perspective.
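As a rough sketch of that selection scheme (assuming a PyTorch-style model; `train_step` and `evaluate_val` are hypothetical callables, not functions from this repository):

```python
import copy

def train_with_best_checkpoint(model, train_step, evaluate_val,
                               n_steps: int, checkpoint_freq: int):
    """Train for the full n_steps (no early stopping) and return the
    checkpoint with the best validation accuracy seen along the way."""
    best_val, best_state = float("-inf"), None
    for step in range(1, n_steps + 1):
        train_step(model)                    # one optimization step
        if step % checkpoint_freq == 0:      # evaluate at every checkpoint
            val_acc = evaluate_val(model)
            if val_acc > best_val:           # keep the best-so-far weights
                best_val = val_acc
                best_state = copy.deepcopy(model.state_dict())
    if best_state is not None:
        model.load_state_dict(best_state)    # report the best checkpoint
    return model, best_val
```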

Thank you!

Hi! I also noticed that you use a smaller CHECKPOINT_FREQ for the datasets. Did your ERM re-implementation also use the smaller CHECKPOINT_FREQ value (e.g., 200)? I'm attempting to reproduce your results using your suggested best params :) Thanks again.

Yes, we used checkpoint_freq=200, and checkpoint_freq=1000 for DomainNet.
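As a small illustration of that setting (just a sketch of the rule stated above, not the repo's config code):

```python
def get_checkpoint_freq(dataset: str) -> int:
    # 1000 for DomainNet, 200 for the other benchmarks, as stated above.
    return 1000 if dataset == "DomainNet" else 200

assert get_checkpoint_freq("DomainNet") == 1000
assert get_checkpoint_freq("PACS") == 200
```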