khanrc / swad

Official Implementation of SWAD (NeurIPS 2021)

Only 80% of the target-domain data is evaluated by default (args.holdout_fraction)

PhoebeChen123 opened this issue

Is this evaluation fair when compared with other SOTA methods that report accuracy on all of the target-domain data?

This is part of DomainBed, a fair evaluation protocol. SOTA methods can be compared fairly only when they are all evaluated under the same protocol. If you do not follow the DomainBed protocol, the comparison is not fair even if the same portion of the dataset is used (see the discussion of model selection in the DomainBed paper).

In other words: yes, it is unfair (and we are at some disadvantage). However, even if you evaluate on the full dataset, the comparison is still unfair unless you follow the fair comparison protocol, DomainBed.
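For readers unfamiliar with the split in question, here is a minimal sketch of how a holdout fraction of 0.2 divides one domain into an 80% part and a 20% part, with accuracy then computed on only one part. This is an illustration only, not the repository's actual code: `split_indices` and `accuracy` are hypothetical helper names, and the default of 0.2 is an assumption inferred from the "80%" in the issue title.

```python
import random


def split_indices(n_samples, holdout_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into (kept, held_out) lists.

    Hypothetical helper; holdout_fraction=0.2 is assumed from the issue title.
    """
    if not 0.0 <= holdout_fraction <= 1.0:
        raise ValueError("holdout_fraction must be in [0, 1]")
    indices = list(range(n_samples))
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    n_out = int(n_samples * holdout_fraction)
    held_out = indices[:n_out]
    kept = indices[n_out:]
    return kept, held_out


def accuracy(predictions, labels, indices):
    """Fraction of correct predictions, restricted to the given indices."""
    correct = sum(predictions[i] == labels[i] for i in indices)
    return correct / len(indices)


# A target domain with 1000 samples: 800 are evaluated, 200 are held out.
in_idx, out_idx = split_indices(1000, holdout_fraction=0.2, seed=0)
print(len(in_idx), len(out_idx))
```

The point of the seeded shuffle is that every method compared under the same protocol sees the same split, which is what makes the reported numbers comparable.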

Got it. Thank you very much!!