google / uncertainty-baselines

High-quality implementations of standard and SOTA methods on a variety of tasks.


Reproducing accuracy

Johswald opened this issue

Hey! Can you share a configuration to reproduce the reported results for BatchEnsemble (BE) with WRN-28-10 on CIFAR-100?
If I run the script as is, the accuracies are worse. This is the script: https://github.com/google/uncertainty-baselines/blob/master/baselines/cifar/batchensemble.py

Thank you!

Hey @Johswald! Thanks for asking. Have you verified that you're running under 8 accelerators? That's the exact setting in the default flags, which reproduces the result (we've checked on cloud TPUs). This is often the culprit, and may suggest we should somehow raise an error if, e.g., the system's # accelerators doesn't match FLAGS.num_cores.
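
A minimal sketch of what such a check could look like (hypothetical helper, assuming a TF2-style device query; this is not code that exists in the repo):

```python
# Hypothetical sanity check (not part of the repo): fail fast when the number
# of visible accelerators does not match the value passed via --num_cores.
import tensorflow as tf


def check_num_cores(expected_num_cores):
  """Raises if the detected accelerator count differs from the flag value."""
  gpus = tf.config.list_physical_devices('GPU')
  tpus = tf.config.list_physical_devices('TPU')
  num_detected = len(tpus) if tpus else len(gpus)
  if num_detected != expected_num_cores:
    raise ValueError(
        f'--num_cores={expected_num_cores}, but {num_detected} accelerator(s) '
        'were detected. The reported results assume the flag matches the '
        'hardware, since it determines the global batch size.')


# Example: call near the start of main(), e.g. check_num_cores(FLAGS.num_cores).
```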

Hey @dustinvtran
No, I am just running it on a single GPU, which is also quite fast. I can try to parallelise across the 8 GPUs on my node if you think this would make a difference? Algorithmically it should not, right? (No experience with TPUs :) )

I was asking about the config because it was not clear whether it is a CIFAR-10 or a CIFAR-100 config. So the default one in batchensemble.py is the CIFAR-100 config that gets 81.9% test accuracy (on TPUs)? Are you using the same config for CIFAR-10 and CIFAR-100?
Thank you again!

All the CIFAR baselines use num_cores=8 currently, which means 8 GPUs. This affects the global batch size for each gradient step. The default flag values are for CIFAR-10, and I think you just need to change FLAGS.dataset to cifar100.
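
To make the batch-size dependence concrete, here is a rough sketch (the per_core_batch_size name and its value are assumptions for illustration; only num_cores and dataset are confirmed above):

```python
# Rough sketch, not the repo's exact code: the global batch size per gradient
# step scales with the number of cores. per_core_batch_size is an assumed
# flag/default; num_cores=8 is the default mentioned above.
per_core_batch_size = 64
num_cores = 8

global_batch_size = per_core_batch_size * num_cores
print(global_batch_size)  # 512 on 8 accelerators, versus 64 on a single GPU
```

So with unchanged flags, a single-GPU run effectively trains with a much smaller global batch size than the reported configuration, which is likely where the accuracy gap comes from.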

Ah of course, thanks a lot! I will get back to you once the runs are done.

Closing due to inactivity, please reopen if needed!