wetliu / energy_ood

How to set the temperature in ODIN

motoight opened this issue

Hi, you really did a nice job on OOD detection!
However, after running your code to test each method's performance, I found that ODIN shows results different from those in your paper.
My command is:

python test.py --method cifar10_wrn_pretrained --num_to_avg 10 --score Odin

and the results I got are:

Namespace(T=1.0, droprate=0.3, layers=40, load='./snapshots', method_name='cifar10_wrn_pretrained', ngpu=1, noise=0, num_to_avg=10, out_as_pos=False, prefetch=2, score='Odin', test_bs=200, use_xent=False, validate=False, widen_factor=2)
Files already downloaded and verified
Model restored! Epoch: 99
Error Rate 5.16

Using CIFAR-10 as typical data
Error Detection
cifar10_wrn_pretrained
FPR95: 25.17
AUROC: 93.36
AUPR: 45.73

Texture Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.99997115 -0.7624054 -0.5197353 ]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  60.26  88.26  97.13
std:    0.64   0.31   0.11

SVHN Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.8380524 -0.99489635 -0.6921253 ]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  48.52  92.00  98.30
std:    0.71   0.19   0.05

LSUN_C Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.95805556 -0.4743406 -0.65227014]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  30.73  95.70  99.14
std:    0.86   0.10   0.02

LSUN_Resize Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.5177088 -0.9826832 -0.9692086]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  52.30  91.47  98.16
std:    1.21   0.12   0.04

iSUN Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.95436025 -0.9376564 -0.96572554]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  56.75  89.90  97.76
std:    0.86   0.20   0.05

Mean Test Results!!!!!
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  49.71  91.46  98.10
I notice that your default parameters set T=1 and noise=0. I guess the difference is caused by the parameter settings, but I didn't find any instructions in your paper. Could you explain that?
Thanks for your help!

Thank you for your interest. The default parameters for ODIN are not the ones used to report the results in our paper. Instead, we went through the validation set, picked the best-performing hyperparameters via the shell script, and then evaluated the final six test sets with those parameters.
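
For concreteness, here is a minimal sketch of such a sweep. The candidate grids follow the convention of the ODIN paper (integer temperatures up to 1000, noise magnitudes from 0 to 0.004) and are an assumption here, as are the -T, --noise, and --validate flags, which are only inferred from the Namespace dump printed above:

import itertools
import subprocess

# Assumed candidate grids, following the ODIN paper's convention.
temperatures = [1, 10, 100, 1000]                    # integer temperatures
noises = [round(i * 0.0002, 4) for i in range(21)]   # 0.0000 .. 0.0040

for T, eps in itertools.product(temperatures, noises):
    # -T, --noise, and --validate are assumed flag names inferred from
    # the Namespace(...) dump above; check test.py's argparse setup.
    subprocess.run(
        ["python", "test.py",
         "--method", "cifar10_wrn_pretrained",
         "--score", "Odin", "--validate",
         "-T", str(T), "--noise", str(eps)],
        check=True,
    )

Note that the 0.0014 asked about below falls on this grid (7 × 0.0002); one would keep the (T, noise) pair with the best validation numbers and run the test sets once with it.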

Thanks, I will try it!

I am wondering how you chose those hyperparameters in the shell script. I noticed that some values are not round numbers like 1, 10, or 100, but specific ones like 0.0014.
Thanks for your patience and help!

The temperatures are integers, but the noise magnitudes ODIN adds are floating-point values. You can refer to some of them here: https://github.com/facebookresearch/odin.
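
For reference, here is a minimal sketch of where the two hyperparameters enter ODIN, written as a generic PyTorch reimplementation rather than this repo's exact code (the default values shown are only illustrative):

import torch
import torch.nn.functional as F

def odin_score(model, x, T=1000.0, eps=0.0014):
    # ODIN (Liang et al., 2018): max softmax at temperature T after a
    # small input perturbation of magnitude eps. Assumes model.eval().
    x = x.clone().detach().requires_grad_(True)
    logits = model(x) / T
    # Cross-entropy against the predicted class is the negative log of
    # the max softmax; its input gradient tells us how to raise that score.
    loss = F.cross_entropy(logits, logits.argmax(dim=1))
    loss.backward()
    # Step against the gradient: in-distribution inputs tend to gain
    # more confidence from this step than OOD inputs do.
    x_pert = (x - eps * x.grad.sign()).detach()
    with torch.no_grad():
        probs = F.softmax(model(x_pert) / T, dim=1)
    return probs.max(dim=1).values  # higher = more in-distribution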

Hello! Thank you for your question and answer. However, when I tested different hyperparameters (i.e., temperature and noise) on the out-of-distribution validation datasets, the best-performing values were a temperature of 1 and a noise of 0.0008, which suggests the temperature scaling has no effect. I am unsure about the hyperparameter-tuning process: can you provide the temperature and noise used to report ODIN's performance in your paper? Thank you!
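
As a side note, the logs above rank settings by FPR95, AUROC, and AUPR; FPR95 is the fraction of OOD samples still accepted when the threshold retains 95% of in-distribution samples. A minimal sketch, assuming higher scores mean more in-distribution (as in the sketch above, not this repo's exact implementation):

import numpy as np

def fpr_at_95_tpr(scores_in, scores_out):
    # Threshold below which only 5% of in-distribution scores fall,
    # i.e. a 95% true positive rate on in-distribution data.
    threshold = np.percentile(scores_in, 5)
    # Fraction of OOD scores that still clear that threshold.
    return float(np.mean(np.asarray(scores_out) >= threshold))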

We just referred to the ODIN paper for those parameter values, not the validation set.

Thank you for your response!