wetliu / energy_ood

How to set the temperature in ODIN

motoight opened this issue

Hi, you really did a nice job on OOD detection!
However, after running your code to test each method's performance, I found that ODIN shows results different from those in your paper.
My command is:

python test.py --method cifar10_wrn_pretrained --num_to_avg 10 --score Odin

and the results I got are:

Namespace(T=1.0, droprate=0.3, layers=40, load='./snapshots', method_name='cifar10_wrn_pretrained', ngpu=1, noise=0, num_to_avg=10, out_as_pos=False, prefetch=2, score='Odin', test_bs=200, use_xent=False, validate=False, widen_factor=2)
Files already downloaded and verified
Model restored! Epoch: 99
Error Rate 5.16

Using CIFAR-10 as typical data
Error Detection
cifar10_wrn_pretrained
FPR95: 25.17
AUROC: 93.36
AUPR: 45.73

Texture Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.99997115 -0.7624054 -0.5197353 ]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  60.26  88.26  97.13
std:    0.64   0.31   0.11

SVHN Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.8380524 -0.99489635 -0.6921253 ]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  48.52  92.00  98.30
std:    0.71   0.19   0.05

LSUN_C Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.95805556 -0.4743406 -0.65227014]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  30.73  95.70  99.14
std:    0.86   0.10   0.02

LSUN_Resize Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.5177088 -0.9826832 -0.9692086]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  52.30  91.47  98.16
std:    1.21   0.12   0.04

iSUN Detection
[-0.9999975 -0.99999714 -0.9997482 ] [-0.95436025 -0.9376564 -0.96572554]
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  56.75  89.90  97.76
std:    0.86   0.20   0.05

Mean Test Results!!!!!
cifar10_wrn_pretrained
       FPR95  AUROC  AUPR
mean:  49.71  91.46  98.10
I notice that your default parameters set T=1 and noise=0. I guess the difference is caused by the parameter settings, but I didn't find any instructions in your paper. Could you explain that?
Thanks for your help!

Thank you for your interest. The default parameters for ODIN are not the ones used to report the results in our paper. Instead, we went through the validation set, picked the best-performing hyperparameters via the shell script, and then evaluated the final six test sets with those parameters.
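
For concreteness, here is a minimal sketch of such a sweep. The candidate grids follow the convention of the ODIN paper (integer temperatures up to 1000, noise magnitudes from 0 to 0.004) and are an assumption here, as are the -T, --noise, and --validate flags, which are only inferred from the Namespace dump printed above:

import itertools
import subprocess

# Assumed candidate grids, following the ODIN paper's convention.
temperatures = [1, 10, 100, 1000]                    # integer temperatures
noises = [round(i * 0.0002, 4) for i in range(21)]   # 0.0000 .. 0.0040

for T, eps in itertools.product(temperatures, noises):
    # -T, --noise, and --validate are assumed flag names inferred from
    # the Namespace(...) dump above; check test.py's argparse setup.
    subprocess.run(
        ["python", "test.py",
         "--method", "cifar10_wrn_pretrained",
         "--score", "Odin", "--validate",
         "-T", str(T), "--noise", str(eps)],
        check=True,
    )

Note that the 0.0014 asked about below falls on this grid (7 × 0.0002); one would keep the (T, noise) pair with the best validation numbers and run the test sets once with it.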

Thanks, I will try it!

I am wondering how you chose those hyperparameters in the shell script. I noticed that some values are not round numbers like 1, 10, or 100, but specific ones like 0.0014.
Thanks for your patience and help!

The temperatures are integers, but the noise magnitudes ODIN adds are floating-point values. You can refer to some of them here: https://github.com/facebookresearch/odin.
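
For reference, here is a minimal sketch of where the two hyperparameters enter ODIN, written as a generic PyTorch reimplementation rather than this repo's exact code (the default values shown are only illustrative):

import torch
import torch.nn.functional as F

def odin_score(model, x, T=1000.0, eps=0.0014):
    # ODIN (Liang et al., 2018): max softmax at temperature T after a
    # small input perturbation of magnitude eps. Assumes model.eval().
    x = x.clone().detach().requires_grad_(True)
    logits = model(x) / T
    # Cross-entropy against the predicted class is the negative log of
    # the max softmax; its input gradient tells us how to raise that score.
    loss = F.cross_entropy(logits, logits.argmax(dim=1))
    loss.backward()
    # Step against the gradient: in-distribution inputs tend to gain
    # more confidence from this step than OOD inputs do.
    x_pert = (x - eps * x.grad.sign()).detach()
    with torch.no_grad():
        probs = F.softmax(model(x_pert) / T, dim=1)
    return probs.max(dim=1).values  # higher = more in-distribution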

Hello! Thank you for your question and answer. However, when I tested different hyperparameters (i.e., temperature and noise) on the out-of-distribution validation datasets, the best-performing values were a temperature of 1 and a noise of 0.0008, which suggests the temperature scaling has no effect. I am unsure about the hyperparameter-tuning process: can you provide the temperature and noise used to report ODIN's performance in your paper? Thank you!
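
As a side note, the logs above rank settings by FPR95, AUROC, and AUPR; FPR95 is the fraction of OOD samples still accepted when the threshold retains 95% of in-distribution samples. A minimal sketch, assuming higher scores mean more in-distribution (as in the sketch above, not this repo's exact implementation):

import numpy as np

def fpr_at_95_tpr(scores_in, scores_out):
    # Threshold below which only 5% of in-distribution scores fall,
    # i.e. a 95% true positive rate on in-distribution data.
    threshold = np.percentile(scores_in, 5)
    # Fraction of OOD scores that still clear that threshold.
    return float(np.mean(np.asarray(scores_out) >= threshold))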

We just referred to the ODIN paper for those parameter values, not the validation set.

Thank you for your response!