# Calculate performance measures from the trained model `file_name.pth` located in `save_path`

Expected layout:

    |---- test.py
    |     |---- save_path
    |     |---- file_name.pth
    |     |---- result.log

Example commands:

    python test.py --save_path ./res_cifar10/ --file_name model --model res --data cifar10 --gpu 0
    python test.py --save_path ./vgg_svhn/ --file_name model --model vgg --data svhn --gpu 0

# Results

## Performance measures

- Accuracy
- AURC, EAURC
- Expected Calibration Error (ECE)
- Negative Log Likelihood (NLL)
- Brier Score
- AUPR Error, FPR at 95% TPR
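
As a rough illustration of three of the calibration measures listed above, the following is a minimal sketch in plain Python. It is not taken from `test.py`: the function names, the 15 equal-width bins for ECE, and the unnormalized Brier score (summed over classes, averaged over samples) are assumptions, and the repository's implementation may use different conventions.

```python
import math


def nll(probs, labels, eps=1e-12):
    """Mean negative log-likelihood of the true class (eps avoids log(0))."""
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


def brier_score(probs, labels):
    """Squared distance to the one-hot label, summed over classes, averaged over samples."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((p_k - (1.0 if k == y else 0.0)) ** 2 for k, p_k in enumerate(p))
    return total / len(labels)


def ece(probs, labels, n_bins=15):
    """Expected calibration error with equal-width bins over the max-probability confidence."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        # Half-open bins [lo, hi); a confidence of exactly 1.0 goes to the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    n = len(labels)
    total = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            acc = sum(a for _, a in members) / len(members)
            # Weight each bin's |accuracy - confidence| gap by its share of samples.
            total += len(members) / n * abs(acc - avg_conf)
    return total


# Toy softmax outputs for a 2-class problem (hypothetical data).
probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]]
labels = [0, 1, 1, 0]
print(nll(probs, labels), brier_score(probs, labels), ece(probs, labels))
```

Lower is better for all three. Some references divide the Brier score by the number of classes, so values are only comparable under the same convention.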
## Results on CIFAR-100

| Architecture | Dataset | Model | ACC | AURC | AUPR | FPR | ECE | NLL |
|---|---|---|---|---|---|---|---|---|
| PreActResNet110 | CIFAR100 | Baseline | 73.32 | 86.54 | 65.37 | 66.42 | 16.39 | 14.93 |
| PreActResNet110 | CIFAR100 | CRL-softmax | 74.34 | 72.35 | 68.13 | 61.30 | 11.45 | 10.86 |
| DenseNet_BC | CIFAR100 | Baseline | 75.13 | 72.40 | 66.41 | 62.85 | 12.94 | 11.59 |
| DenseNet_BC | CIFAR100 | CRL-softmax | 76.75 | 62.71 | 65.87 | 60.22 | 8.66 | 9.12 |
| VGG16 | CIFAR100 | Baseline | 73.62 | 77.80 | 68.11 | 62.21 | 19.95 | 18.35 |
| VGG16 | CIFAR100 | CRL-softmax | 73.84 | 71.98 | 71.04 | 59.06 | 13.92 | 13.03 |

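
For readers unfamiliar with the AURC measure reported above, here is a minimal sketch of how AURC and E-AURC are commonly computed from per-sample confidences and correctness flags. The function name `aurc_eaurc` and the toy inputs are hypothetical, the closed-form approximation of the optimal AURC follows the usual definition of E-AURC, and any scaling applied to the reported numbers is not reproduced here; `test.py` may differ in its details.

```python
import math


def aurc_eaurc(confidences, correct):
    """Return (AURC, E-AURC) for per-sample confidences and correctness flags."""
    n = len(confidences)
    # Rank samples from most to least confident (ties keep their input order).
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    errors_so_far = 0
    risk_sum = 0.0
    for k, i in enumerate(order, start=1):
        errors_so_far += 0 if correct[i] else 1
        # Selective risk at coverage k/n: error rate among the k most confident samples.
        risk_sum += errors_so_far / k
    aurc = risk_sum / n
    # Overall error rate at full coverage.
    r = errors_so_far / n
    # Approximate AURC of an ideal ranking that places every error last.
    optimal = r + (1.0 - r) * math.log(1.0 - r) if r < 1.0 else 1.0
    return aurc, aurc - optimal


# Toy example (hypothetical data): the single error has the lowest confidence.
confidences = [0.9, 0.6, 0.8, 0.7]
correct = [True, False, True, True]
aurc, eaurc = aurc_eaurc(confidences, correct)
print(aurc, eaurc)
```

Lower values indicate that errors are concentrated among low-confidence predictions. E-AURC subtracts the part of AURC that is explained by the error rate alone, which makes models with different accuracies easier to compare.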
More results can be found in the paper.
# Citation

    @inproceedings{moon2020crl,
      title={Confidence-Aware Learning for Deep Neural Networks},
      author={Moon, Jooyoung and Kim, Jihyo and Shin, Younghak and Hwang, Sangheum},
      booktitle={International Conference on Machine Learning},
      year={2020}
    }