Official PyTorch implementation and pretrained models of "Rethinking Out-of-Distribution (OOD) Detection: Masked Image Modeling Is All You Need" (MOOD for short). The paper is accepted by CVPR 2023.
Follow the official BEiT instructions to set up the environment.
We suggest organizing the datasets as follows:
- MOOD
  - data
    - cifar10
      - cifar-10-batches-py
    - cifar100
      - cifar-100-python
    - imagenet30
      - test
      - train
      - val
    - imagenet1k
      - test
      - train
      - val
    - $OOD_DATASET
      - images
      - ...
For example, to train on CIFAR-10, set the parameters `--data_path ./data/cifar10 --data_set cifar10`.
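If the CIFAR datasets are not available locally yet, one convenient way to obtain them in this layout is via torchvision (an optional assumption; any other way of placing `cifar-10-batches-py` / `cifar-100-python` under the corresponding folders works as well):

```python
# Download CIFAR-10/100 into the suggested layout (sketch; torchvision is
# assumed to be installed, it is not required by the layout itself).
from torchvision.datasets import CIFAR10, CIFAR100

CIFAR10(root="./data/cifar10", train=True, download=True)
CIFAR10(root="./data/cifar10", train=False, download=True)
CIFAR100(root="./data/cifar100", train=True, download=True)
CIFAR100(root="./data/cifar100", train=False, download=True)
```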
We provide `datasets/imagenet30.py` to create soft links for ImageNet-30.
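Conceptually, the soft-linking step amounts to something like the sketch below; the paths and the placeholder class list are illustrative only, and the actual logic and 30-class list live in `datasets/imagenet30.py`:

```python
# Illustrative sketch only: link ImageNet-30 classes out of an existing
# ImageNet-1k extraction so that no image data is duplicated. The real class
# list and directory handling are in datasets/imagenet30.py.
import os

IMAGENET1K_ROOT = "./data/imagenet1k"            # assumed existing extraction
IMAGENET30_ROOT = "./data/imagenet30"
IMAGENET30_CLASSES = ["n01498041", "n01518878"]  # placeholder, not the real 30-class list

for split in ("train", "val"):
    for wnid in IMAGENET30_CLASSES:
        src = os.path.abspath(os.path.join(IMAGENET1K_ROOT, split, wnid))
        dst = os.path.join(IMAGENET30_ROOT, split, wnid)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        if not os.path.islink(dst):
            os.symlink(src, dst)  # soft link into the MOOD data layout
```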
Follow BEiT to pre-train the model, or directly use the officially released weights pretrained on ImageNet-22k. The models are pretrained at 224x224 resolution.
- BEiT-base: #layers=12; hidden=768; FFN factor=4x; #heads=12; patch=16x16 (#parameters: 86M)
- BEiT-large: #layers=24; hidden=1024; FFN factor=4x; #heads=16; patch=16x16 (#parameters: 304M)
Download checkpoints that are self-supervised pretrained and then intermediate fine-tuned on ImageNet-22k (recommended):
- BEiT-base: beit_base_patch16_224_pt22k_ft22k
- BEiT-large: beit_large_patch16_224_pt22k_ft22k
Download checkpoints that are self-supervised pretrained on ImageNet-22k:
- BEiT-base: beit_base_patch16_224_pt22k
- BEiT-large: beit_large_patch16_224_pt22k
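The fine-tuning commands below pass a checkpoint URL to `--finetune` directly; if a local copy is preferred, the file can be fetched first (a small sketch reusing the pt22k URL from the command below):

```python
# Optional: download the BEiT-large pt22k checkpoint used by the fine-tuning
# command below, so --finetune can point at a local file instead of a URL.
import torch

URL = ("https://conversationhub.blob.core.windows.net/beit-share-public/"
       "beit/beit_large_patch16_224_pt22k.pth")
torch.hub.download_url_to_file(URL, "beit_large_patch16_224_pt22k.pth")
```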
For ViT-large, run:
OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=8 run_class_finetuning.py \
--model beit_large_patch16_224 --data_path $ID_DATA_PATH --data_set $ID_DATASET \
--finetune https://conversationhub.blob.core.windows.net/beit-share-public/beit/beit_large_patch16_224_pt22k.pth \
--batch_size 8 --lr 2e-5 --update_freq 2 \
--warmup_epochs 5 --epochs 100 --layer_decay 0.9 --drop_path 0.4 \
--weight_decay 1e-8 --enable_deepspeed
The hyper-parameters are the same as in the official BEiT.
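With 8 GPUs, `--batch_size 8` per GPU, and `--update_freq 2`, the effective batch size is 8 × 8 × 2 = 128.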
For one-class fine-tuning, assign a class as in-distribution by adding the flag `--class_idx $CLASS_IDX`; all other classes are treated as out-of-distribution. We support three in-distribution datasets: ['cifar10', 'cifar100', 'imagenet30']. Note that we only fine-tuned one-class ImageNet-30 in the original paper.
For ViT-large, run:
OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=8 run_class_finetuning.py \
--model beit_large_patch16_224 --data_path $ID_DATA_PATH --data_set $ID_DATASET \
--finetune https://conversationhub.blob.core.windows.net/beit-share-public/beit/beit_large_patch16_224_pt22k.pth \
--batch_size 8 --lr 2e-5 --update_freq 2 \
--warmup_epochs 5 --epochs 100 --layer_decay 0.9 --drop_path 0.4 \
--weight_decay 1e-8 --enable_deepspeed --class_idx $CLASS_IDX
For OOD detection metrics computed on features, we support ['mahalanobis', 'cos', 'projection', 'gauss', 'kmeans', 'euclidean', 'minkowski', 'chebyshev']. Run the following command:
python eval_with_features.py --ckpt $CKPT_PATH --data_set $ID_DATASET --ood_dataset $OOD_DATASET --ood_data_path $OOD_DATA_PATH --metric $OOD_METRIC
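As a rough illustration of what a feature-based metric such as `mahalanobis` measures (a generic sketch, not the exact code in `eval_with_features.py`): class-conditional Gaussians with a shared covariance are fit on in-distribution training features, and a test feature is scored by its distance to the nearest class mean.

```python
# Generic Mahalanobis OOD score on pre-extracted features (sketch only;
# eval_with_features.py may differ, e.g. in covariance estimation details).
import numpy as np

def fit_gaussians(train_feats, train_labels):
    """Per-class means and a shared (tied) precision matrix from ID training features."""
    labels = np.asarray(train_labels)
    classes = np.unique(labels)
    means = np.stack([train_feats[labels == c].mean(axis=0) for c in classes])
    centered = train_feats - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(train_feats)
    return means, np.linalg.pinv(cov)

def mahalanobis_score(feats, means, precision):
    """Higher score = closer to some class mean = more in-distribution."""
    diff = feats[:, None, :] - means[None, :, :]               # (N, C, D)
    dist = np.einsum("ncd,de,nce->nc", diff, precision, diff)  # squared Mahalanobis
    return -dist.min(axis=1)
```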
For OOD detection metrics computed on logits, we support ['softmax', 'entropy', 'energy', 'gradnorm']. Run the following command:
python eval_with_logits.py --ckpt $CKPT_PATH --data_set $ID_DATASET --ood_dataset $OOD_DATASET --ood_data_path $OOD_DATA_PATH --metric $OOD_METRIC
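For reference, the logit-based scores follow the standard definitions from the OOD detection literature; a minimal sketch of three of them is given below (higher score = more in-distribution; `gradnorm` needs a backward pass and is omitted). This is an illustration, not the repository's exact implementation:

```python
# Standard logit-based OOD scores (sketch only).
import torch
import torch.nn.functional as F

def msp_score(logits):        # 'softmax': maximum softmax probability
    return F.softmax(logits, dim=-1).max(dim=-1).values

def entropy_score(logits):    # 'entropy': negative predictive entropy
    p = F.softmax(logits, dim=-1)
    return (p * F.log_softmax(logits, dim=-1)).sum(dim=-1)

def energy_score(logits):     # 'energy': negative free energy, logsumexp of logits
    return torch.logsumexp(logits, dim=-1)
```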
For one-class OOD detection, assign a class as in-distribution by adding the flag `--class_idx $CLASS_IDX`; all other classes are treated as out-of-distribution. We support three in-distribution datasets: ['cifar10', 'cifar100', 'imagenet30'].
For OOD detection metrics computed on features, we support ['mahalanobis', 'cos', 'projection', 'gauss', 'kmeans', 'euclidean', 'minkowski', 'chebyshev']. Run the following command:
python eval_with_features.py --ckpt $CKPT_PATH --data_set $ID_DATASET --metric $OOD_METRIC --class_idx $CLASS_IDX
For OOD detection metrics computed on logits, we support ['softmax', 'entropy', 'energy', 'gradnorm']. Run the following command:
python eval_with_logits.py --ckpt $CKPT_PATH --data_set $ID_DATASET --metric $OOD_METRIC --class_idx $CLASS_IDX
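Evaluation is reported as AUROC; in the one-class setting the chosen in-distribution class serves as the positive label. Conceptually (a sketch assuming scikit-learn, not the repository's evaluation code):

```python
# One-class AUROC: scores of the ID class (class_idx) should rank above
# scores of every other class. Sketch only, assuming scikit-learn.
import numpy as np
from sklearn.metrics import roc_auc_score

def one_class_auroc(scores, labels, class_idx):
    """scores: OOD scores (higher = more ID); labels: ground-truth class ids."""
    is_id = (np.asarray(labels) == class_idx).astype(int)
    return roc_auc_score(is_id, scores)
```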
For CIFAR-10,
ID: CIFAR-10 (AUROC %) | SVHN | CIFAR-100 | LSUN | Avg |
---|---|---|---|---|
ckpt, distances | 99.8 | 99.4 | 99.9 | 99.7 |
For CIFAR-100,
ID: CIFAR-100 (AUROC %) | SVHN | CIFAR-10 | LSUN | Avg |
---|---|---|---|---|
ckpt, distances | 96.5 | 98.3 | 96.3 | 97.0 |
For ImageNet-30,
ID: ImageNet-30 (AUROC %) | Dogs | Places365 | Flowers102 | Pets | Food | Caltech256 | DTD | Avg |
---|---|---|---|---|---|---|---|---|
ckpt, distances | 99.40 | 98.90 | 100.00 | 99.10 | 96.60 | 99.50 | 98.9 | 98.9 |
For ImageNet-1k,
ID: ImageNet-1k (AUROC %) | iNaturalist | SUN | Places | Textures | Avg |
---|---|---|---|---|---|
ckpt, distances | 86.9 | 89.8 | 88.5 | 91.3 | 89.1 |
For CIFAR-10,
Method | Airplane | Automobile | Bird | Cat | Deer | Dog | Frog | Horse | Ship | Truck |
---|---|---|---|---|---|---|---|---|---|---|
ours | 98.6 | 99.3 | 94.3 | 93.2 | 98.1 | 96.5 | 99.3 | 99.0 | 98.8 | 97.8 |
For CIFAR-100,
Method | AUROC |
---|---|
ours | 96.4 |
For ImageNet-30,
Method | AUROC |
---|---|
ours | 92.0 |
If you find this repository useful, please consider citing our work:
@misc{https://doi.org/10.48550/arxiv.2302.02615,
  doi = {10.48550/ARXIV.2302.02615},
  url = {https://arxiv.org/abs/2302.02615},
  author = {Li, Jingyao and Chen, Pengguang and Yu, Shaozuo and He, Zexin and Liu, Shu and Jia, Jiaya},
  title = {Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  publisher = {arXiv},
  year = {2023}
}
This repository is built on the BEiT codebase and the SSD repository.