donghao51 / MultiOOD

Scaling Out-of-Distribution Detection for Multiple Modalities

MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities

ETH Zurich, University of Southern California, EPFL


MultiOOD is the first-of-its-kind benchmark for Multimodal OOD Detection, characterized by diverse dataset sizes and varying modality combinations.

MultiOOD Benchmark

MultiOOD is based on five public action recognition datasets (HMDB51, UCF101, EPIC-Kitchens, HAC, and Kinetics-600).

Prepare Datasets

  1. Download the HMDB51 video data from link and extract it. Download the HMDB51 optical flow data from link and extract it. Arrange the directory structure as follows:
├── video
|   ├── catch
|   |   ├── *.avi
|   ├── climb
|   |   ├── *.avi
|   ├── ...

├── flow
|   ├── *_flow_x.mp4
|   ├── *_flow_y.mp4
|   ├── ...
  2. Download the UCF101 video data from link and extract it. Download the UCF101 optical flow data from link and extract it. Arrange the directory structure as follows:
├── video
|   ├── *.avi
|   ├── ...

├── flow
|   ├── *_flow_x.mp4
|   ├── *_flow_y.mp4
|   ├── ...
  3. Download the EPIC-Kitchens video and optical flow data by running:
bash utils/ 

Download audio data from link.

Unzip all files and arrange the directory structure as follows:
├── rgb
|   ├── train
|   |   ├── D3
|   |   |   ├── P22_01.wav
|   |   |   ├── P22_01
|   |   |   |     ├── frame_0000000000.jpg
|   |   |   |     ├── ...
|   |   |   ├── P22_02
|   |   |   ├── ...
|   ├── test
|   |   ├── D3

├── flow
|   ├── train
|   |   ├── D3
|   |   |   ├── P22_01
|   |   |   |     ├── frame_0000000000.jpg
|   |   |   |     ├── ...
|   |   |   ├── P22_02
|   |   |   ├── ...
|   ├── test
|   |   ├── D3
  4. Download the HAC video, audio, and optical flow data from link and extract them. Arrange the directory structure as follows:
├── human
|   ├── videos
|   |   ├── ...
|   ├── flow
|   |   ├── ...
|   ├── audio
|   |   ├── ...

├── animal
|   ├── videos
|   |   ├── ...
|   ├── flow
|   |   ├── ...
|   ├── audio
|   |   ├── ...

├── cartoon
|   ├── videos
|   |   ├── ...
|   ├── flow
|   |   ├── ...
|   ├── audio
|   |   ├── ...
  5. Download the Kinetics-600 video data by running:
wget -i utils/filtered_k600_train_path.txt

Extract all files, then extract the audio tracks from the videos by running:

python utils/

Download Kinetics-600 optical flow data (kinetics600_flow_mp4_part_*) from link and extract (run cat kinetics600_flow_mp4_part_* > kinetics600_flow_mp4.tar.gz and then tar -zxvf kinetics600_flow_mp4.tar.gz).

Unzip all files and arrange the directory structure as follows:
├── video
|   ├── acting in play
|   |   ├── *.mp4
|   |   ├── *.wav
|   ├── ...

├── flow
|   ├── acting in play
|   |   ├── *_flow_x.mp4
|   |   ├── *_flow_y.mp4
|   ├── ...
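
The Kinetics-600 optical flow step above joins the split archive parts with `cat` and unpacks the result with `tar`. On systems without those tools, the same two operations can be sketched in Python. This is only an illustration: the helper names `join_parts` and `extract_archive` are hypothetical and are not part of the repository.

```python
import shutil
import tarfile
from pathlib import Path


def join_parts(part_paths, joined_path):
    """Concatenate split archive parts into one file.

    Parts are joined in sorted (lexicographic) order, which matches the
    order a shell glob such as kinetics600_flow_mp4_part_* expands to.
    """
    with open(joined_path, "wb") as out:
        for part in sorted(part_paths):
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(joined_path)


def extract_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive into dest_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
```

A typical call would pass `Path(".").glob("kinetics600_flow_mp4_part_*")` as `part_paths` and `kinetics600_flow_mp4.tar.gz` as `joined_path`, then extract the joined file into the dataset directory.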

Dataset Splits

The splits for Multimodal Near-OOD and Far-OOD Benchmarks are provided under HMDB-rgb-flow/splits/ for HMDB51, UCF101, HAC, and Kinetics-600, and under EPIC-rgb-flow/splits/ for EPIC-Kitchens.


An overview of the proposed framework for Multimodal OOD Detection. We introduce the A2D algorithm to enlarge the prediction discrepancy across modalities. We also propose NP-Mix, a novel outlier synthesis algorithm that explores broader feature spaces and complements A2D to strengthen OOD detection performance.
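
The training commands in this README pass a flag named `--a2d_max_hellinger`, which suggests that the discrepancy between modalities can be measured with the Hellinger distance. The sketch below shows only that distance between two per-modality softmax outputs. It is not the authors' A2D loss; the function names and the example logits are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


# Hypothetical logits from a video branch and an optical flow branch.
video_probs = softmax([2.0, 0.5, 0.1])
flow_probs = softmax([1.8, 0.2, 0.9])
discrepancy = hellinger(video_probs, flow_probs)
```

The distance is 0 when both modalities predict the same distribution and 1 when their predictions have no overlapping probability mass.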


The code was tested with Python 3.10.4 and torch 1.11.0+cu113 on an NVIDIA GeForce RTX 3090. Additional dependencies are listed in requirement.txt.


Download Pretrained Weights

  1. Download the SlowFast model for the RGB modality from link and place it under both the HMDB-rgb-flow/pretrained_models and EPIC-rgb-flow/pretrained_models directories.

  2. Download the SlowOnly model for the Flow modality from link and place it under both the HMDB-rgb-flow/pretrained_models and EPIC-rgb-flow/pretrained_models directories.

  3. Download the Audio model from link, rename it to vggsound_avgpool.pth.tar, and place it under both the HMDB-rgb-flow/pretrained_models and EPIC-rgb-flow/pretrained_models directories.
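
Each weight file above goes into two directories, and the audio model is renamed along the way. A small helper can do both in one call. This is a convenience sketch only: `place_weights` is a hypothetical name, and the repository itself does not ship such a script as far as this README shows.

```python
import shutil
from pathlib import Path

# The two target directories named in the steps above.
TARGET_DIRS = (
    "HMDB-rgb-flow/pretrained_models",
    "EPIC-rgb-flow/pretrained_models",
)


def place_weights(src, repo_root, rename_to=None):
    """Copy a downloaded weight file into both pretrained_models directories.

    rename_to optionally sets the destination file name, for example
    "vggsound_avgpool.pth.tar" for the audio model.
    """
    src = Path(src)
    placed = []
    for rel in TARGET_DIRS:
        dest_dir = Path(repo_root) / rel
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / (rename_to or src.name)
        shutil.copy2(src, dest)
        placed.append(dest)
    return placed
```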

Multimodal Near-OOD Benchmark

HMDB51 25/26

cd HMDB-rgb-flow/

Train the Near-OOD baseline model for HMDB:

python --near_ood --dataset 'HMDB' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/HMDB51/' 

Train the Near-OOD model using A2D for HMDB:

python --near_ood --dataset 'HMDB' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 1.0 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/HMDB51/' 

Train the Near-OOD model using A2D and NP-Mix for HMDB:

python --near_ood --dataset 'HMDB' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 0.5 --use_npmix --max_ood_hellinger --a2d_ratio_ood 0.5 --ood_entropy_ratio 0.5 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/HMDB51/' 

You can also download our provided checkpoints from link.

Save the evaluation files for HMDB (to save evaluation files for ASH or ReAct, also run the following command with the --use_ash or --use_react option):

python --bsz 16 --num_workers 2 --near_ood --dataset 'HMDB' --appen 'a2d_npmix_best_' --resumef '/path/to/'
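
For context on the `--use_react` option: ReAct, as described in its original paper, clips penultimate-layer activations at an upper threshold before the OOD score is computed. The sketch below shows only that clipping step on a plain list of activations. How this repository implements the option is not shown in this README, and the threshold value here is a placeholder rather than a recommended setting.

```python
def react_clip(features, threshold):
    """Clip each activation to at most `threshold` (the ReAct rectification step)."""
    return [min(x, threshold) for x in features]


# Hypothetical penultimate-layer activations and a placeholder threshold.
clipped = react_clip([0.5, 2.0, 3.5], threshold=1.0)
```

In the original ReAct formulation the threshold is chosen from a percentile of activations observed on in-distribution data.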

Evaluation for HMDB (change --postprocessor to different score functions):

python --postprocessor msp --appen 'a2d_npmix_best_' --dataset 'HMDB' --path 'HMDB-rgb-flow/'
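
The `msp` postprocessor name refers to the maximum softmax probability score, a standard OOD baseline: the largest class probability is used as the confidence that a sample is in-distribution. A minimal sketch of that score on raw logits follows. It assumes plain Python lists and does not reproduce the repository's evaluation code.

```python
import math


def msp_score(logits):
    """Maximum softmax probability; higher means more in-distribution-like."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    return max(exps) / sum(exps)


# A confident prediction scores higher than a flat one.
confident = msp_score([10.0, 0.0, 0.0])
uncertain = msp_score([1.0, 0.9, 1.1])
```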

UCF101 50/51

cd HMDB-rgb-flow/

Train the Near-OOD baseline model for UCF:

python --near_ood --dataset 'UCF' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/UCF101/' 

Train the Near-OOD model using A2D for UCF:

python --near_ood --dataset 'UCF' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 1.0 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/UCF101/' 

Train the Near-OOD model using A2D and NP-Mix for UCF:

python --near_ood --dataset 'UCF' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 0.5 --use_npmix --max_ood_hellinger --a2d_ratio_ood 0.5 --ood_entropy_ratio 0.5 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/UCF101/' 

You can also download our provided checkpoints from link.

Save the evaluation files for UCF (to save evaluation files for ASH or ReAct, also run the following command with the --use_ash or --use_react option):

python --bsz 16 --num_workers 2 --near_ood --dataset 'UCF' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Evaluation for UCF (change --postprocessor to different score functions):

python --postprocessor msp --appen 'a2d_npmix_best_' --dataset 'UCF' --path 'HMDB-rgb-flow/'

EPIC-Kitchens 4/4

cd EPIC-rgb-flow/

Train the Near-OOD baseline model for EPIC:

python --dataset 'EPIC' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/EPIC-Kitchens/' 

Train the Near-OOD model using A2D for EPIC:

python --dataset 'EPIC' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 1.0 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/EPIC-Kitchens/' 

Train the Near-OOD model using A2D and NP-Mix for EPIC:

python --dataset 'EPIC' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 0.1 --use_npmix --max_ood_hellinger --a2d_ratio_ood 0.1 --ood_entropy_ratio 0.1 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/EPIC-Kitchens/' 

You can also download our provided checkpoints from link.

Save the evaluation files for EPIC (to save evaluation files for ASH or ReAct, also run the following command with the --use_ash or --use_react option):

python --bsz 16 --num_workers 2  --ood_dataset 'EPIC' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Evaluation for EPIC (change --postprocessor to different score functions):

python --postprocessor msp --appen 'a2d_npmix_best_' --dataset 'EPIC' --path 'EPIC-rgb-flow/'

Kinetics-600 129/100

cd HMDB-rgb-flow/

Train the Near-OOD baseline model for Kinetics:

python --near_ood --dataset 'Kinetics' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/Kinetics-600/' 

Train the Near-OOD model using A2D for Kinetics:

python --near_ood --dataset 'Kinetics' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 1.0 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/Kinetics-600/' 

Train the Near-OOD model using A2D and NP-Mix for Kinetics:

python --near_ood --dataset 'Kinetics' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 0.1 --use_npmix --max_ood_hellinger --a2d_ratio_ood 0.1 --ood_entropy_ratio 0.1 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/Kinetics-600/' 

You can also download our provided checkpoints from link.

Save the evaluation files for Kinetics (to save evaluation files for ASH or ReAct, also run the following command with the --use_ash or --use_react option):

python --bsz 16 --num_workers 2 --near_ood --dataset 'Kinetics' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Evaluation for Kinetics (change --postprocessor to different score functions):

python --postprocessor msp --appen 'a2d_npmix_best_' --dataset 'Kinetics' --path 'HMDB-rgb-flow/'

Multimodal Far-OOD Benchmark

HMDB51 as ID

cd HMDB-rgb-flow/

Train the Far-OOD baseline model for HMDB:

python --dataset 'HMDB' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/HMDB51/' 

Train the Far-OOD model using A2D and NP-Mix for HMDB:

python --dataset 'HMDB' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 0.1 --use_npmix --max_ood_hellinger --a2d_ratio_ood 0.1 --ood_entropy_ratio 0.1 --nepochs 50 --appen '' --save_best --save_checkpoint --datapath '/path/to/HMDB51/' 

You can also download our provided checkpoints from link.

Save the evaluation files for HMDB (to save evaluation files for ASH or ReAct, also run the following command with the --use_ash or --use_react option; the same applies to the other datasets):

python --bsz 16 --num_workers 2 --dataset 'HMDB' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Save the evaluation files for UCF:

python --bsz 16 --num_workers 2 --far_ood --dataset 'HMDB' --ood_dataset 'UCF' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Save the evaluation files for HAC:

python --bsz 16 --num_workers 2 --far_ood --dataset 'HMDB' --ood_dataset 'HAC' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Save the evaluation files for Kinetics:

python --bsz 16 --num_workers 2 --far_ood --dataset 'HMDB' --ood_dataset 'Kinetics' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Save the evaluation files for EPIC:

cd EPIC-rgb-flow/
python --bsz 16 --num_workers 2 --far_ood --dataset 'HMDB' --ood_dataset 'EPIC' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Evaluation for UCF (change --postprocessor to different score functions, change --ood_dataset to UCF, EPIC, HAC, or Kinetics):

python --postprocessor msp --appen 'a2d_npmix_best_' --dataset 'HMDB' --ood_dataset 'UCF' --path 'HMDB-rgb-flow/'
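
Since the Far-OOD evaluation is repeated once per OOD dataset, the command lines can be generated in a loop. The sketch below builds them from the flags shown above. The script name is elided in this README, so it is passed in as a parameter, and `eval_script.py` in the usage line is a placeholder, not the real file name.

```python
import shlex


def far_ood_eval_commands(script, id_dataset, ood_datasets,
                          postprocessor="msp",
                          appen="a2d_npmix_best_",
                          path="HMDB-rgb-flow/"):
    """Return one evaluation command string per OOD dataset."""
    commands = []
    for ood in ood_datasets:
        args = [
            "python", script,
            "--postprocessor", postprocessor,
            "--appen", appen,
            "--dataset", id_dataset,
            "--ood_dataset", ood,
            "--path", path,
        ]
        commands.append(shlex.join(args))
    return commands


# "eval_script.py" is a placeholder for the repository's evaluation script.
commands = far_ood_eval_commands(
    "eval_script.py", "HMDB", ["UCF", "EPIC", "HAC", "Kinetics"]
)
```

Each returned string can be printed for review or passed to a shell.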

Kinetics as ID

cd HMDB-rgb-flow/

Train the Far-OOD baseline model for Kinetics:

python --dataset 'Kinetics' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/Kinetics-600/' 

Train the Far-OOD model using A2D and NP-Mix for Kinetics:

python --dataset 'Kinetics' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 0.1 --use_npmix --max_ood_hellinger --a2d_ratio_ood 0.1 --ood_entropy_ratio 0.1 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/Kinetics-600/' 

You can also download our provided checkpoints from link.

Save the evaluation files for Kinetics (to save evaluation files for ASH or ReAct, also run the following command with the --use_ash or --use_react option; the same applies to the other datasets):

python --bsz 16 --num_workers 2 --dataset 'Kinetics' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Save the evaluation files for HMDB:

python --bsz 16 --num_workers 2 --far_ood --dataset 'Kinetics' --ood_dataset 'HMDB' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Save the evaluation files for UCF:

python --bsz 16 --num_workers 2 --far_ood --dataset 'Kinetics' --ood_dataset 'UCF' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Save the evaluation files for HAC:

python --bsz 16 --num_workers 2 --far_ood --dataset 'Kinetics' --ood_dataset 'HAC' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Save the evaluation files for EPIC:

cd EPIC-rgb-flow/
python --bsz 16 --num_workers 2 --far_ood --dataset 'Kinetics' --ood_dataset 'EPIC' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Evaluation for UCF (change --postprocessor to different score functions, change --ood_dataset to UCF, EPIC, HAC, or HMDB):

python --postprocessor msp --appen 'a2d_npmix_best_' --dataset 'Kinetics' --ood_dataset 'UCF' --path 'HMDB-rgb-flow/'

Multimodal Near-OOD Benchmark with Video, Audio, and Optical Flow

EPIC-Kitchens 4/4

cd EPIC-rgb-flow/

Train the Near-OOD baseline model for EPIC:

python --dataset 'EPIC' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/EPIC-Kitchens/' 

Train the Near-OOD model using A2D and NP-Mix for EPIC:

python --dataset 'EPIC' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 0.5 --use_npmix --max_ood_hellinger --a2d_ratio_ood 0.5 --ood_entropy_ratio 0.5 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/EPIC-Kitchens/' 

You can also download our provided checkpoints from link.

Save the evaluation files for EPIC (to save evaluation files for ASH or ReAct, also run the following command with the --use_ash or --use_react option):

python --bsz 16 --num_workers 2  --ood_dataset 'EPIC' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Evaluation for EPIC (change --postprocessor to different score functions):

python --postprocessor msp --appen 'vfa_a2d_npmix_best_' --dataset 'EPIC' --path 'EPIC-rgb-flow/'

Kinetics-600 129/100

cd HMDB-rgb-flow/

Train the Near-OOD baseline model for Kinetics:

python --near_ood --dataset 'Kinetics' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/Kinetics-600/' 

Train the Near-OOD model using A2D and NP-Mix for Kinetics:

python --near_ood --dataset 'Kinetics' --lr 0.0001 --seed 0 --bsz 16 --num_workers 10 --start_epoch 10 --use_single_pred --use_a2d --a2d_max_hellinger --a2d_ratio 0.5 --use_npmix --max_ood_hellinger --a2d_ratio_ood 0.5 --ood_entropy_ratio 0.5 --nepochs 20 --appen '' --save_best --save_checkpoint --datapath '/path/to/Kinetics-600/' 

You can also download our provided checkpoints from link.

Save the evaluation files for Kinetics (to save evaluation files for ASH or ReAct, also run the following command with the --use_ash or --use_react option):

python --bsz 16 --num_workers 2 --near_ood --dataset 'Kinetics' --appen 'a2d_npmix_best_' --resumef '/path/to/'

Evaluation for Kinetics (change --postprocessor to different score functions):

python --postprocessor msp --appen 'vfa_a2d_npmix_best_' --dataset 'Kinetics' --path 'HMDB-rgb-flow/'


If you have any questions, please send an email to


If you find our work useful in your research please consider citing our paper:

@article{dong2024multiood,
    author   = {Hao Dong and Yue Zhao and Eleni Chatzi and Olga Fink},
    title    = {{MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities}},
    journal  = {arXiv preprint arXiv:2405.17419},
    year     = {2024},
}

Related Projects

NNG-Mix: Improving Semi-supervised Anomaly Detection with Pseudo-anomaly Generation

SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization

MOOSA: Towards Multimodal Open-Set Domain Generalization and Adaptation through Self-supervision


Many thanks to the excellent open-source projects SimMMDG and OpenOOD.

