Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks

Introduction

This repository contains code for the arXiv preprint "Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks" and also serves as a standalone benchmark suite for future methods.

The bud repository extends the PyTorch Image Models (timm) code base with

implementations of various uncertainty quantification methods as convenient wrapper classes ... (bud.wrappers)
... and corresponding loss functions (bud.losses)
an extended training loop that supports these methods out of the box (train.py)
a comprehensive evaluation suite for uncertainty quantification methods (validate.py)
support for CIFAR-10 ResNet variants, including Wide ResNets
plotting utilities to recreate the plots of the preprint
scripts to reproduce the results of the preprint

If you found the paper or the code useful in your research, please cite our work as

@article{mucsanyi2024benchmarking,
  title={Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks},
  author={Mucs{\'a}nyi, B{\'a}lint and Kirchhof, Michael and Oh, Seong Joon},
  journal={arXiv preprint arXiv:2402.19460},
  year={2024}
}

If you use the benchmark, please also cite the datasets it uses.

Installation

Packages

Install a Poetry environment for bud by running poetry install in the root folder. Switch to the environment's shell by running poetry shell.

OOD perturbations use Wand, a Python binding of ImageMagick. Follow these instructions to install ImageMagick. Wand is installed by the poetry install command above.

Datasets

CIFAR-10 is available in torchvision.datasets and is downloaded automatically. A local copy of the ImageNet-1k dataset is needed to run the ImageNet experiments.

The CIFAR-10H test dataset can be downloaded from this link.

The ImageNet-ReaL labels are available in this GitHub repository. The needed files are raters.npz and real.json.

Reproducing Results

We provide scripts that reproduce our results. These are found in the scripts folder for both ImageNet and CIFAR-10 and are named after the respective method.

We also provide access to the exact Singularity container we used in our experiments. The singularity_recipe.rcp file was used to create this container.

To recreate the plots used in the paper, use plots/imagenet/create_imagenet_plots.sh for the main paper's ImageNet results and the individual scripts in the plots folder for all other results (incl. appendix figures). To use these utilities, you have to specify your wandb API key in wandb_key.json.

bmucsanyi / bud