cnn deep-learning explainability ms-coco pytorch resnet visual-genome

LaViSE Reproduction

This is an attempt to reproduce the results and findings of LaViSE: Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention.

Requirements

Environment

Create the environment from the environment_lisa.yml file when running on the Lisa cluster, or from the environment_windesk file when running locally on Windows.

Data

To download the data, run the install_data.sh script. Alternatively, you can set the datasets up manually by following the instructions for the Visual Genome Python Driver and the COCO API wrapper. Both the wrappers and the API are already installed in the environment.

To format the Visual Genome object, run the notebook.

Running the model

Two job scripts are available to run the code. The train_explainer.job file trains the explainer on a ResNet-18 model with Visual Genome as the reference dataset. The infer_filter.job file contains several experiments that test different parameters; most of them are commented out, so uncomment the ones you want to run.

Alternatively, you can manually run

python train_explainer.py --refer <reference_dataset> --epochs <number_of_epochs> --anno_rate <annotation> --name <run_name>

on your desktop to train the explainer, and

python infer_filter.py --refer <reference_dataset> --anno_rate <annotation> --name <run_name>

to visualize the filters. Note that both commands must be run with matching hyperparameters: pass the same --refer, --anno_rate, and --name values to each.
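One way to keep the two steps in sync is to build both command lines from a single set of shared values. The following is a minimal sketch, not part of the repository: the helper names build_commands and run_pipeline are hypothetical, and the values "vg", "0.1", and "vg_run" are placeholder assumptions to be replaced with the values you actually use. Only the flags --refer, --anno_rate, --name, and --epochs come from the commands above.

```python
import subprocess
import sys

# Hypothetical example values (assumptions); substitute your own.
SHARED = {"refer": "vg", "anno_rate": "0.1", "name": "vg_run"}


def build_commands(shared, epochs):
    """Return (train_cmd, infer_cmd) argument lists that share the same hyperparameters."""
    shared_flags = []
    for key, value in shared.items():
        shared_flags += [f"--{key}", str(value)]
    # --epochs is only passed to the training script, as in the commands above.
    train_cmd = [sys.executable, "train_explainer.py", *shared_flags, "--epochs", str(epochs)]
    infer_cmd = [sys.executable, "infer_filter.py", *shared_flags]
    return train_cmd, infer_cmd


def run_pipeline(shared, epochs):
    """Train the explainer, then run filter inference with identical shared flags."""
    train_cmd, infer_cmd = build_commands(shared, epochs)
    subprocess.run(train_cmd, check=True)  # raises CalledProcessError if training fails
    subprocess.run(infer_cmd, check=True)
```

Calling run_pipeline(SHARED, epochs=5) from the repository root would then run both scripts in sequence; the epoch count here is likewise only an illustrative value.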

Original README of LaViSE page

This is the official repository for the paper "Explaining Deep Convolutional Neural Networks via Unsupervised Visual-Semantic Filter Attention", to appear in CVPR 2022.

Authors: Yu Yang, Seungbae Kim, Jungseock Joo

Datasets

Common Objects in Context (COCO)

Please follow the instructions in the COCO API README and here to download and set up the COCO data.

Visual Genome (VG)

Please follow the instructions in the README of the Python wrapper for the Visual Genome API and here.

GloVe

We load the pretrained GloVe word embeddings directly from the torchtext library.
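As a rough illustration of what loading GloVe through torchtext can look like, here is a minimal sketch. The choice of the "6B" variant with 300 dimensions is an assumption, since this README does not say which vectors the code uses. The helpers load_glove and word_similarity are hypothetical names added for illustration, and the similarity check is only a sanity test of the loaded vectors, not a step from the paper.

```python
import math


def load_glove(name="6B", dim=300):
    """Load pretrained GloVe vectors via torchtext (downloaded on first use).

    The variant (name, dim) is an assumption; check the repository code
    for the one actually used.
    """
    # Imported lazily so the helper below works without torchtext installed.
    from torchtext.vocab import GloVe
    return GloVe(name=name, dim=dim)


def word_similarity(vectors, word_a, word_b):
    """Cosine similarity between two word vectors.

    `vectors` is anything indexable by token that returns a sequence of
    numbers, such as a torchtext GloVe object or a plain dict of lists.
    """
    u = [float(x) for x in vectors[word_a]]
    v = [float(x) for x in vectors[word_b]]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

With the vectors downloaded, word_similarity(load_glove(), "dog", "cat") gives a quick check that the embeddings loaded correctly.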

Social Media Photographs of US Politicians (PoP)

The list of entities used to discover new concepts is provided in data/entities.txt.
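The entity list can be read with a few lines of Python. This sketch assumes that data/entities.txt holds one entity per line, which this README does not confirm. load_entities is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path


def load_entities(path="data/entities.txt"):
    """Read entity names, assuming one per line; blank lines are skipped."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

If the file turns out to use another layout, such as comma-separated values, the splitting logic needs to be adjusted accordingly.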

Getting started

Requirements

Required packages can be found in requirements.txt.

Usage

Train an explainer with

python train_explainer.py

Explain a target filter of any model with

python infer_filter.py

More features will be added soon! 🍻

Citation

@inproceedings{yang2022explaining,
    author    = {Yang, Yu and Kim, Seungbae and Joo, Jungseock},
    title     = {Explaining Deep Convolutional Neural Networks via Unsupervised Visual-Semantic Filter Attention},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2022},
}
