Understanding Deep Architectures by Visual Summaries [1/2]

M. Godi, M. Carletti, M. Aghaei, F. Giuliari, M. Cristani

NOTE

The project consists of two parts. Given a set of images belonging to the same class/category, the first part generates a crisp saliency mask for each image in the set. The second part computes a set of visual summaries from these crisp masks.

This is the FIRST part of the project.

The second part of the project, which computes the visual summaries, is available HERE.

Requirements

To generate crisp saliency masks (first part), install the following libraries:

PyTorch and torchvision for Python 3.5
Python 3.5 modules: numpy, cv2 (OpenCV)
[Optional] To run rank_regions.py and show_regions.py: matplotlib, skimage (scikit-image)
[Optional] Download ImageNet
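
Before running anything, it can help to confirm that the modules listed above are importable. The following is a minimal sketch, not part of the project: the helper name `missing_modules` is hypothetical, and the import names (`torch` for PyTorch, `cv2`, `skimage`) are the ones these libraries are normally imported under.

```python
import importlib.util

# Import names taken from the requirements list above. The optional ones
# are only needed for rank_regions.py and show_regions.py.
REQUIRED = ["torch", "torchvision", "numpy", "cv2"]
OPTIONAL = ["matplotlib", "skimage"]


def missing_modules(names):
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

An empty list for the required modules means the first part should at least be able to import its dependencies; it does not check library versions.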

To generate a set of visual summaries (second part) for a specified class you need to follow instructions HERE.

Usage [1/2]: generate crisp masks

Example 1 (single sample):

python3 main.py --modelname alexnet --input_path examples/original/robin3.jpg --dest_folder results/robin3 --results_file results/robin3/results.csv

Example 2 (generic folder):

python3 main.py --modelname alexnet --input_path examples/original --dest_folder results/alexnet_example_original --results_file results/alexnet_example_original/results.csv --file_ext .jpg
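
To preview which files in a folder carry the extension passed via --file_ext, a small helper like the one below can be used. This is only a sketch: `list_images` is a hypothetical name, and the exact, case-sensitive, non-recursive matching it performs is an assumption, since how main.py itself selects files is not described here.

```python
from pathlib import Path


def list_images(folder, file_ext=".jpg"):
    """Return sorted names of files directly inside `folder` ending in `file_ext`.

    Assumption: matching is exact and case-sensitive (".jpg" does not
    match ".JPG"); main.py may behave differently.
    """
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix == file_ext
    )


if __name__ == "__main__":
    folder = Path("examples/original")
    if folder.is_dir():
        for name in list_images(folder, ".jpg"):
            print(name)
```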

Example 3 (images from the same ImageNet class):

For example, for the class robin (class id = 15):

python3 main.py --modelname alexnet --input_path <path_to>/ImageNet/ILSVRC2012_img_train/15 --dest_folder results/alexnet_imagenet --results_file results/alexnet_imagenet/results.csv --file_ext .JPEG --target_id 15 --max_images 50
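
To repeat Example 3 for several classes, the command line can be assembled programmatically. The sketch below only uses the flags shown in the examples above. The helper name `build_command`, the per-class results subfolder, and the assumption that each class lives in a subfolder named after its id (as in the path of Example 3) are illustrative choices, not part of the project.

```python
import shlex
from pathlib import PurePosixPath


def build_command(train_root, class_id, modelname="alexnet", max_images=50):
    """Build the main.py argument list for one ImageNet class.

    Assumptions: images of class `class_id` are stored in
    `<train_root>/<class_id>`, and results go to a per-class subfolder
    (hypothetical layout) so that classes do not overwrite each other.
    """
    dest = PurePosixPath("results") / "{}_imagenet".format(modelname) / str(class_id)
    return [
        "python3", "main.py",
        "--modelname", modelname,
        "--input_path", str(PurePosixPath(train_root) / str(class_id)),
        "--dest_folder", str(dest),
        "--results_file", str(dest / "results.csv"),
        "--file_ext", ".JPEG",
        "--target_id", str(class_id),
        "--max_images", str(max_images),
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # To execute, pass each list to subprocess.run(cmd, check=True).
    for class_id in [15]:  # 15 = robin; extend with other class ids as needed
        cmd = build_command("<path_to>/ImageNet/ILSVRC2012_img_train", class_id)
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.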

Usage [2/2]: generate visual summaries

Follow the instructions HERE.

About

Code for the paper Understanding Deep Architectures by Interpretable Visual Summaries.

https://arxiv.org/abs/1801.09103v1

GNU General Public License v3.0