Code for metrics and SMILER experiment files used in the paper Kotseruba et al. "Do Saliency Models Detect Odd-One-Out Targets? New Datasets and Evaluations", BMVC, 2019.
Computing metrics
In our paper we propose two measures of the performance of saliency algorithms on stimuli in the Psychophysical Patterns (P3) dataset: the global saliency index (GSI) and the number of fixations needed to find the target.
For the Odd-One-Out (O3) dataset we compute the ratios of maximum saliency values within the target and the distractors (MSRtarg) and within the target and the background (MSRbg) areas. These metrics measure how well the algorithms are able to discriminate the target.
The code for the GSI and MSR metrics is defined in metrics.py.
Run python demo.py to see how these scores are computed for samples from the P3 and O3 datasets.
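As a rough illustration of the MSR ratios described above, the sketch below computes them from a saliency map and two binary masks. It is not the code in metrics.py: the function name, the mask arguments, the small epsilon and, in particular, the orientation of each ratio (which region goes in the numerator) are assumptions and should be checked against metrics.py and the paper.

```python
import numpy as np


def max_saliency_ratios(sal_map, target_mask, distractor_mask, eps=1e-12):
    """Return (msr_targ, msr_bg) for one saliency map (illustrative sketch)."""
    sal_map = np.asarray(sal_map, dtype=float)
    target_mask = np.asarray(target_mask, dtype=bool)
    distractor_mask = np.asarray(distractor_mask, dtype=bool)

    # Background = every pixel that is neither target nor distractor.
    background_mask = ~(target_mask | distractor_mask)

    max_target = sal_map[target_mask].max()
    max_distractor = sal_map[distractor_mask].max()
    max_background = sal_map[background_mask].max()

    # ASSUMPTION: target over distractors, background over target.
    # eps only guards against division by zero.
    msr_targ = max_target / (max_distractor + eps)
    msr_bg = max_background / (max_target + eps)
    return msr_targ, msr_bg
```

Under this assumed orientation, a map that singles out the target well would give msr_targ above 1 and msr_bg below 1.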
Computing the saliency maps
To compute saliency maps for images in P3 and O3 datasets:
- Download the datasets manually from http://data.nvision2.eecs.yorku.ca/P3O3/ or using the script in the data folder:
cd data
sh download_data.sh
- Install SMILER. Follow the instructions in the official repository https://github.com/TsotsosLab/SMILER.
- Run the models using the yaml files in the SMILER_experiments folder (update the paths to the P3 and O3 images if needed) as follows:
./smiler run -e SMILER_O3.yaml
Note that, depending on the system, running all 20 models on the P3 and O3 datasets may take several days. Running several experiments concurrently is not recommended.
P3 image properties
There is a csv text file image_properties.txt associated with each set of colors, orientations and sizes images. It lists the following set of properties for each image:
- path - type of the image (colors, orientations or sizes)
- name - image name (e.g. image_rectangle_-90_-30_45_15.png)
- bg_color - hex representation of the background color
- t_pos - target position in the 7x7 grid (from 1 to 49)
- t_x, t_y - pixel coordinates of the target center
- t_ori - target orientation in degrees
- d_ori - distractor orientation in degrees
- t_color - target color in hex representation
- d_color - distractor color in hex representation
- t_shape - target shape (e.g. rectangle, circle)
- d_shape - distractor shape
- t_height - target height in pixels
- d_height - distractor height in pixels
- t_height2, d_height2 - optional height parameter for some shapes
- t_width - target width in pixels
- d_width - distractor width in pixels
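The properties file can be read with Python's standard csv module. The sketch below is illustrative only. It assumes the file is comma-delimited and starts with a header row using the property names listed above, and that t_pos counts grid cells in row-major order (left to right, then top to bottom); neither assumption is confirmed here. The helper names and the values in the sample row are made up for the example.

```python
import csv
import io

GRID_SIZE = 7  # P3 stimuli are laid out on a 7x7 grid


def load_properties(csv_file):
    """Read an open image_properties.txt file into a list of dicts."""
    return list(csv.DictReader(csv_file))


def grid_cell(t_pos, grid_size=GRID_SIZE):
    """Convert 1-based t_pos to 0-based (row, col).

    ASSUMPTION: positions are numbered in row-major order.
    """
    index = int(t_pos) - 1
    if not 0 <= index < grid_size * grid_size:
        raise ValueError(f"t_pos out of range: {t_pos}")
    return divmod(index, grid_size)


# Made-up sample with a subset of the columns, standing in for the real file.
sample = io.StringIO(
    "path,name,bg_color,t_pos,t_x,t_y\n"
    "orientations,image_rectangle_-90_-30_45_15.png,#000000,25,512,512\n"
)
rows = load_properties(sample)
center = grid_cell(rows[0]["t_pos"])
```

For a real file, pass an open file handle instead of the sample, e.g. open("image_properties.txt", newline="").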
O3 image properties
There is a csv text file image_properties.txt listing the following properties for each image in the O3 dataset:
- image_name - file name
- num_distractors - number of distractors in the image
- target_type - object category for the target (to be ignored as most categories are empty)
- target_subtype - object sub-category (e.g. tulip, dress, pea)
- target_size - largest dimension of the target in pixels
- target_x, target_y - pixel coordinates of the target center
- orientation, color, focus, shape, size, location, pattern - feature dimensions where the target differs from the distractors (each can be set to 0 or 1)
Note: focus refers to camera focus (e.g. target may be in focus and the rest of the scene not), pattern is a catch-all feature for differences in texture, material or patterns on the objects, and location refers to grouping effects (e.g. distractors are close to one another and the target object is relatively far away).
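The binary feature columns make it easy to select subsets of O3 images, for example all images where the target differs from the distractors in color. The sketch below is a minimal example that assumes a comma-delimited file with a header row using the column names listed above. The helper names and the sample rows are hypothetical.

```python
import csv
import io

FEATURES = ("orientation", "color", "focus", "shape",
            "size", "location", "pattern")


def differing_features(row):
    """List the feature dimensions flagged with 1 for a single image."""
    return [name for name in FEATURES if row[name].strip() == "1"]


def select_by_feature(rows, feature):
    """Return names of images whose target differs in the given feature."""
    if feature not in FEATURES:
        raise ValueError(f"unknown feature: {feature}")
    return [row["image_name"] for row in rows if row[feature].strip() == "1"]


# Made-up rows with a subset of the columns, standing in for the real file.
sample = io.StringIO(
    "image_name,num_distractors,orientation,color,focus,shape,size,location,pattern\n"
    "a.jpg,5,0,1,0,0,0,0,1\n"
    "b.jpg,3,1,0,0,0,0,0,0\n"
)
rows = list(csv.DictReader(sample))
color_images = select_by_feature(rows, "color")
```

Because an image can be flagged in several dimensions at once, the same image may appear in more than one subset.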