FannyChao / MV-SalGAN360

An improved version of our previous work, SalGAN360, which predicts visual saliency on 360° images.

A Multi-FoV Viewport-based Visual Saliency Model Using Adaptive Weighting Losses for 360° Images

Abstract

360° media allows observers to explore the scene in all directions. The consequence is that human visual attention is guided not only by the perceived area in the viewport but also by the overall content in 360°. In this paper, we propose a method to estimate the 360° saliency map which extracts salient features from the entire 360° image in each viewport in three different Fields of View (FoVs). Our model is first pretrained with a large-scale 2D image dataset to enable the interpretation of semantic contents, then fine-tuned with a relatively small 360° image dataset. A novel weighting loss function attached with stretch weighted maps is introduced to adaptively weight the losses of three evaluation metrics and attenuate the impact of stretched regions in equirectangular projection during the training process. Experimental results demonstrate that our model achieves better performance with the integration of three FoVs and its diverse viewport images. Results also show that the adaptive weighting losses and stretch weighted maps effectively enhance the evaluation scores compared to the fixed weighting losses solutions. Compared to other state-of-the-art models, our method surpasses them on three different datasets and ranks at the top using 5 performance evaluation metrics on the Salient360! benchmark set.
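The idea of attenuating stretched regions can be illustrated with a minimal sketch. This README does not give the exact definition of the stretch weighted maps, so the sketch below assumes the common cosine-of-latitude weighting, which reflects how much an equirectangular projection stretches each row. The function names `stretch_weight_map` and `weighted_mse` are hypothetical, and a plain weighted mean squared error stands in for the three-metric adaptive loss described in the abstract.

```python
import numpy as np


def stretch_weight_map(height, width):
    """Per-pixel weights proportional to cos(latitude) for an equirectangular image.

    Assumption: cosine-of-latitude weighting, not necessarily the paper's definition.
    """
    # Latitude at each row centre, from about +pi/2 (top) to about -pi/2 (bottom).
    rows = np.arange(height, dtype=np.float64)
    latitude = (0.5 - (rows + 0.5) / height) * np.pi
    weights = np.cos(latitude)
    # Every pixel in a row shares the same weight.
    return np.repeat(weights[:, np.newaxis], width, axis=1)


def weighted_mse(pred, target, weights):
    """Mean squared error in which each pixel contributes according to its weight."""
    return float(np.sum(weights * (pred - target) ** 2) / np.sum(weights))
```

With such a map, errors near the poles, where the projection stretches the image most, contribute less to the loss than errors near the equator.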

Architecture

Figure: model architecture diagram

Visual Results

  • Qualitative results on Salient360! 2017 dataset

  • Qualitative results on Saliency in VR dataset

Requirements

  • Download SalGAN
  • Python2
  • Lasagne, Theano
  • Matlab

Pretrained models

Usage

Replace 01-data_preprocessing.py, 02-train.py, 03-predict.py, model_salgan.py, dataRepresentation.py, model.py and utils.py in SalGAN with the versions from this repository.

  • Test: To predict saliency maps, run MV-salgan360.m after specifying the path to the input images and the path for the output saliency maps.
  • Train:
      1. Run preprocessing_trainingdata.m to convert 360° images into multiple viewports.
      2. Run 01-data_preprocessing.py to make pickle files of the training images.
      3. Run 02-train.py to fine-tune the SalGAN model.
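
The viewport conversion in the first training step is done by the MATLAB script preprocessing_trainingdata.m. For readers unfamiliar with the operation, the following is a rough NumPy sketch of sampling one rectilinear viewport from an equirectangular image. It is not the repository's code: the function name `extract_viewport`, its parameters, and the nearest-neighbour sampling are illustrative assumptions, and the FoV values used by the model are not stated in this README.

```python
import numpy as np


def extract_viewport(equi, fov_deg, yaw_deg, pitch_deg, size):
    """Sample a square rectilinear viewport from an equirectangular image.

    Hypothetical helper using nearest-neighbour sampling; `equi` is an
    array of shape (H, W) or (H, W, C) covering 360° x 180°.
    """
    height, width = equi.shape[:2]
    half = np.tan(np.radians(fov_deg) / 2.0)
    # Pixel-centre coordinates on an image plane at unit distance from the camera.
    coords = (np.arange(size) + 0.5) / size * 2.0 - 1.0
    x, y = np.meshgrid(coords * half, -coords * half)
    z = np.ones_like(x)
    # Rotate the viewing rays: pitch about the x-axis, then yaw about the y-axis.
    pitch, yaw = np.radians(pitch_deg), np.radians(yaw_deg)
    y_r = y * np.cos(pitch) + z * np.sin(pitch)
    z_r = -y * np.sin(pitch) + z * np.cos(pitch)
    x_r = x * np.cos(yaw) + z_r * np.sin(yaw)
    z_r = -x * np.sin(yaw) + z_r * np.cos(yaw)
    # Convert ray directions to longitude/latitude, then to source pixel indices.
    lon = np.arctan2(x_r, z_r)
    lat = np.arctan2(y_r, np.hypot(x_r, z_r))
    cols = np.floor((lon / (2 * np.pi) + 0.5) * width).astype(int) % width
    rows = np.clip(np.floor((0.5 - lat / np.pi) * height).astype(int), 0, height - 1)
    return equi[rows, cols]
```

Calling this helper for several yaw and pitch angles, and repeating it for each FoV, would yield a set of viewport images per 360° image, which is the kind of input the later pickling and fine-tuning steps operate on.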

Citing

@ARTICLE{9122430,
  author={F. {Chao} and L. {Zhang} and W. {Hamidouche} and O. {Deforges}},
  journal={IEEE Transactions on Multimedia}, 
  title={A Multi-FoV Viewport-based Visual Saliency Model Using Adaptive Weighting Losses for 360° Images}, 
  year={2020},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TMM.2020.3003642}}


Languages

Python 75.4%, MATLAB 24.6%