(We are still updating this repository and adjusting its pages. If you would like to suggest a good paper, please open an issue. We hope to collect interesting work on omnidirectional vision!)

♦ Deep Learning for Omnidirectional Vision: A Survey and New Perspectives

Referenced paper: Deep Learning for Omnidirectional Vision: A Survey and New Perspectives

Table of Contents

♥ Introduction

Omnidirectional image (ODI) data is captured with a 360°×180° field of view, and omnidirectional vision has attracted growing attention because of its advantages in numerous applications. Our survey presents a systematic and comprehensive review and analysis of recent progress in deep learning methods for omnidirectional vision.

In particular, we created this open-source repository to provide a taxonomy of all the works mentioned in the survey, together with their code links. We will keep updating the repository with new works in this area, and we hope it can shed light on future research and help build a community for researchers working on omnidirectional vision.

♥ Background

Acquisition

An ideal 360° camera captures light arriving at the focal point from all directions, so the projection surface is a complete sphere. In practice, most 360° cameras cannot achieve this: dead angles leave the top and bottom regions uncaptured.
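As a rough illustration of the spherical projection described above, the sketch below maps a pixel of an equirectangular image to a unit viewing direction and checks whether an image row falls inside a limited vertical field of view. The function names, the axis convention (y up, z forward) and the 160° example value are assumptions made for this sketch only; they are not taken from the survey or from any particular camera.

```python
import math


def pixel_to_direction(u, v, width, height):
    """Map the centre of equirectangular pixel (u, v) to a unit vector (x, y, z).

    Assumed convention: longitude spans [-pi, pi) from left to right,
    latitude spans (+pi/2, -pi/2) from top to bottom, y is up, z is forward.
    """
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (v + 0.5) / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return x, y, z


def is_captured(v, height, vertical_fov_deg=180.0):
    """Return True if image row v lies inside the given vertical field of view.

    An ideal camera has vertical_fov_deg = 180; a smaller (hypothetical) value
    models the dead angles at the top and bottom of the sphere.
    """
    lat_deg = (0.5 - (v + 0.5) / height) * 180.0
    return abs(lat_deg) <= vertical_fov_deg / 2.0
```

For example, with a 180-row image and a hypothetical 160° vertical field of view, `is_captured(0, 180, 160.0)` is false for the topmost row, while rows near the equator are covered.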

♥ Convolution/Transformer Methods on ODIs

- Learning Spherical Convolution for Fast Features from 360° Imagery (2017) Paper Code

- Learning SO(3) Equivariant Representations with Spherical CNNs (Point cloud) (2018) Paper Code

- Spherical CNNs (2018) Paper Code

- SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images (2018) Paper Code

- Geometry Aware Convolutional Filters for Omnidirectional Images Representation (Graph-based) (2019) Paper Code

- Kernel Transformer Networks for Compact Spherical Convolution (2019) Paper Code

- SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360° Images (2019) Paper Code Reproduced Code

- Rotation Equivariant Graph Convolutional Network for Spherical Image Classification (2020) Paper Code

- Deepsphere: A Graph-based Spherical CNN (2020) Paper Code

- Equivariant Networks for Pixelized Spheres (2021) Paper Code

- Equivariance versus Augmentation for Spherical Images (2022) Paper Code

- Gauge Equivariant Convolutional Networks and the Icosahedral CNN (2019) Paper Code

- Spherical Transformer (2022) Paper Code

♥ Omnidirectional Vision Tasks

😃 Image&Video Manipulation

😉 Omnidirectional Image Generation (Completion)

- 360 Panorama Synthesis from a Sparse Set of Images with Unknown Field of View Paper Code

- 360-Degree Image Completion by Two-Stage Conditional GANs Paper Code

- BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-aided Adversarial Learning (RGB and Depth) Paper Code

- Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation Paper Code

- Spherical Image Generation from a Single Normal Field of View Image by Considering Scene Symmetry Paper Code

- Text2Light: Zero-Shot Text-Driven HDR Panorama Generation Paper Code

😉 Omnidirectional Image&Video Compression

😉 Omnidirectional Image&Video Cross-View Synthesis

😉 Omnidirectional Image&Video Lighting Estimation

😉 Omnidirectional Image&Video Super-Resolution

😉 Omnidirectional Image&Video Upright Adjustment

😉 Omnidirectional Image&Video Visual Quality Assessment

😃 Scene Understanding

😉 Object Detection

😉 Semantic Segmentation

😉 Optical Flow Estimation

😉 Video Summarization

😉 Monocular Depth Estimation

- Distortion-Aware Convolutional Filters for Dense Prediction in Panoramic Images (2018) Paper Code

- OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas (2018) Paper Code

- Spherical View Synthesis for Self-Supervised 360-degree Depth Estimation (2019) Paper Code

- Pano Popups: Indoor 3D Reconstruction with a Plane-Aware Network (2019) Paper Code

- BiFuse: Monocular 360-degree Depth Estimation via Bi-Projection Fusion (2020) Paper Code

- Geometric Structure Based and Regularized Depth Estimation From 360-degree Indoor Imagery (2020) Paper Code

- PADENet: An Efficient and Robust Panoramic Monocular Depth Estimation Network for Outdoor Scenes (2020) Paper Code

- UniFuse: Unidirectional Fusion for 360° Panorama Depth Estimation (2021) Paper Code

- Improving 360 Monocular Depth Estimation via Non-local Dense Prediction Transformer and Joint Supervised and Self-supervised Learning (2021) Paper Code

- SliceNet: deep dense depth estimation from a single indoor panorama using a slice-based representation (2021) Paper Code

- PanoDepth: A Two-Stage Approach for Monocular Omnidirectional Depth Estimation (2021) Paper Code

- Depth360: Self-supervised Learning for Monocular Depth Estimation using Learnable Camera Distortion Model (2021) Paper Code

- OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion (2022) Paper Code

- 360MonoDepth: High-Resolution 360° Monocular Depth Estimation (2022) Paper Code

- ACDNet: Adaptively Combined Dilated Convolution for Monocular Panorama Depth Estimation (2022) Paper Code

- GLPanoDepth: Global-to-Local Panoramic Depth Estimation (2022) Paper Code

- 360 Depth Estimation in the Wild: The Depth360 Dataset and the SegFuse Network (2022) Paper Code

- Deep Depth Estimation on 360° Images with a Double Quaternion Loss (2022) Paper Code

- PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation (2022) Paper Code

- Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion (2022) Paper Code

- Self-supervised indoor 360-degree depth estimation via structural regularization (2022) Paper Code

- Variational Depth Estimation on Hypersphere for Panorama (2022) Paper Code

- SphereDepth: Panorama Depth Estimation from Spherical Domain (2022) Paper Code

😃 3D Vision

😉 SLAM

😉 Stereo Matching

😉 Room Layout Estimation and Reconstruction

😃 Human-Machine Interaction

😉 Saliency Prediction

😉 Visual Question Answering

😉 Gaze Behavior

😉 Audio-Visual Scene Understanding

♥ Novel Learning Strategies

♥ Applications

♥ Discussion and New Perspectives

3D reconstruction

Efficient 3D Room Shape Recovery (traditional)
MVLayoutNet: 3D layout reconstruction with multi-view panoramas
HeadFusion: 360° Head Pose Tracking Combining 3D Morphable Model and 3D Reconstruction
Manhattan Room Layout Reconstruction from a Single 360° image: A Comparative Study of State-of-the-art Methods
Pano Popups: Indoor 3D Reconstruction with a Plane-Aware Network
Robust 3D reconstruction with omni-directional camera based on structure from motion
Can We Use Low-Cost 360 Degree Cameras to Create Accurate 3D Models

CNN

Learning Spherical Convolution for Fast Features from 360° Imagery
Learning SO(3) Equivariant Representations with Spherical CNNs (point cloud dataset)
SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360-degree Images
Concentric Spherical GNN for 3D Representation Learning
Spherical CNNs
Rotation Equivariant Graph Convolutional Network for Spherical Image Classification
Self-supervised Representation Learning Using 360° Data
SphereNet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images

Data Synthesis

BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-aided Adversarial Learning
Sat2Vid: Street-view Panoramic Video Synthesis from a Single Satellite Image
Snap Angle Prediction for 360° Panoramas

Dataset

Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans
Refer360-degree: A Referring Expression Recognition Dataset in 360-degree Images
360-Indoor: Towards Learning Real-World Objects in 360° Indoor Equirectangular Images
Recognizing Scene Viewpoint using Panoramic Place Representation
Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts
Matterport3D: Learning from RGB-D Data in Indoor Environments
A Saliency Dataset for 360-Degree Videos
360-degree Video Gaze Behaviour: A Ground-Truth Data Set and a Classification Algorithm for Eye Movements
AVTrack360: An Open Dataset and Software Recording People's Head Rotations Watching 360° Videos on an HMD
A Large-scale Compressed 360-Degree Spherical Image database: from Subjective Quality Evaluation to Objective Model Comparison
A Dataset of Head and Eye Movements for 360 Degree Images

Depth Estimation

360 Depth Estimation in the Wild -- The Depth360 Dataset and the SegFuse Network
360MonoDepth: High-Resolution 360° Monocular Depth Estimation
ACDNet: Adaptively Combined Dilated Convolution for Monocular Panorama Depth Estimation
BiFuse: Monocular 360-degree Depth Estimation via Bi-Projection Fusion
Improving 360° Monocular Depth Estimation via Non-local Dense Prediction Transformer and Joint Supervised and Self-supervised Learning
Distortion-Aware Convolutional Filters for Dense Prediction in Panoramic Images
GLPanoDepth: Global-to-Local Panoramic Depth Estimation
Geometric Structure Based and Regularized Depth Estimation From 360-degree Indoor Imagery
OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion
PanoDepth: A Two-Stage Approach for Monocular Omnidirectional Depth Estimation
SliceNet: deep dense depth estimation from a single indoor panorama using a slice-based representation
OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas
Spherical View Synthesis for Self-Supervised 360-degree Depth Estimation

Gaze following/Gaze estimation

Looking here or there? Gaze Following in 360-Degree Images
Gaze Prediction in Dynamic 360° Immersive Videos
Self-view Grounding Given a Narrated 360° Video
Gaze360: Physically Unconstrained Gaze Estimation in the Wild

Highlight detection

See360: Novel Panoramic View Interpolation
A Deep Ranking Model for Spatio-Temporal Highlight Detection from a 360-degree Video

Image Compression

Pseudocylindrical Convolutions for Learned Omnidirectional Image Compression

Image Super-resolution

LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-resolution
SphereSR: 360-degree Image Super-Resolution with Arbitrary Projection via Continuous Spherical Image Representation

Inpainting

Panoramic Image Reflection Removal
Privacy Protection in Street-View Panoramas using Depth and Multi-View Imagery

Object Detection

Field-of-View IoU for Object Detection in 360° Images
Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Videos
Spherical Criteria for Fast and Accurate 360-degree Object Detection
Kernel Transformer Networks for Compact Spherical Convolution

Omnidirectional Localization

PICCOLO: Point Cloud-Centric Omnidirectional Localization (traditional)
OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems

Orientation Estimation

Rotation Equivariant Orientation Estimation for Omnidirectional Localization

Outdoor Lighting Estimation

Learning High Dynamic Range from Outdoor Panoramas
Spatially-Varying Outdoor Lighting Estimation from Intrinsics

Panoramic Stitching

Minimal Solutions for Panoramic Stitching Given Gravity Prior (traditional)

Room Layout Estimation

AtlantaNet: Inferring the 3D Indoor Layout from a Single 360° Image beyond the Manhattan World Assumption
Deep3DLayout: 3D Reconstruction of an Indoor Layout from a Spherical Panoramic Image
Pano2CAD: Room Layout From A Single Panorama Image
DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama
HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation
OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas
SSLayout360: Semi-Supervised Indoor Layout Estimation from 360° Panorama
Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image
Corners for Layout: End-to-End Layout Recovery From 360 Images
LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering
Manhattan Room Layout Reconstruction from a Single 360° image: A Comparative Study of State-of-the-art Methods

Saliency Prediction

Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos
SalGCN: Saliency Prediction for 360-Degree Images Based on Spherical Graph Convolutional Networks
Rethinking 360° Image Visual Attention Modelling with Unsupervised Learning
Saliency Detection in 360° Videos
Your Attention is Unique: Detecting 360-Degree Video Saliency in Head-Mounted Display for Head Movement Prediction
360-aware saliency estimation with conventional image saliency predictors (traditional)

Scene Understanding

DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization
PanoContext: A Whole-Room 3D Context Model for Panoramic Scene Understanding (traditional)
Lighting, Reflectance and Geometry Estimation from 360° Panoramic Stereo
HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features
Automatic 3D Indoor Scene Modeling from Single Panorama
Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View
Eliminating the Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery
Recovering 3D existing-conditions of indoor structures from spherical images (traditional)

Stereo Matching

OmniMVS: End-to-End Learning for Omnidirectional Stereo Matching

Structure from Motion

Extreme Structure from Motion for Indoor Panoramas without Visual Overlaps

Survey

State-of-the-Art in 360° Video/Image Processing: Perception, Assessment and Compression
A Survey on Adaptive 360° Video Streaming: Solutions, Challenges and Opportunities
Annotated 360-Degree Image and Video Databases: A Comprehensive Survey
3D Scene Geometry Estimation from 360° Imagery: A Survey

Upright Adjustment

360° Camera Alignment via Segmentation
Deep Upright Adjustment of 360 Panoramas Using Multiple Roll Estimations

VR Sickness Assessment

VRSA Net: VR Sickness Assessment Considering Exceptional Motion for 360° VR Video
Advanced Spherical Motion Model and Local Padding for 360° Video Compression (traditional)

Video Compression

Learning Compressible 360° Video Isomers

Video Summarization

A Memory Network Approach for Story-Based Temporal Summarization of 360° Videos
Pano2Vid: Automatic Cinematography for Watching 360-degree Videos

View Synthesis

Automatic Content-aware Projection for 360° Videos (traditional)
Deep Multi Depth Panoramas for View Synthesis

Visual Quality Assessment

Viewport Proposal CNN for 360° Video Quality Assessment
Cross-Reference Stitching Quality Assessment for 360° Omnidirectional Images
Cubemap-Based Perception-Driven Blind Quality Assessment for 360-degree Images
MC360IQA: A Multi-channel CNN for Blind 360-Degree Image Quality Assessment

Visual Question Answering

Visual Question Answering on 360° Images
Pano-AVQA: Grounded Audio-Visual Question Answering on 360-degree Videos

Semantic Segmentation

Transfer beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation
What’s in my Room? Object Recognition on Indoor Panoramic Images
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation
Omnisupervised Omnidirectional Semantic Segmentation
DensePASS: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation with Attention-Augmented Context Exchange
DS-PASS: Detail-Sensitive Panoramic Annular Semantic Segmentation
Capturing Omni-Range Context for Omnidirectional Segmentation
Orientation-Aware Semantic Segmentation on Icosahedron Spheres

Citation

If you found our survey helpful for your research, please cite our paper as:

@article{Ai2022DeepLF,
  title={Deep Learning for Omnidirectional Vision: A Survey and New Perspectives},
  author={Hao Ai and Zidong Cao and Jin Zhu and Haotian Bai and Yucheng Chen and Ling Wang},
  journal={ArXiv},
  year={2022},
  volume={abs/2205.10468}
}