GLPanoDepth: Global-to-Local Panoramic Depth Estimation

02/06/2022
by Jiayang Bai, et al.

In this paper, we propose a learning-based method for predicting dense depth values of a scene from a monocular omnidirectional image. An omnidirectional image has a full field of view and describes the scene much more completely than a perspective image. However, the fully convolutional networks that most current solutions rely on fail to capture rich global context from the panorama. To address this issue, along with the distortion introduced by the equirectangular projection of the panorama, we propose Cubemap Vision Transformers (CViT), a new transformer-based architecture that can model long-range dependencies and extract distortion-free global features from the panorama. We show that cubemap vision transformers have a global receptive field at every stage and can provide globally coherent predictions for spherical signals. To preserve important local features, we further design a convolution-based branch in our pipeline (dubbed GLPanoDepth) and fuse global features from the cubemap vision transformers at multiple scales. This global-to-local strategy allows us to fully exploit useful global and local features in the panorama, achieving state-of-the-art performance in panoramic depth estimation.
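
To make the cubemap idea concrete, the following is a minimal sketch of how an equirectangular panorama might be resampled into six cube faces before being fed to a transformer. It is not the authors' implementation: the function names (`cube_face_directions`, `equirect_to_cubemap`), the axis convention (x right, y up, z forward), the face orientations, and the nearest-neighbour sampling are all assumptions chosen for illustration.

```python
import numpy as np


def cube_face_directions(face, size):
    """Unit ray directions for every pixel of one cube face.

    Assumed convention (hypothetical): x points right, y up, z forward.
    """
    # Pixel centres mapped to [-1, 1] on the face plane.
    t = (np.arange(size) + 0.5) / size * 2.0 - 1.0
    u, v = np.meshgrid(t, t)  # u grows left-to-right, v grows top-to-bottom
    one = np.ones_like(u)
    x, y, z = {
        "front": (u, -v, one),
        "back": (-u, -v, -one),
        "right": (one, -v, -u),
        "left": (-one, -v, u),
        "up": (u, one, v),
        "down": (u, -one, -v),
    }[face]
    d = np.stack([x, y, z], axis=-1)
    return d / np.linalg.norm(d, axis=-1, keepdims=True)


def equirect_to_cubemap(pano, size):
    """Resample an equirectangular image (H x W [x C]) into six faces."""
    h, w = pano.shape[:2]
    faces = {}
    for face in ("front", "back", "left", "right", "up", "down"):
        d = cube_face_directions(face, size)
        x, y, z = d[..., 0], d[..., 1], d[..., 2]
        lon = np.arctan2(x, z)                    # longitude, 0 at the front
        lat = np.arcsin(np.clip(y, -1.0, 1.0))    # latitude, positive is up
        # Longitude wraps around horizontally; latitude is clamped.
        col = ((lon / (2.0 * np.pi) + 0.5) * w).astype(int) % w
        row = np.clip(((0.5 - lat / np.pi) * h).astype(int), 0, h - 1)
        faces[face] = pano[row, col]              # nearest-neighbour lookup
    return faces
```

Each returned face is an ordinary perspective image with a 90-degree field of view, so it avoids the stretching that the equirectangular projection introduces near the poles. A practical pipeline would likely use bilinear interpolation rather than nearest-neighbour lookup.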

research
03/24/2021

Vision Transformers for Dense Prediction

We introduce dense vision transformers, an architecture that leverages v...
research
01/14/2023

S^2Net: Accurate Panorama Depth Estimation on Spherical Surface

Monocular depth estimation is an ambiguous problem, thus global structur...
research
03/06/2023

DwinFormer: Dual Window Transformers for End-to-End Monocular Depth Estimation

Depth estimation from a single image is of paramount importance in the r...
research
04/29/2022

SideRT: A Real-time Pure Transformer Architecture for Single Image Depth Estimation

Since context modeling is critical for estimating depth from a single im...
research
03/02/2022

OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion

A well-known challenge in applying deep-learning methods to omnidirectio...
research
04/16/2023

EGformer: Equirectangular Geometry-biased Transformer for 360 Depth Estimation

Estimating the depths of equirectangular (360) images (EIs) is challengi...
research
08/28/2023

PanoSwin: a Pano-style Swin Transformer for Panorama Understanding

In panorama understanding, the widely used equirectangular projection (E...
