Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume

03/31/2020
by Adrian Johnston, et al.

Monocular depth estimation has become one of the most studied applications in computer vision, where the most accurate approaches are based on fully supervised learning models. However, acquiring large and accurate ground-truth data sets to train these fully supervised methods is a major obstacle to further development of the area. Self-supervised methods trained with monocular videos are one of the most promising ways to mitigate this obstacle, owing to the widespread availability of training data. Consequently, they have been studied intensively, with the main ideas explored being different model architectures, loss functions, and occlusion masks to address non-rigid motion. In this paper, we propose two new ideas to improve self-supervised monocular trained depth estimation: 1) self-attention, and 2) discrete disparity prediction. Compared with the usual localised convolution operation, self-attention can exploit more general contextual information, which allows the inference of similar disparity values at non-contiguous regions of the image. Discrete disparity prediction has been shown by fully supervised methods to provide more robust and sharper depth estimation than the more common continuous disparity prediction, while also enabling the estimation of depth uncertainty. We show that extending the state-of-the-art self-supervised monocular trained depth estimator Monodepth2 with these two ideas allows us to design a model that produces the best results in the field on KITTI 2015 and Make3D, closing the gap with respect to self-supervised stereo training and fully supervised approaches.
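The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the linear quantisation of disparity levels, the `d_min` and `d_max` bounds, and the use of raw features as queries, keys and values (no learned projections) are all assumptions made for illustration. The first function lets every spatial position attend to every other position, which is how non-contiguous regions can share context. The second turns a per-pixel distribution over discrete disparity levels into an expected disparity, with the variance serving as a simple uncertainty measure.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(features):
    """Single-head dot-product self-attention over all spatial positions.

    features: array of shape (H, W, C).
    Assumption: queries, keys and values are the features themselves;
    a trained model would use learned projections instead.
    """
    h, w, c = features.shape
    flat = features.reshape(h * w, c)
    # Pairwise similarity between every pair of positions, near or far.
    scores = flat @ flat.T / np.sqrt(c)          # shape (H*W, H*W)
    weights = softmax(scores, axis=-1)
    return (weights @ flat).reshape(h, w, c)


def disparity_from_volume(logits, d_min=0.01, d_max=10.0):
    """Expected disparity and variance from a discrete disparity volume.

    logits: array of shape (K, H, W), one score per disparity level.
    Assumption: the K levels are spaced linearly in [d_min, d_max].
    """
    k = logits.shape[0]
    levels = np.linspace(d_min, d_max, k)
    probs = softmax(logits, axis=0)              # per-pixel distribution
    expected = np.tensordot(levels, probs, axes=(0, 0))
    second_moment = np.tensordot(levels ** 2, probs, axes=(0, 0))
    # Clip tiny negative values caused by floating-point rounding.
    variance = np.maximum(second_moment - expected ** 2, 0.0)
    return expected, variance
```

A pixel whose distribution is concentrated on one level yields a variance near zero, while a flat distribution yields a large variance, which is what makes the discrete representation usable as an uncertainty estimate.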


