On the Sins of Image Synthesis Loss for Self-supervised Depth Estimation

09/13/2021
by Zhaoshuo Li, et al.

Scene depth estimation from stereo and monocular imagery is critical for extracting 3D information for downstream tasks such as scene understanding. Recently, learning-based methods for depth estimation have received much attention due to their high performance and flexibility in hardware choice. However, collecting ground truth data for supervised training of these algorithms is costly or outright impossible. This circumstance suggests a need for alternative learning approaches that do not require corresponding depth measurements. Indeed, self-supervised learning of depth estimation provides an increasingly popular alternative. It is based on the idea that observed frames can be synthesized from neighboring frames if accurate depth of the scene is known, or in this case, estimated. We show empirically that, contrary to common belief, improvements in image synthesis do not necessarily lead to improvements in depth estimation. Rather, optimizing for image synthesis can result in diverging performance with respect to the main prediction objective, depth. We attribute this diverging phenomenon to aleatoric uncertainties, which originate from data. Based on our experiments on four datasets (spanning street, indoor, and medical) and five architectures (monocular and stereo), we conclude that this diverging phenomenon is independent of the dataset domain and not mitigated by commonly used regularization techniques. To underscore the importance of this finding, we include a survey of methods which use image synthesis, totaling 127 papers over the last six years. This observed divergence has not been previously reported or studied in depth, suggesting room for future improvement of self-supervised approaches which might be impacted by the finding.
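
The image synthesis idea described above can be illustrated with a minimal sketch for a rectified stereo pair. This is not the authors' implementation: the function names `synthesize_left` and `photometric_l1`, the use of grayscale images, linear interpolation along rows, border clamping, and the plain L1 photometric error are all assumptions chosen for brevity. The left view is reconstructed by sampling the right image at columns shifted by the estimated disparity, and the reconstruction error serves as the training signal.

```python
import numpy as np


def synthesize_left(right, disparity):
    """Reconstruct the left view by sampling the right image at x - d.

    right:     (H, W) grayscale right image (assumed rectified).
    disparity: (H, W) per-pixel disparity of the left view, in pixels.
    """
    h, w = right.shape
    # Source column in the right image for every left-view pixel.
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    # Clamp to the image border (an assumption; real pipelines often mask instead).
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    # Linear interpolation between the two neighboring source columns.
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def photometric_l1(left, right, disparity):
    """Mean absolute difference between the observed and synthesized left view."""
    return float(np.mean(np.abs(left - synthesize_left(right, disparity))))
```

One simple situation in which a low synthesis error says nothing about depth quality is a textureless region: if both views are a constant intensity, every disparity map reconstructs the left image perfectly, so `photometric_l1` is zero regardless of how wrong the disparity is. This toy case is only an illustration of data-dependent ambiguity, not a reproduction of the paper's experiments.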

research
04/14/2020

RealMonoDepth: Self-Supervised Monocular Depth Estimation for General Scenes

We present a generalised self-supervised learning approach for monocular...
research
06/04/2018

Digging Into Self-Supervised Monocular Depth Estimation

Depth-sensing is important for both navigation and scene understanding. ...
research
08/17/2020

Self-Supervised Learning for Monocular Depth Estimation from Aerial Imagery

Supervised learning based methods for monocular depth estimation usually...
research
03/31/2020

Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume

Monocular depth estimation has become one of the most studied applicatio...
research
09/17/2019

Spherical View Synthesis for Self-Supervised 360 Depth Estimation

Learning based approaches for depth perception are limited by the availa...
research
08/08/2019

Enhancing self-supervised monocular depth estimation with traditional visual odometry

Estimating depth from a single image represents an attractive alternativ...
research
04/22/2021

H-Net: Unsupervised Attention-based Stereo Depth Estimation Leveraging Epipolar Geometry

Depth estimation from a stereo image pair has become one of the most exp...
