Monocular Depth Estimation through Virtual-world Supervision and Real-world SfM Self-Supervision

03/22/2021
by   Akhil Gurram, et al.

Depth information is essential for on-board perception in autonomous driving and driver assistance. Monocular depth estimation (MDE) is appealing because it places appearance and depth in direct pixelwise correspondence without further calibration. The best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Usually, this GT is acquired at training time through a calibrated multi-modal suite of sensors. However, using only a monocular system at training time as well is cheaper and more scalable. This is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity diminish the usefulness of such self-supervision. In this paper, we perform monocular depth estimation by virtual-world supervision (MonoDEVS) and real-world SfM self-supervision. We compensate for the limitations of SfM self-supervision by leveraging virtual-world images with accurate semantic and depth supervision and by addressing the virtual-to-real domain gap. Our MonoDEVSNet outperforms previous MDE CNNs trained on monocular and even stereo sequences.


research
03/21/2018

Monocular Depth Estimation by Learning from Heterogeneous Datasets

Depth estimation provides essential information to perform autonomous dr...
research
09/16/2020

Calibrating Self-supervised Monocular Depth Estimation

In the recent years, many methods demonstrated the ability of neural net...
research
10/04/2019

Robust Semi-Supervised Monocular Depth Estimation with Reprojected Distances

Dense depth estimation from a single image is a key problem in computer ...
research
04/23/2021

Co-training for Deep Object Detection: Comparing Single-modal and Multi-modal Approaches

Top-performing computer vision models are powered by convolutional neura...
research
05/17/2017

Self-Supervised Siamese Learning on Stereo Image Pairs for Depth Estimation in Robotic Surgery

Robotic surgery has become a powerful tool for performing minimally inva...
research
10/04/2022

PlaneDepth: Plane-Based Self-Supervised Monocular Depth Estimation

Self-supervised monocular depth estimation refers to training a monocula...
research
02/28/2023

Learning to Estimate Single-View Volumetric Flow Motions without 3D Supervision

We address the challenging problem of jointly inferring the 3D flow and ...
