Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos

11/15/2018
by Vincent Casser, et al.

Learning to predict scene depth from RGB inputs is a challenging task for both indoor and outdoor robot navigation. In this work we address unsupervised learning of scene depth and robot ego-motion where supervision is provided by monocular videos, as cameras are the cheapest, least restrictive, and most ubiquitous sensor for robotics. Previous work in unsupervised image-to-depth learning has established strong baselines in the domain. We propose a novel approach which produces higher quality results, is able to model moving objects, and is shown to transfer across data domains, e.g. from outdoor to indoor scenes. The main idea is to introduce geometric structure in the learning process by modeling the scene and the individual objects; camera ego-motion and object motions are learned from monocular videos as input. Furthermore, an online refinement method is introduced to adapt learning on the fly to unknown domains. The proposed approach outperforms all state-of-the-art approaches, including those that handle motion, e.g. through learned flow. Our results are comparable in quality to those that use stereo as supervision, and they significantly improve depth prediction on scenes and datasets which contain a lot of object motion. The approach is of practical relevance, as it allows transfer across environments by transferring models trained on data collected for robot navigation in urban scenes to indoor navigation settings. The code associated with this paper can be found at https://sites.google.com/view/struct2depth.
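
The abstract does not spell out how depth and ego-motion are tied together geometrically, so the following is only a minimal sketch of the pinhole reprojection commonly used in this family of methods, not the authors' implementation. It lifts one pixel to 3D using a predicted depth, applies a rigid motion, and projects the point into a neighboring frame. All function names, the intrinsics tuple layout, and the numeric values are assumptions made for illustration.

```python
# Hypothetical helpers: names and argument layout are illustrative assumptions,
# not taken from the paper or its released code.

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with the given depth to a 3D point in the camera frame."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def transform(point, rotation, translation):
    """Apply a rigid motion; rotation is a 3x3 nested list, translation a 3-tuple."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(point, fx, fy, cx, cy):
    """Project a 3D point (with positive z) back to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def reproject_pixel(u, v, depth, intrinsics, rotation, translation):
    """Find where pixel (u, v) lands in a neighboring frame under a rigid motion."""
    fx, fy, cx, cy = intrinsics
    point = backproject(u, v, depth, fx, fy, cx, cy)
    moved = transform(point, rotation, translation)
    return project(moved, fx, fy, cx, cy)


# Usage with made-up values: identity rotation, small sideways translation.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
INTRINSICS = (100.0, 100.0, 50.0, 50.0)  # fx, fy, cx, cy (assumed)
shifted = reproject_pixel(60.0, 40.0, 2.0, INTRINSICS, IDENTITY, (0.2, 0.0, 0.0))
```

In a training setup of this kind, the reprojected coordinates would typically be used to sample the neighboring frame and compare it with the original image. One plausible reading of "modeling the individual objects" is that pixels belonging to a moving object receive their own rigid motion in place of the camera motion, but that reading is an assumption here.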


research
06/12/2019

Unsupervised Monocular Depth and Ego-motion Learning with Structure and Semantics

We present an approach which takes advantage of both structure and seman...
research
06/04/2020

Unsupervised Depth Learning in Challenging Indoor Video: Weak Rectification to Rescue

Single-view depth estimation using CNNs trained from unlabelled videos h...
research
10/30/2020

Unsupervised Monocular Depth Learning in Dynamic Scenes

We present a method for jointly training the estimation of depth, ego-mo...
research
10/20/2019

Moving Indoor: Unsupervised Video Depth Learning in Challenging Environments

Recently unsupervised learning of depth from videos has made remarkable ...
research
09/12/2018

Learning structure-from-motion from motion

This work is based on a questioning of the quality metrics used by deep ...
research
04/10/2019

Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras

We present a novel method for simultaneous learning of depth, egomotion,...
research
05/05/2021

Moving SLAM: Fully Unsupervised Deep Learning in Non-Rigid Scenes

We propose a method to train deep networks to decompose videos into 3D g...
