Unsupervised Scale-consistent Depth Learning from Video

by Jia-Wang Bian, et al.

We propose SC-Depth, a monocular depth estimator that requires only unlabelled videos for training and enables scale-consistent prediction at inference time. Our contributions include: (i) we propose a geometry consistency loss, which penalizes the inconsistency of predicted depths between adjacent views; (ii) we propose a self-discovered mask that automatically localizes moving objects, which violate the underlying static scene assumption and cause noisy signals during training; (iii) we demonstrate the efficacy of each component with a detailed ablation study and show high-quality depth estimation results on both the KITTI and NYUv2 datasets. Moreover, thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks can be readily integrated into the ORB-SLAM2 system for more robust and accurate tracking. The proposed hybrid Pseudo-RGBD SLAM shows compelling results on KITTI, and it generalizes well to the KAIST dataset without additional training. Finally, we provide several demos for qualitative evaluation.
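
The abstract names the geometry consistency loss and the self-discovered mask but does not give their formulas. The sketch below is one plausible reading, assuming a per-pixel normalized depth difference |Da - Db| / (Da + Db), a loss equal to its mean, and a mask equal to one minus the difference. The function name `geometry_consistency` and its arguments are hypothetical, and the inputs are assumed to be already warped into a common view and strictly positive; the warping step itself is not shown.

```python
from typing import List, Sequence, Tuple


def geometry_consistency(
    warped_depth: Sequence[float],
    target_depth: Sequence[float],
) -> Tuple[float, List[float]]:
    """Return (loss, mask) for two aligned, flattened depth maps.

    warped_depth: depth of view a projected into view b (hypothetical input,
                  assumed already warped and strictly positive).
    target_depth: depth predicted for view b, sampled at the same pixels.
    """
    if len(warped_depth) != len(target_depth) or not warped_depth:
        raise ValueError("depth maps must be non-empty and the same length")

    # Assumed form: normalized absolute difference per pixel.
    # For positive depths this lies in [0, 1).
    diff = [abs(a - b) / (a + b) for a, b in zip(warped_depth, target_depth)]

    # Loss: mean inconsistency over all supplied pixels.
    loss = sum(diff) / len(diff)

    # Mask: weight near 1 where the two views agree, lower where they
    # disagree (for example on moving objects or occlusions).
    mask = [1.0 - d for d in diff]
    return loss, mask


# Two pixels agree exactly; the third differs strongly between views.
loss, mask = geometry_consistency([2.0, 4.0, 10.0], [2.0, 4.0, 30.0])
print(loss, mask)
```

In this toy input the third pixel receives a lower mask weight than the two consistent pixels, which is the behaviour the abstract attributes to the mask: regions that break the static scene assumption contribute less to training.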
