DiPE: Deeper into Photometric Errors for Unsupervised Learning of Depth and Ego-motion from Monocular Videos

by Hualie Jiang, et al.

Unsupervised learning of depth and ego-motion from unlabelled monocular videos has recently drawn attention because it has notable advantages over supervised approaches. It uses the photometric errors between the target view and the views synthesized from its adjacent source views as the loss. Although significant progress has been made, the learning still suffers from occlusion and scene dynamics. This paper shows that carefully manipulating photometric errors can tackle these difficulties better. The primary improvement is achieved by masking out the invisible or nonstationary pixels in the photometric error map using a statistical technique. With this outlier masking approach, the depth of objects that move in the opposite direction to the camera can be estimated more accurately. To the best of our knowledge, such objects have not been seriously considered in previous work, even though they pose a higher risk in applications like autonomous driving. We also propose an efficient weighted multi-scale scheme to reduce the artifacts in the predicted depth maps. Extensive experiments on the KITTI dataset show the effectiveness of the proposed approaches. The overall system achieves state-of-the-art performance on both depth and ego-motion estimation.
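The outlier masking idea can be illustrated with a minimal sketch. The abstract does not say which statistic is used, so the percentile cut below is an assumption, as are the function names, the `upper_percentile` parameter, and the simplification of each view to a flat list of grayscale intensities with a plain absolute-difference error.

```python
from statistics import quantiles


def photometric_errors(target, synthesized):
    """Per-pixel absolute intensity difference (a simple L1 photometric error)."""
    if len(target) != len(synthesized):
        raise ValueError("views must have the same number of pixels")
    return [abs(t - s) for t, s in zip(target, synthesized)]


def outlier_mask(errors, upper_percentile=90):
    """Return 1 for kept pixels, 0 for pixels whose error exceeds the cut.

    Assumption: outliers are defined by a percentile of the error map.
    upper_percentile must be an integer in 1..99.
    """
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(errors, n=100, method="inclusive")
    threshold = cuts[upper_percentile - 1]
    return [1 if e <= threshold else 0 for e in errors]


def masked_loss(target, synthesized, upper_percentile=90):
    """Mean photometric error over the pixels that survive the mask."""
    errors = photometric_errors(target, synthesized)
    mask = outlier_mask(errors, upper_percentile)
    kept = sum(mask)
    if kept == 0:
        return 0.0
    return sum(e * m for e, m in zip(errors, mask)) / kept
```

In this toy setting, a single pixel with a large error (for example, one covered by an occluder or a moving object) is excluded from the loss, so it no longer pulls the average away from the well-explained pixels.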




