Self-Supervised Learning of Depth and Ego-Motion from Video by Alternative Training and Geometric Constraints from 3D to 2D

08/04/2021
by Jiaojiao Fang, et al.

Self-supervised learning of depth and ego-motion from unlabeled monocular video has achieved promising results and drawn extensive attention. Most existing methods jointly train the depth and pose networks with a photometric consistency loss between adjacent frames, following the principle of structure from motion (SfM). However, the coupling between the depth and pose networks strongly affects learning performance, and the re-projection relation is sensitive to scale ambiguity, especially for pose learning. In this paper, we aim to improve depth-pose learning without auxiliary tasks and address the above issues by alternately training the two tasks and incorporating epipolar geometric constraints into the Iterative Closest Point (ICP) based point-cloud matching process. In contrast to joint training of the depth and pose networks, our key idea is to better exploit the mutual dependency of the two tasks by alternately training each network with its own loss while keeping the other fixed. We also design a log-scale 3D structural consistency loss that puts more emphasis on smaller depth values during training. To make the optimization easier, we further incorporate the epipolar geometry into the ICP-based learning process for pose learning. Extensive experiments on various benchmark datasets demonstrate the superiority of our algorithm over state-of-the-art self-supervised methods.
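The abstract describes two concrete ingredients: an alternating training scheme in which each network is updated with its own loss while the other is frozen, and a log-scale 3D structural consistency loss that weights errors on small (near) depth values more heavily. Below is a minimal PyTorch-style sketch of those two ideas only; the network classes, data loader, and the photometric_loss / epipolar_icp_loss helpers are hypothetical placeholders and do not reproduce the authors' actual implementation.

```python
# Minimal sketch (assumptions: PyTorch; depth_net / pose_net, the loader, and the
# photometric_loss / epipolar_icp_loss callables are hypothetical placeholders).
import torch
import torch.nn.functional as F


def log_scale_3d_consistency_loss(points_src, points_tgt, eps=1e-6):
    """Compare two back-projected 3D point clouds in log space, so that errors
    on small depth values carry relatively more weight than on large ones."""
    # points_*: (B, 3, N) tensors of 3D points with positive depth.
    log_src = torch.log(points_src.clamp(min=eps))
    log_tgt = torch.log(points_tgt.clamp(min=eps))
    return F.l1_loss(log_src, log_tgt)


def train_alternately(depth_net, pose_net, loader, opt_depth, opt_pose,
                      photometric_loss, epipolar_icp_loss, epochs=1):
    """Alternating scheme: update one network with its own loss while the
    other network is frozen, then swap roles."""
    for _ in range(epochs):
        for tgt_img, src_img, intrinsics in loader:
            # Step 1: update the depth network; the pose network is frozen.
            with torch.no_grad():
                pose = pose_net(tgt_img, src_img)
            depth = depth_net(tgt_img)
            loss_depth = photometric_loss(tgt_img, src_img, depth, pose, intrinsics)
            opt_depth.zero_grad()
            loss_depth.backward()
            opt_depth.step()

            # Step 2: update the pose network; the depth network is frozen.
            with torch.no_grad():
                depth = depth_net(tgt_img)
            pose = pose_net(tgt_img, src_img)
            loss_pose = epipolar_icp_loss(tgt_img, src_img, depth, pose, intrinsics)
            opt_pose.zero_grad()
            loss_pose.backward()
            opt_pose.step()
```

The log-space comparison is one simple way to realize the stated emphasis on smaller depths: taking the logarithm compresses large depth values, so a given absolute error on a near point contributes more to the loss than the same error on a far point.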


Related research

Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency (09/19/2019)
The self-supervised learning of depth and pose from monocular sequences ...

Pose Constraints for Consistent Self-supervised Monocular Depth and Ego-motion (04/18/2023)
Self-supervised monocular depth estimation approaches suffer not only fr...

Self-Supervised Structure-from-Motion through Tightly-Coupled Depth and Egomotion Networks (06/07/2021)
Much recent literature has formulated structure-from-motion (SfM) as a s...

Self-supervised Learning with Geometric Constraints in Monocular Video: Connecting Flow, Depth, and Camera (07/12/2019)
We present GLNet, a self-supervised framework for learning depth, optica...

Two Stream Networks for Self-Supervised Ego-Motion Estimation (10/04/2019)
Learning depth and camera ego-motion from raw unlabeled RGB video stream...

Epipolar Geometry based Learning of Multi-view Depth and Ego-Motion from Monocular Sequences (12/23/2018)
Deep approaches to predict monocular depth and ego-motion have grown in ...

CbwLoss: Constrained Bidirectional Weighted Loss for Self-supervised Learning of Depth and Pose (12/12/2022)
Photometric differences are widely used as supervision signals to train ...