Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction

by Lokender Tiwari, et al.

Classical monocular Simultaneous Localization And Mapping (SLAM) and the recently emerging convolutional neural networks (CNNs) for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment. In this paper, we demonstrate that coupling the two lets the strengths of each mitigate the other's shortcomings. Specifically, we propose a joint narrow and wide baseline based self-improving framework. On the one hand, the CNN-predicted depth is leveraged to perform pseudo RGB-D feature-based SLAM, leading to better accuracy and robustness than the monocular RGB SLAM baseline. On the other hand, the bundle-adjusted 3D scene structures and camera poses from the more principled geometric SLAM are injected back into the depth network through novel wide baseline losses proposed for improving the depth prediction network, which then continues to contribute towards better pose and 3D structure estimation in the next iteration. We emphasize that our framework requires only unlabeled monocular videos in both the training and inference stages, and yet is able to outperform state-of-the-art self-supervised monocular and stereo depth prediction networks (e.g., Monodepth2) and feature-based monocular SLAM systems (i.e., ORB-SLAM). Extensive experiments on the KITTI and TUM RGB-D datasets verify the superiority of our self-improving geometry-CNN framework.
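
To make the "pseudo RGB-D" idea concrete, the following is a minimal sketch of how a CNN-predicted depth map could stand in for a depth sensor when lifting 2D feature keypoints to 3D points. It assumes a standard pinhole camera model; the function name `backproject`, the intrinsics `K`, and the constant depth map are hypothetical illustrations and are not taken from the paper.

```python
import numpy as np

def backproject(keypoints_uv, depth_map, K):
    """Lift 2D keypoints (u, v) to 3D camera-frame points using a depth map.

    Hypothetical helper: depth_map plays the role of the CNN-predicted depth,
    K is an assumed 3x3 pinhole intrinsics matrix.
    """
    uv = np.asarray(keypoints_uv, dtype=float)
    # Look up depth at the nearest pixel (row index = v, column index = u).
    cols = np.round(uv[:, 0]).astype(int)
    rows = np.round(uv[:, 1]).astype(int)
    d = depth_map[rows, cols]
    # Homogeneous pixel coordinates, one column per keypoint.
    pix = np.vstack([uv[:, 0], uv[:, 1], np.ones(len(uv))])
    # Rays with unit z in the camera frame, scaled by the depth.
    rays = np.linalg.inv(K) @ pix
    return (rays * d).T

# Assumed intrinsics: focal length 100 px, principal point (50, 40).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 40.0],
              [0.0, 0.0, 1.0]])
depth = np.full((80, 100), 2.0)  # stand-in for a predicted depth map
points = backproject([(50, 40), (75, 40)], depth, K)
```

In an RGB-D style pipeline, 3D points obtained this way can initialize map points at a consistent scale, whereas a purely monocular system has to triangulate them from multiple views.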




CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction

Given the recent advances in depth prediction from Convolutional Neural ...

A Front-End for Dense Monocular SLAM using a Learned Outlier Mask Prior

Recent achievements in depth prediction from a single RGB image have pow...

SLAM Endoscopy enhanced by adversarial depth prediction

Medical endoscopy remains a challenging application for simultaneous loc...

Unsupervised Scale-consistent Depth Learning from Video

We propose a monocular depth estimator SC-Depth, which requires only unl...

Scale-aware direct monocular odometry

We present a framework for direct monocular odometry based on depth pred...

Online Mutual Adaptation of Deep Depth Prediction and Visual SLAM

The ability of accurate depth prediction by a CNN is a major challenge f...

Visual SLAM: What are the Current Trends and What to Expect?

Vision-based sensors have shown significant performance, accuracy, and e...
