A Front-End for Dense Monocular SLAM using a Learned Outlier Mask Prior

by Yihao Zhang, et al.

Recent advances in depth prediction from a single RGB image have opened a research area that combines convolutional neural networks (CNNs) with classical simultaneous localization and mapping (SLAM) algorithms. The CNN depth prediction provides a reasonable starting point for the optimization in a traditional SLAM algorithm, while the SLAM algorithm in turn refines the CNN prediction online. However, most current CNN-SLAM approaches exploit only the depth prediction and not the other outputs of a CNN. In this work, we explore the use of the outlier mask, a by-product of unsupervised learning of depth from video, as a prior in a classical probability model for depth estimate fusion, in order to improve the outlier-resistant tracking performance of a SLAM front-end. In addition, some previous CNN-SLAM work builds on feature-based sparse SLAM methods, which wastes the per-pixel dense prediction from a CNN. In contrast to these sparse methods, we devise a dense CNN-assisted SLAM front-end that can be implemented with TensorFlow, and we evaluate it on both indoor and outdoor datasets.
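To make the idea of a mask prior in depth fusion concrete, the following is a minimal single-pixel sketch, not the authors' formulation. It assumes a Gaussian inlier model, a uniform outlier model over a depth range, and a moment-matched Gaussian posterior. The function name `fuse_depth` and all of its parameters are hypothetical; `inlier_prior` stands in for a per-pixel value that a learned outlier mask might supply.

```python
import math


def fuse_depth(mu, var, z, obs_var, inlier_prior, d_min, d_max):
    """Fuse one depth measurement z into a Gaussian depth estimate (mu, var).

    Assumptions (illustrative only): inliers are Gaussian around the true
    depth with variance obs_var, outliers are uniform on [d_min, d_max],
    and inlier_prior in [0, 1] is the prior probability that z is an inlier.
    Returns the updated mean, updated variance, and posterior inlier weight.
    """
    # Predictive likelihood of z if it is an inlier: N(z; mu, var + obs_var).
    s2 = var + obs_var
    inlier_lik = math.exp(-0.5 * (z - mu) ** 2 / s2) / math.sqrt(2.0 * math.pi * s2)
    # Likelihood of z if it is an outlier: uniform over the depth range.
    outlier_lik = 1.0 / (d_max - d_min)

    # Posterior probability that z is an inlier (Bayes rule with the mask prior).
    num = inlier_prior * inlier_lik
    w = num / (num + (1.0 - inlier_prior) * outlier_lik)

    # Standard product-of-Gaussians update, used if z is an inlier.
    fused_mu = (obs_var * mu + var * z) / s2
    fused_var = var * obs_var / s2

    # Moment-match the mixture w * N(fused) + (1 - w) * N(previous).
    new_mu = w * fused_mu + (1.0 - w) * mu
    new_var = (w * (fused_var + fused_mu ** 2)
               + (1.0 - w) * (var + mu ** 2)
               - new_mu ** 2)
    return new_mu, new_var, w
```

In this sketch, a pixel that the mask flags as a likely outlier (small `inlier_prior`) leaves the previous estimate almost unchanged, while a trusted pixel receives close to the full Gaussian update. A dense front-end would apply the same update to every pixel of a depth map at once.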




CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction

Given the recent advances in depth prediction from Convolutional Neural ...

Online Mutual Adaptation of Deep Depth Prediction and Visual SLAM

The ability of accurate depth prediction by a CNN is a major challenge f...

RGB-D SLAM Using Attention Guided Frame Association

Deep learning models as an emerging topic have shown great progress in v...

Sequential Learning of Visual Tracking and Mapping Using Unsupervised Deep Neural Networks

We proposed an end-to-end deep learning-based simultaneous localization ...

gradSLAM: Dense SLAM meets Automatic Differentiation

The question of "representation" is central in the context of dense simu...

Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction

Classical monocular Simultaneous Localization And Mapping (SLAM) and the...

SemanticFusion: Dense 3D Semantic Mapping with Convolutional Neural Networks

Ever more robust, accurate and detailed mapping using visual sensing has...