Learning Deeply Supervised Visual Descriptors for Dense Monocular Reconstruction

Visual SLAM (Simultaneous Localization and Mapping) methods typically rely on handcrafted visual features or raw RGB values to establish correspondences between images. These features, while suitable for sparse mapping, often produce ambiguous matches in textureless regions during dense reconstruction because of the aperture problem. In this work, we explore the use of learned features for the matching task in dense monocular reconstruction. We propose a novel convolutional neural network (CNN) architecture, together with a deeply supervised feature learning scheme, that regresses a visual descriptor for every pixel of an image, with descriptors tailored to dense monocular SLAM. In particular, our learning scheme minimizes a multi-view matching cost-volume loss with respect to the features regressed at multiple stages within the network. This explicitly teaches the network contextual features suited to dense matching along the epipolar line between images captured by a moving monocular camera. We use the learned features from our model for depth estimation inside a real-time dense monocular SLAM framework, where the photometric error is replaced by our learned descriptor error. Our evaluation on several challenging indoor scenes demonstrates greatly improved reconstruction accuracy for well-known dense SLAM systems such as DTAM, without compromising their real-time performance.
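
To make the idea of replacing photometric error with a descriptor error concrete, the following is a minimal NumPy sketch of a two-view plane-sweep matching cost volume. It is an illustration under stated assumptions, not the authors' implementation: the function names `descriptor_cost_volume` and `winner_take_all_depth`, the mean L1 error, nearest-neighbour sampling, and the pose convention (`X_src = R @ X_ref + t`) are all choices made here for clarity. Passing RGB images as the feature maps would give a photometric cost; passing per-pixel descriptors gives the descriptor cost.

```python
import numpy as np

def descriptor_cost_volume(feat_ref, feat_src, K, R, t, depths):
    """Mean L1 descriptor error for every reference pixel and depth hypothesis.

    feat_ref, feat_src: (H, W, C) per-pixel descriptors (hypothetical inputs).
    K: (3, 3) pinhole intrinsics. R, t: assumed pose with X_src = R @ X_ref + t.
    depths: (D,) depth hypotheses. Returns a (D, H, W) array; entries whose
    projection falls outside the source image are set to +inf.
    """
    H, W, C = feat_ref.shape
    t = np.asarray(t, dtype=float).reshape(3, 1)
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, H*W)
    rays = np.linalg.inv(K) @ pix          # back-projected rays at unit depth
    ref = feat_ref.reshape(-1, C)
    cost = np.full((len(depths), H * W), np.inf)
    for i, d in enumerate(depths):
        p = K @ (R @ (rays * d) + t)       # project hypothesised 3D points
        front = p[2] > 1e-9                # keep points in front of the camera
        us = np.zeros(H * W, dtype=int)
        vs = np.zeros(H * W, dtype=int)
        us[front] = np.rint(p[0, front] / p[2, front]).astype(int)
        vs[front] = np.rint(p[1, front] / p[2, front]).astype(int)
        inside = front & (us >= 0) & (us < W) & (vs >= 0) & (vs < H)
        diff = ref[inside] - feat_src[vs[inside], us[inside]]
        cost[i, inside] = np.abs(diff).mean(axis=1)
    return cost.reshape(len(depths), H, W)

def winner_take_all_depth(cost, depths):
    """Pick, per pixel, the depth hypothesis with the lowest matching cost."""
    return np.asarray(depths)[np.argmin(cost, axis=0)]
```

As the projected point moves with the depth hypothesis, it traces the epipolar line in the source image, so each pixel's column of costs is a matching profile along that line. A practical system would likely add sub-pixel interpolation, accumulate costs over many frames, and regularise the result rather than take a per-pixel minimum; those steps are omitted here.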


