D2S: Representing local descriptors and global scene coordinates for camera relocalization

by Bach-Thuan Bui, et al.

State-of-the-art visual localization methods mostly rely on complex procedures to match local descriptors against 3D point clouds. These procedures can incur significant costs in inference time, storage, and map updates over time. In this study, we propose a direct learning-based approach that uses a simple network, named D2S, to represent local descriptors and their scene coordinates. The method is simple and cost-effective: at test time it localizes from a single RGB image, and it requires only a lightweight model to encode a complex sparse scene. D2S combines a simple loss function with graph attention to focus selectively on robust descriptors while disregarding areas such as clouds, trees, and dynamic objects. This selective attention lets D2S effectively perform a binary-semantic classification of sparse descriptors. Additionally, we propose a new outdoor dataset to evaluate visual localization methods in terms of scene generalization and self-updating from unlabeled observations. Our approach outperforms state-of-the-art CNN-based scene coordinate regression methods in both indoor and outdoor environments. It generalizes beyond the training data, including day-to-night transitions and domain shifts, even in the absence of labeled data sources. The source code, trained models, dataset, and demo videos are available at https://thpjp.github.io/d2s
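The core idea in the abstract, mapping each sparse descriptor to a scene coordinate while classifying it as reliable or not, can be illustrated with a toy sketch. This is not the D2S architecture: it omits graph attention and the training loss, uses random weights in place of trained ones, and every name and size here (`DESC_DIM`, `predict`, `select_reliable`, the 0.5 threshold) is a hypothetical chosen for illustration.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights for one linear layer (a stand-in for trained parameters)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(layer, x):
    """Apply one linear layer to a single input vector."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def predict(layers, descriptor):
    """Map one descriptor to (xyz scene coordinate, reliability in (0, 1))."""
    h = descriptor
    for layer in layers[:-1]:
        h = relu(linear(layer, h))
    out = linear(layers[-1], h)  # 4 values: x, y, z, reliability logit
    return out[:3], sigmoid(out[3])


def select_reliable(layers, descriptors, threshold=0.5):
    """Keep only predictions whose reliability exceeds the threshold."""
    kept = []
    for index, descriptor in enumerate(descriptors):
        xyz, reliability = predict(layers, descriptor)
        if reliability > threshold:
            kept.append((index, xyz, reliability))
    return kept


rng = random.Random(0)
DESC_DIM = 16  # hypothetical descriptor size, not taken from the paper
layers = [make_layer(DESC_DIM, 32, rng), make_layer(32, 4, rng)]
descriptors = [[rng.gauss(0.0, 1.0) for _ in range(DESC_DIM)] for _ in range(10)]
kept = select_reliable(layers, descriptors)
```

In a full pipeline, the retained 2D keypoint to 3D coordinate pairs would typically be passed to a pose solver; that step is outside this sketch and is not described in the abstract.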



