VS-Net: Voting with Segmentation for Visual Localization

by Zhaoyang Huang, et al.

Visual localization is of great importance in robotics and computer vision. Recently, methods based on scene coordinate regression have shown good performance in visual localization in small static scenes. However, they still estimate camera poses from many inferior scene coordinates. To address this problem, we propose a novel visual localization framework that establishes 2D-to-3D correspondences between the query image and the 3D map with a series of learnable scene-specific landmarks. In the landmark generation stage, the 3D surfaces of the target scene are over-segmented into mosaic patches whose centers are regarded as the scene-specific landmarks. To robustly and accurately recover the scene-specific landmarks, we propose the Voting with Segmentation Network (VS-Net), which segments the pixels into different landmark patches with a segmentation branch and estimates the landmark location within each patch with a landmark location voting branch. Since the number of landmarks in a scene may reach up to 5000, training a segmentation network with such a large number of classes using the commonly used cross-entropy loss is costly in both computation and memory. We propose a novel prototype-based triplet loss with hard negative mining, which is able to train semantic segmentation networks with a large number of labels efficiently. Our proposed VS-Net is extensively tested on multiple public benchmarks and can outperform state-of-the-art visual localization methods. Code and models are available at https://github.com/zju3dv/VS-Net.




Vertebra-Focused Landmark Detection for Scoliosis Assessment

Adolescent idiopathic scoliosis (AIS) is a lifetime disease that arises ...

HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer

Visual localization is critical to many applications in computer vision ...

One-Shot Object Localization in Medical Images based on Relative Position Regression

Deep learning networks have shown promising performance for accurate obj...

Visual Localization via Few-Shot Scene Region Classification

Visual (re)localization addresses the problem of estimating the 6-DoF (D...

NeRF-Loc: Visual Localization with Conditional Neural Radiance Field

We propose a novel visual re-localization method based on direct matchin...

Location-Sensitive Visual Recognition with Cross-IOU Loss

Object detection, instance segmentation, and pose estimation are popular...

Efficient Map Sparsification Based on 2D and 3D Discretized Grids

Localization in a pre-built map is a basic technique for robot autonomou...
