Semantically-Aware Attentive Neural Embeddings for Image-based Visual Localization

12/08/2018
by   Zachary Seymour, et al.

We present a novel method for fusing appearance and semantic information using visual attention for 2D image-based localization (2D-VL) across extreme changes in viewing conditions. Our deep-learning-based method is motivated by the intuition that specific scene regions remain stable in the semantic modality even in the presence of vast differences in the appearance modality. The proposed attention-based module learns to focus not only on discriminative visual regions for place recognition but also on consistently stable semantic regions to perform 2D-VL. We show the effectiveness of this model by comparing against state-of-the-art (SOTA) methods on several challenging localization datasets. We report an average absolute improvement of 19% over current SOTA 2D-VL methods. Furthermore, we present an extensive study demonstrating the effectiveness and contribution of each component of our model, showing an 8% absolute improvement from adding semantic information, and an additional 4% from our proposed attention module, over both prior methods as well as a competitive baseline.
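The core idea — a learned spatial attention map that weights fused appearance and semantic features before pooling them into a single place embedding — can be illustrated with a minimal NumPy sketch. This is a hypothetical simplification for intuition only: the function names, shapes, and the single linear attention projection (`w_att`) are assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a flattened spatial score map.
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_fusion(appearance, semantic, w_att):
    """Fuse appearance and semantic feature maps into one embedding
    via a spatial attention map (illustrative simplification).

    appearance: (H, W, Ca) visual feature map
    semantic:   (H, W, Cs) semantic feature map
    w_att:      (Ca + Cs,) attention projection weights (would be learned)
    """
    fused = np.concatenate([appearance, semantic], axis=-1)  # (H, W, Ca+Cs)
    scores = fused @ w_att                                   # (H, W) per-location scores
    att = softmax(scores.ravel()).reshape(scores.shape)      # spatial attention, sums to 1
    # Attention-weighted global pooling -> single embedding vector.
    emb = np.tensordot(att, fused, axes=([0, 1], [0, 1]))    # (Ca+Cs,)
    return emb / (np.linalg.norm(emb) + 1e-12), att

rng = np.random.default_rng(0)
H, W, Ca, Cs = 8, 8, 16, 8
emb, att = attentive_fusion(rng.standard_normal((H, W, Ca)),
                            rng.standard_normal((H, W, Cs)),
                            rng.standard_normal(Ca + Cs))
```

In a real system the attention weights would be trained end-to-end so that stable semantic regions (e.g. buildings) receive high weight and transient regions (e.g. vegetation, sky) are suppressed; the L2-normalized embeddings are then compared by dot product for place retrieval.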


