Learning Sequence Descriptor based on Spatiotemporal Attention for Visual Place Recognition

05/19/2023
by Fenglin Zhang, et al.

Sequence-based visual place recognition (sVPR) aims to match frame sequences with frames stored in a reference map for localization. Existing methods include sequence matching and sequence descriptor-based retrieval. The former relies on a constant-velocity assumption, which rarely holds in real scenarios, and does not eliminate the intrinsic mismatch of single-frame descriptors. The latter addresses this problem by extracting a descriptor for the whole sequence, but current sequence descriptors are constructed only by aggregating features across multiple frames, with no interaction of temporal information. In this paper, we propose a sequence descriptor extraction method that fuses spatiotemporal information effectively and generates discriminative descriptors. Specifically, similar features within the same frame attend to each other and learn spatial structure, while the same local regions in different frames learn how local features change over time. We use sliding windows to control the range of temporal self-attention and adopt relative position encoding to construct the positional relationships between different features, which allows our descriptor to capture the inherent dynamics in the frame sequence and local feature motion.
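The windowed temporal self-attention described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single local region tracked across frames, uses the raw features as queries, keys, and values (no learned projections), and models relative position encoding as a scalar bias per frame offset. The names `windowed_temporal_attention`, `window`, and `rel_bias` are hypothetical.

```python
import math


def windowed_temporal_attention(feats, window, rel_bias):
    """Self-attention over time for one local region (illustrative only).

    feats:    list of T feature vectors (lists of floats), one per frame.
    window:   frame i may attend only to frames j with |i - j| <= window.
    rel_bias: dict mapping the relative offset (j - i) to a scalar added
              to the attention logit (an assumed, simplified form of
              relative position encoding).
    """
    num_frames = len(feats)
    dim = len(feats[0])
    out = []
    for i in range(num_frames):
        # Sliding window: restrict attention to nearby frames.
        lo = max(0, i - window)
        hi = min(num_frames, i + window + 1)
        logits = []
        for j in range(lo, hi):
            dot = sum(a * b for a, b in zip(feats[i], feats[j]))
            logits.append(dot / math.sqrt(dim) + rel_bias.get(j - i, 0.0))
        # Numerically stable softmax over the window.
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the features inside the window.
        out.append([
            sum(w * feats[j][k] for w, j in zip(weights, range(lo, hi)))
            for k in range(dim)
        ])
    return out


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
    bias = {-1: 0.1, 0: 0.0, 1: 0.1}  # hypothetical bias values
    for row in windowed_temporal_attention(frames, 1, bias):
        print(row)
```

With `window=0` each frame attends only to itself and the features pass through unchanged; widening the window lets each frame mix in information from its temporal neighbours, which is the effect the sliding window is meant to control.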


