Exploring Self-Attention for Visual Odometry

11/17/2020
by Hamed Damirchi, et al.

Visual odometry networks commonly use pretrained optical flow networks to derive the ego-motion between consecutive frames. The features extracted by these networks represent the motion of all pixels between frames. However, because scenes contain dynamic objects and texture-less surfaces, the motion information from some image regions is unreliable for inferring odometry: the motion of a dynamic object, for instance, does not reflect the camera's incremental change in position. Recent works in this area lack attention mechanisms that would allow dynamic reweighting of the feature maps to extract more refined ego-motion information. In this paper, we explore the effectiveness of self-attention in visual odometry. We report qualitative and quantitative results against state-of-the-art (SOTA) methods. Furthermore, saliency-based studies and specially designed experiments are used to investigate the effect of self-attention on visual odometry (VO). Our experiments show that self-attention allows the network to extract better features and achieve better odometry performance than networks that lack such structures.
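
To make the idea of reweighting feature maps concrete, the following is a minimal sketch of scaled dot-product self-attention over a handful of region features. It is not the architecture from the paper: the function names are hypothetical, each image region is assumed to be flattened into a small feature vector, and identity query/key/value projections are assumed in place of the learned projections a real network would use.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Reweight region features with scaled dot-product self-attention.

    features: list of N vectors of length D, one per image region.
    Assumption: queries, keys and values are the features themselves
    (identity projections); a trained network would learn these.
    Returns (reweighted_features, attention_weights).
    """
    d = len(features[0])
    scale = math.sqrt(d)

    # Each row holds one region's attention over all regions.
    weights = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
        weights.append(softmax(scores))

    # Each output vector is an attention-weighted sum of all region features.
    reweighted = [
        [sum(w * v[j] for w, v in zip(row, features)) for j in range(d)]
        for row in weights
    ]
    return reweighted, weights


# Toy input: two regions with similar motion features and one that differs.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
reweighted, weights = self_attention(features)
```

In this toy input, the first region attends more strongly to the region that shares its features than to the one that differs, which is the kind of data-dependent reweighting the abstract refers to.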


