TransCamP: Graph Transformer for 6-DoF Camera Pose Estimation

05/28/2021
by Xinyi Li, et al.

Camera pose estimation, or camera relocalization, is central to numerous computer vision tasks such as visual odometry, structure from motion (SfM) and SLAM. In this paper we propose TransCamP, a neural network approach with a graph transformer backbone, to address the camera relocalization problem. In contrast with prior work, where pose regression is mainly guided by photometric consistency, TransCamP fuses image features, camera pose information and inter-frame relative camera motions into encoded graph attributes and is instead trained for graph consistency and accuracy, yielding significantly higher computational efficiency. By leveraging graph transformer layers with edge features and enabling a tensorized adjacency matrix, TransCamP dynamically captures global attention and thus endows the pose graph with evolving structures, improving robustness and accuracy. In addition, optional temporal transformer layers enhance the spatiotemporal inter-frame relations for sequential inputs. Evaluation on various public benchmarks demonstrates that TransCamP outperforms state-of-the-art approaches.
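
To make the idea of attention over a pose graph with edge features more concrete, the following is a minimal NumPy sketch of one single-head, edge-aware attention step. It is an illustration under stated assumptions, not the authors' implementation: the function name `edge_aware_attention`, the additive edge bias `e @ we`, the weight shapes, and the ring-shaped toy graph are all hypothetical choices made here, and the abstract does not specify how TransCamP combines node and edge features.

```python
import numpy as np

def edge_aware_attention(h, e, adj, Wq, Wk, Wv, we):
    """One single-head attention step over a graph (illustrative only).

    h:   (N, d)      node features, e.g. per-frame image/pose embeddings
    e:   (N, N, de)  edge features, e.g. relative-motion embeddings
                     (a dense edge tensor; an assumption made for this sketch)
    adj: (N, N) bool mask of allowed edges; every row needs at least one True
    """
    d = Wq.shape[1]
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    # Scaled dot-product score plus a learned scalar bias from each edge feature.
    scores = (q @ k.T) / np.sqrt(d) + e @ we
    # Mask out node pairs that are not connected in the graph.
    scores = np.where(adj, scores, -np.inf)
    # Numerically stable softmax over each node's neighbours.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Toy graph: 4 frames in a ring, each connected to itself and its two neighbours.
rng = np.random.default_rng(0)
N, d, de = 4, 8, 3
eye = np.eye(N, dtype=bool)
adj = eye | np.roll(eye, 1, axis=1) | np.roll(eye, -1, axis=1)

h = rng.normal(size=(N, d))
e = rng.normal(size=(N, N, de))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
we = rng.normal(size=de)

out, weights = edge_aware_attention(h, e, adj, Wq, Wk, Wv, we)
```

In this sketch each row of `weights` is a distribution over a frame's neighbours, so changing the edge features or the mask changes which frames influence each other. That is one simple way to read the "evolving structures" described above, offered only as an interpretation.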

