Graph Attention Network for Camera Relocalization on Dynamic Scenes

by Mohamed Amine Ouali, et al.

We devise a graph attention network-based approach that learns a representation of a scene's triangle mesh in order to estimate the camera position of an image in a dynamic environment. Previous approaches build a scene-dependent model that explicitly or implicitly embeds the structure of the scene, using convolutional neural networks or decision trees to establish 2D-3D or 3D-3D correspondences. Such a mapping overfits the target scene and does not generalize well to dynamic changes in the environment. Our work introduces a novel approach that solves the camera relocalization problem by using the available triangle mesh. Our 3D-3D matching framework consists of three blocks: (1) a graph neural network that computes embeddings of the mesh vertices, (2) a convolutional neural network that computes embeddings of grid cells defined on the RGB-D image, and (3) a neural network model that establishes correspondences between the two sets of embeddings. These three components are trained end-to-end. To predict the final pose, we run the RANSAC algorithm to generate camera pose hypotheses, and we refine the prediction using the point-cloud representation. Our approach significantly improves camera pose accuracy over the state-of-the-art method, from 0.358 to 0.506, on the RIO10 benchmark for dynamic indoor camera relocalization.
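The final pose-estimation step can be illustrated with a minimal sketch. It assumes the matching network has already produced putative 3D-3D correspondences between camera-frame points (back-projected from RGB-D grid cells) and mesh vertices in the scene frame. The function names `kabsch` and `ransac_pose`, the three-point minimal sample, and the threshold and iteration count are illustrative assumptions, not the authors' implementation; the point-cloud refinement stage is not shown.

```python
import numpy as np


def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def ransac_pose(cam_pts, mesh_pts, iters=200, thresh=0.05, seed=0):
    """Hypothetical RANSAC loop over 3D-3D correspondences.

    cam_pts[i] (camera frame) is assumed to match mesh_pts[i] (scene frame).
    Returns the pose refit on the largest inlier set, plus the inlier mask.
    """
    rng = np.random.default_rng(seed)
    n = len(cam_pts)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(iters):
        # Three correspondences are the minimal sample for a rigid transform.
        idx = rng.choice(n, size=3, replace=False)
        R, t = kabsch(cam_pts[idx], mesh_pts[idx])
        residuals = np.linalg.norm(cam_pts @ R.T + t - mesh_pts, axis=1)
        inliers = residuals < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("too few inliers to estimate a pose")
    # Refit on all inliers of the best hypothesis.
    R, t = kabsch(cam_pts[best_inliers], mesh_pts[best_inliers])
    return R, t, best_inliers
```

Calling `ransac_pose(cam_pts, mesh_pts)` with two `(N, 3)` arrays returns a rotation, a translation, and a boolean inlier mask. Mismatched correspondences, such as those caused by objects that moved in a dynamic scene, are the kind of outlier this hypothesize-and-verify loop is meant to reject.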




