Learning Feature Descriptors using Camera Pose Supervision

04/28/2020
by Qianqian Wang, et al.

Recent research on learned visual descriptors has shown promising improvements in correspondence estimation, a key component of many 3D vision tasks. However, existing descriptor learning frameworks typically require ground-truth correspondences between feature points for training, which are challenging to acquire at scale. In this paper we propose a novel weakly-supervised framework that can learn feature descriptors solely from relative camera poses between images. To do so, we devise both a new loss function that exploits the epipolar constraint given by camera poses, and a new model architecture that makes the whole pipeline differentiable and efficient. Because we no longer need pixel-level ground-truth correspondences, our framework opens up the possibility of training on much larger and more diverse datasets for better and unbiased descriptors. Though trained with weak supervision, our learned descriptors outperform even prior fully-supervised methods and achieve state-of-the-art performance on a variety of geometric tasks.
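
To illustrate the epipolar constraint mentioned above, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the relative pose maps camera-1 coordinates to camera-2 coordinates as X2 = R X1 + t and that the intrinsics K1 and K2 are known. The function names (skew, fundamental_from_pose, epipolar_distance) are hypothetical and chosen only for this example. It builds the fundamental matrix from the pose and measures how far a candidate match in the second image lies from the epipolar line of its query point.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def fundamental_from_pose(K1, K2, R, t):
    """Fundamental matrix from a relative pose, assuming X2 = R @ X1 + t."""
    E = skew(t) @ R  # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)


def epipolar_distance(F, x1, x2):
    """Distance in pixels from each point in x2 to the epipolar line of the matching point in x1.

    x1, x2: arrays of shape (N, 2) holding pixel coordinates.
    """
    ones = np.ones((x1.shape[0], 1))
    x1h = np.hstack([x1, ones])  # homogeneous coordinates, image 1
    x2h = np.hstack([x2, ones])  # homogeneous coordinates, image 2
    lines = x1h @ F.T            # row i is the line F @ x1h[i] in image 2
    num = np.abs(np.sum(lines * x2h, axis=1))
    den = np.linalg.norm(lines[:, :2], axis=1)
    return num / den
```

Under these assumptions, a pose-supervised objective could, for example, average this distance over predicted matches: a correct match has distance close to zero, and no pixel-level ground-truth correspondence is needed to compute it. Note that the distance only constrains a match to lie on a line, which is why this form of supervision is weaker than a ground-truth correspondence.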

research
08/04/2018

Learning to Align Images using Weak Geometric Supervision

Image alignment tasks require accurate pixel correspondences, which are ...
research
10/10/2021

Digging Into Self-Supervised Learning of Feature Descriptors

Fully-supervised CNN-based approaches for learning local image descripto...
research
03/04/2021

Self-supervised Geometric Perception

We present self-supervised geometric perception (SGP), the first general...
research
04/08/2021

3D Surfel Map-Aided Visual Relocalization with Learned Descriptors

In this paper, we introduce a method for visual relocalization using the...
research
08/06/2023

Local Consensus Enhanced Siamese Network with Reciprocal Loss for Two-view Correspondence Learning

Recent studies of two-view correspondence learning usually establish an ...
research
09/12/2022

Self-supervised Wide Baseline Visual Servoing via 3D Equivariance

One of the challenging input settings for visual servoing is when the in...
research
04/10/2021

Deep Weakly Supervised Positioning

PoseNet can map a photo to the position where it is taken, which is appe...
