ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer

08/30/2022
by Hongkai Chen, et al.

Generating robust and reliable correspondences across images is fundamental to a wide range of applications. To capture context at both global and local granularity, we propose ASpanFormer, a Transformer-based detector-free matcher built on a hierarchical attention structure that adopts a novel attention operation capable of adjusting its span in a self-adaptive manner. To achieve this, flow maps are first regressed in each cross-attention phase to locate the center of the search region. Next, a sampling grid is generated around that center; its size, instead of being empirically fixed, is computed adaptively from a pixel uncertainty estimated along with the flow map. Finally, attention is computed across the two images within the derived regions, referred to as attention spans. In this way, we not only maintain long-range dependencies but also enable fine-grained attention among pixels of high relevance, accommodating the essential locality and piece-wise smoothness of matching tasks. State-of-the-art accuracy on a wide range of evaluation benchmarks validates the strong matching capability of our method.
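The three steps described in the abstract (regress a flow center, scale a sampling window by the estimated pixel uncertainty, then attend only within that window) can be sketched for a single query pixel as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names, the 3-sigma radius rule, and the single-pixel formulation are assumptions made for clarity.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def adaptive_span_attention(query, target_feat, flow, sigma, qpos,
                            min_radius=1, max_radius=None):
    """Attend from one query pixel into an adaptively sized window of the
    target feature map (hypothetical simplification of an attention span).

    query       : (C,) feature vector of the query pixel.
    target_feat : (H, W, C) feature map of the other image.
    flow        : (dy, dx) predicted offset locating the search-region center.
    sigma       : scalar pixel uncertainty; larger sigma -> wider span.
    qpos        : (y, x) coordinates of the query pixel.
    """
    H, W, C = target_feat.shape
    # Step 1: flow regression output gives the center of the search region.
    cy = int(round(qpos[0] + flow[0]))
    cx = int(round(qpos[1] + flow[1]))
    # Step 2: window radius grows with uncertainty (3-sigma rule is an
    # illustrative choice, clamped to the image bounds).
    r = int(np.clip(round(3 * sigma), min_radius, max_radius or max(H, W)))
    y0, y1 = max(0, cy - r), min(H, cy + r + 1)
    x0, x1 = max(0, cx - r), min(W, cx + r + 1)
    # Step 3: scaled dot-product attention restricted to the derived span.
    keys = target_feat[y0:y1, x0:x1].reshape(-1, C)
    attn = softmax(keys @ query / np.sqrt(C))
    return attn @ keys  # message aggregated within the attention span
```

A confident flow estimate (small `sigma`) thus yields a tight, fine-grained window, while an uncertain one falls back toward wide, long-range attention.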


