Integrative Feature and Cost Aggregation with Transformers for Dense Correspondence

09/19/2022
by Sunghwan Hong, et al.
We present a novel architecture for dense correspondence. Current state-of-the-art methods are Transformer-based approaches that focus on aggregating either feature descriptors or the cost volume. However, they generally aggregate one or the other but not both, even though joint aggregation would let each benefit from information that one has but the other lacks, i.e., the structural or semantic information of an image, or the pixel-wise matching similarity. In this work, we propose a novel Transformer-based network that interleaves both forms of aggregation in a way that exploits their complementary information. Specifically, we design a self-attention layer that leverages the descriptors to disambiguate the noisy cost volume and that also uses the cost volume to aggregate features in a manner that promotes accurate matching. A subsequent cross-attention layer performs further aggregation conditioned on the descriptors of both images and aided by the aggregated outputs of earlier layers. We further boost performance with hierarchical processing, in which coarser-level aggregations guide those at finer levels. We evaluate the proposed method on dense matching tasks and achieve state-of-the-art performance on all the major benchmarks. Extensive ablation studies are also provided to validate our design choices.
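To make the idea of joint aggregation concrete, the following is a minimal sketch and not the architecture from the paper. It assumes flattened per-pixel descriptors, a cosine-similarity cost volume, and a single parameter-free attention step with no learned projections. The function names `cosine_cost_volume` and `integrative_self_attention` are hypothetical and chosen only for illustration. Each source pixel's token is its descriptor concatenated with its row of the cost volume, and the resulting attention weights are shared to aggregate both the descriptors and the cost rows.

```python
import math


def cosine_cost_volume(feat_src, feat_tgt):
    """Return C[i][j], the cosine similarity between source pixel i and target pixel j."""
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard against zero vectors
        return [x / norm for x in v]

    src = [normalize(v) for v in feat_src]
    tgt = [normalize(v) for v in feat_tgt]
    return [[sum(a * b for a, b in zip(s, t)) for t in tgt] for s in src]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def integrative_self_attention(feat_src, cost):
    """One simplified attention step over source pixels (assumption: no learned weights).

    Tokens are [descriptor ; cost row]. The same attention weights are then
    used to aggregate the descriptors and the cost rows.
    """
    tokens = [f + c for f, c in zip(feat_src, cost)]  # list concatenation per pixel
    scale = 1.0 / math.sqrt(len(tokens[0]))
    attn = [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in tokens])
        for q in tokens
    ]

    def mix(values):
        dim = len(values[0])
        return [
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
            for weights in attn
        ]

    return mix(feat_src), mix(cost)


# Toy example: 3 source pixels and 2 target pixels with 2-D descriptors.
feat_src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
feat_tgt = [[1.0, 0.0], [0.0, 1.0]]

cost = cosine_cost_volume(feat_src, feat_tgt)
agg_feat, agg_cost = integrative_self_attention(feat_src, cost)
```

In this sketch the cost rows influence which pixels a descriptor attends to, and the descriptors influence how cost rows are smoothed, which is the complementary exchange the abstract describes. The cross-attention stage and the hierarchical coarse-to-fine guidance are not modeled here.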

research
04/01/2021

LoFTR: Detector-Free Local Feature Matching with Transformers

We present a novel method for local image feature matching. Instead of p...
research
12/22/2021

Cost Aggregation Is All You Need for Few-Shot Segmentation

We introduce a novel cost aggregation network, dubbed Volumetric Aggrega...
research
07/22/2022

Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation

This paper presents a novel cost aggregation network, called Volumetric ...
research
06/04/2021

Semantic Correspondence with Transformers

We propose a novel cost aggregation network, called Cost Aggregation wit...
research
07/05/2021

What Makes for Hierarchical Vision Transformer?

Recent studies show that hierarchical Vision Transformer with interleave...
research
02/14/2022

CATs++: Boosting Cost Aggregation with Convolutions and Transformers

Cost aggregation is a highly important process in image matching tasks, ...
research
03/04/2022

Characterizing Renal Structures with 3D Block Aggregate Transformers

Efficiently quantifying renal structures can provide distinct spatial co...