Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation

07/22/2022
by Sunghwan Hong, et al.

This paper presents a novel cost aggregation network, called Volumetric Aggregation with Transformers (VAT), for few-shot segmentation. The use of transformers can benefit correlation map aggregation through self-attention over a global receptive field. However, the tokenization of a correlation map for transformer processing can be detrimental, because the discontinuity at token boundaries reduces the local context available near the token edges and decreases inductive bias. To address this problem, we propose a 4D Convolutional Swin Transformer, where a high-dimensional Swin Transformer is preceded by a series of small-kernel convolutions that impart local context to all pixels and introduce convolutional inductive bias. We additionally boost aggregation performance by applying transformers within a pyramidal structure, where aggregation at a coarser level guides aggregation at a finer level. Noise in the transformer output is then filtered in the subsequent decoder with the help of the query's appearance embedding. With this model, a new state-of-the-art is set for all the standard benchmarks in few-shot segmentation. It is shown that VAT attains state-of-the-art performance for semantic correspondence as well, where cost aggregation also plays a central role.
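As a rough illustration of the object being aggregated, the sketch below builds a 4D correlation map between a query feature map and a support feature map. It is only a minimal sketch under stated assumptions: the abstract does not specify how the correlation is computed, so cosine similarity is assumed here, and the function name `correlation_map`, the `eps` parameter, and the feature shapes are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def correlation_map(query_feat, support_feat, eps=1e-8):
    """Build a 4D correlation map from two feature maps.

    Assumption (not stated in the abstract): similarity is the cosine
    similarity between channel vectors at every pair of positions.

    query_feat:   array of shape (C, Hq, Wq)
    support_feat: array of shape (C, Hs, Ws)
    returns:      array of shape (Hq, Wq, Hs, Ws)
    """
    # L2-normalize along the channel axis so dot products become cosines.
    q = query_feat / (np.linalg.norm(query_feat, axis=0, keepdims=True) + eps)
    s = support_feat / (np.linalg.norm(support_feat, axis=0, keepdims=True) + eps)
    # Dot product between every query position (h, w) and support position (i, j).
    return np.einsum("chw,cij->hwij", q, s)


# Hypothetical feature sizes, chosen only for the example.
rng = np.random.default_rng(0)
query = rng.standard_normal((16, 6, 6))
support = rng.standard_normal((16, 5, 5))
corr = correlation_map(query, support)
print(corr.shape)
```

Each entry of the result scores one query position against one support position, which is why the map has four spatial dimensions and why an aggregation module operating on it needs 4D convolutions or 4D attention windows.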


