Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers

03/26/2023
by Zhou Huang, et al.

Vision transformers have recently shown strong global context modeling capabilities in camouflaged object detection (COD). However, they suffer from two major limitations: less effective locality modeling and insufficient feature aggregation in decoders. Both hinder camouflaged object detection, which depends on picking up subtle cues from backgrounds that are nearly indistinguishable from the object. To address these issues, we propose a novel transformer-based Feature Shrinkage Pyramid Network (FSPNet), which hierarchically decodes locality-enhanced neighboring transformer features through progressive shrinking. Specifically, we propose a non-local token enhancement module (NL-TEM) that uses the non-local mechanism to let neighboring tokens interact and explores graph-based high-order relations within tokens to enhance the local representations of transformers. Moreover, we design a feature shrinkage decoder (FSD) with adjacent interaction modules (AIM), which progressively aggregates adjacent transformer features through a layer-by-layer shrinkage pyramid to accumulate as many imperceptible but effective cues as possible for object information decoding. Extensive quantitative and qualitative experiments demonstrate that the proposed model significantly outperforms 24 existing competitors on three challenging COD benchmark datasets under six widely used evaluation metrics. Our code is publicly available at https://github.com/ZhouHuang23/FSPNet.
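As a rough illustration of the layer-by-layer shrinkage idea, the sketch below repeatedly merges adjacent features until a single one remains. It is not the authors' implementation: the names `merge_adjacent`, `shrink_once`, and `shrinkage_pyramid` are hypothetical, the element-wise mean is only a stand-in for a learned adjacent interaction module, and the pairing of non-overlapping neighbors is an assumption, since the abstract does not specify how many features are merged at each level.

```python
from typing import List

Feature = List[float]  # toy stand-in for one transformer layer's feature map


def merge_adjacent(a: Feature, b: Feature) -> Feature:
    """Hypothetical stand-in for an AIM: element-wise mean of two features."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def shrink_once(features: List[Feature]) -> List[Feature]:
    """Merge non-overlapping adjacent pairs (assumed pairing scheme)."""
    merged = [
        merge_adjacent(features[i], features[i + 1])
        for i in range(0, len(features) - 1, 2)
    ]
    if len(features) % 2 == 1:
        # An unpaired last feature is carried to the next level unchanged.
        merged.append(features[-1])
    return merged


def shrinkage_pyramid(features: List[Feature]) -> List[List[Feature]]:
    """Return every pyramid level, from the input list down to one feature."""
    if not features:
        raise ValueError("at least one feature is required")
    levels = [features]
    while len(levels[-1]) > 1:
        levels.append(shrink_once(levels[-1]))
    return levels


# Eight toy features, each a constant vector of length 4.
toy_features = [[float(i)] * 4 for i in range(8)]
pyramid = shrinkage_pyramid(toy_features)
print([len(level) for level in pyramid])  # level sizes: [8, 4, 2, 1]
```

Each level holds roughly half as many features as the one before, so information from neighboring layers is combined gradually rather than in a single fusion step.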


research
07/27/2021

Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers

Detection transformers have recently shown promising object detection re...
research
07/02/2022

Boundary-Guided Camouflaged Object Detection

Camouflaged object detection (COD), segmenting objects that are elegantl...
research
05/20/2021

Content-Augmented Feature Pyramid Network with Light Linear Transformers

Recently, plenty of work has tried to introduce transformers into comput...
research
09/06/2022

PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection

Recent years have witnessed a trend of applying context frames to boost ...
research
03/20/2022

Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows

This paper presents a new vision Transformer, named Iwin Transformer, wh...
research
08/05/2021

Unifying Global-Local Representations in Salient Object Detection with Transformer

The fully convolutional network (FCN) has dominated salient object detec...
research
08/03/2021

Vision Transformer with Progressive Sampling

Transformers with powerful global relation modeling abilities have been ...
