Feature Pyramid Transformer

07/18/2020
by Dong Zhang, et al.

Feature interactions across space and scales underpin modern visual recognition systems because they introduce beneficial visual contexts. Conventionally, spatial contexts are passively hidden in the CNN's increasing receptive fields or actively encoded by non-local convolution. Yet non-local spatial interactions do not operate across scales, so they fail to capture the non-local contexts of objects (or parts) residing at different scales. To this end, we propose a fully active feature interaction across both space and scales, called Feature Pyramid Transformer (FPT). It transforms any feature pyramid into another feature pyramid of the same size but with richer contexts, using three specially designed transformers that interact in self-level, top-down, and bottom-up fashion. FPT serves as a generic visual backbone with fair computational overhead. We conduct extensive experiments on both instance-level (i.e., object detection and instance segmentation) and pixel-level segmentation tasks, using various backbones and head networks, and observe consistent improvements over all the baselines and the state-of-the-art methods.
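
To make the idea of a same-size, context-enriched pyramid concrete, here is a minimal NumPy sketch. It is not the authors' implementation: plain scaled dot-product attention stands in for the three specially designed transformers, and the function names (`attend`, `interact`), the finest-first ordering of the pyramid, and the averaging step that fuses the results are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(query_map, context_map):
    """Every position of query_map gathers context from every position of
    context_map (generic dot-product attention, an illustrative stand-in).

    Both maps have shape (C, H, W) with the same C; the spatial sizes may
    differ. The output has the shape of query_map.
    """
    c, h, w = query_map.shape
    q = query_map.reshape(c, -1).T            # (Nq, C) query positions
    k = context_map.reshape(c, -1).T          # (Nk, C) context positions
    weights = softmax(q @ k.T / np.sqrt(c))   # (Nq, Nk), each row sums to 1
    out = weights @ k                         # (Nq, C) aggregated context
    return out.T.reshape(c, h, w)


def interact(pyramid):
    """Return a new pyramid whose levels keep their original shapes.

    Assumes pyramid[0] is the finest level. For level i:
      - attend(level, level)        is the self-level interaction,
      - attend(level, coarser map)  is the top-down interaction,
      - attend(level, finer map)    is the bottom-up interaction.
    Averaging the results is a simplification chosen to keep the size fixed.
    """
    new_pyramid = []
    for i, level in enumerate(pyramid):
        parts = [level, attend(level, level)]
        for j, other in enumerate(pyramid):
            if j != i:
                parts.append(attend(level, other))
        new_pyramid.append(np.mean(parts, axis=0))
    return new_pyramid


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pyramid = [rng.normal(size=(8, s, s)) for s in (16, 8, 4)]
    for before, after in zip(pyramid, interact(pyramid)):
        print(before.shape, "->", after.shape)
```

Each output level keeps the shape of its input level, which is the property the abstract emphasizes. The queries always come from the level being updated, so the cost grows with the product of the position counts of the two levels involved.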

research
05/20/2021

Content-Augmented Feature Pyramid Network with Light Linear Transformers

Recently, plenty of work has tried to introduce transformers into comput...
research
08/22/2020

PNEN: Pyramid Non-Local Enhanced Networks

Existing neural networks proposed for low-level image processing tasks a...
research
08/15/2021

SOTR: Segmenting Objects with Transformers

Most recent transformer-based models show impressive performance on visi...
research
03/29/2022

NL-FCOS: Improving FCOS through Non-Local Modules for Object Detection

During the last years, we have seen significant advances in the object d...
research
04/17/2019

CaseNet: Content-Adaptive Scale Interaction Networks for Scene Parsing

Objects in an image exhibit diverse scales. Adaptive receptive fields ar...
research
12/07/2020

Fine-Grained Dynamic Head for Object Detection

The Feature Pyramid Network (FPN) presents a remarkable approach to alle...
research
10/05/2022

Centralized Feature Pyramid for Object Detection

Visual feature pyramid has shown its superiority in both effectiveness a...
