Feature Selective Transformer for Semantic Image Segmentation

03/26/2022
by Fangjian Lin, et al.

Fusing multi-scale features for semantic image segmentation has recently attracted growing attention. Various works employ progressive local or global fusion, but the resulting fusion is not rich enough to model multi-scale context features. In this work, we focus on fusing multi-scale features from Transformer-based backbones for semantic segmentation, and propose a Feature Selective Transformer (FeSeFormer), which aggregates features from all scales (or levels) for each query feature. Specifically, we first propose a Scale-level Feature Selection (SFS) module, which chooses an informative subset from the whole multi-scale feature set for each scale: features that are important for the current scale (or level) are selected, and redundant ones are discarded. Furthermore, we propose a Full-scale Feature Fusion (FFF) module, which adaptively fuses features of all scales for queries. Based on the proposed SFS and FFF modules, we develop FeSeFormer and evaluate it on four challenging semantic segmentation benchmarks, PASCAL Context, ADE20K, COCO-Stuff 10K, and Cityscapes, where it outperforms the state of the art.
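
The select-then-fuse idea described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how SFS scores features or how FFF weights them, so the dot-product scoring, the top-k cutoff, the softmax-weighted average, and every name below (`select_features`, `fuse_features`, `features_by_scale`) are assumptions made purely for illustration.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_features(query, features, k):
    """Selection step (assumed rule, not from the paper): keep the k
    features with the highest dot-product similarity to the query."""
    ranked = sorted(features, key=lambda f: dot(query, f), reverse=True)
    return ranked[:k]

def fuse_features(query, selected):
    """Fusion step (assumed rule, not from the paper): softmax-weighted
    average of the selected features, as in scaled dot-product attention."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, f) / scale for f in selected])
    return [sum(w * f[i] for w, f in zip(weights, selected))
            for i in range(len(query))]

# Toy multi-scale feature set: scale name -> list of 2-d feature vectors.
features_by_scale = {
    "1/8": [[1.0, 0.0], [0.0, 1.0]],
    "1/16": [[0.9, 0.1]],
    "1/32": [[-1.0, 0.0]],
}
query = [1.0, 0.0]

# Pool features from every scale, select a subset, then fuse it.
pooled = [f for feats in features_by_scale.values() for f in feats]
selected = select_features(query, pooled, k=2)
fused = fuse_features(query, selected)
print(selected)  # [[1.0, 0.0], [0.9, 0.1]]
print(fused)
```

In this toy case the two features least aligned with the query are discarded before fusion, so the fused vector is a weighted average of the two retained features only. A real model would presumably learn both the selection and the fusion weights rather than use fixed similarity scores.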

research
05/14/2022

Transformer Scale Gate for Semantic Segmentation

Effectively encoding multi-scale contextual information is crucial for a...
research
10/10/2022

LAPFormer: A Light and Accurate Polyp Segmentation Transformer

Polyp segmentation is still known as a difficult problem due to the larg...
research
03/25/2022

Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation

This work considers supervised contrastive learning for semantic segment...
research
01/11/2022

Pyramid Fusion Transformer for Semantic Segmentation

The recently proposed MaskFormer <cit.> gives a refreshed perspective on...
research
02/24/2021

Efficient and Accurate Multi-scale Topological Network for Single Image Dehazing

Single image dehazing is a challenging ill-posed problem that has drawn ...
research
05/27/2020

GSTO: Gated Scale-Transfer Operation for Multi-Scale Feature Learning in Pixel Labeling

Existing CNN-based methods for pixel labeling heavily depend on multi-sc...
research
05/02/2023

Exploring vision transformer layer choosing for semantic segmentation

Extensive work has demonstrated the effectiveness of Vision Transformers...
