
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

12/31/2020
by Sixiao Zheng, et al.

Most recent semantic segmentation methods adopt a fully-convolutional network (FCN) with an encoder-decoder architecture. The encoder progressively reduces the spatial resolution and learns more abstract/semantic visual concepts with larger receptive fields. Since context modeling is critical for segmentation, the latest efforts have focused on increasing the receptive field, through either dilated/atrous convolutions or inserting attention modules. However, the encoder-decoder based FCN architecture remains unchanged. In this paper, we aim to provide an alternative perspective by treating semantic segmentation as a sequence-to-sequence prediction task. Specifically, we deploy a pure transformer (i.e., without convolution and resolution reduction) to encode an image as a sequence of patches. With the global context modeled in every layer of the transformer, this encoder can be combined with a simple decoder to provide a powerful segmentation model, termed SEgmentation TRansformer (SETR). Extensive experiments show that SETR achieves new state of the art on ADE20K (50.28% mIoU), Pascal Context (55.83% mIoU) and competitive results on Cityscapes. Particularly, we achieve the first (44.42% mIoU) position in the highly competitive ADE20K test server leaderboard.
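To make the patch-sequence pipeline described in the abstract concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: a strided-convolution patch embedding, a pure transformer encoder over the patch sequence, and a simple decoder that reshapes the tokens back into a 2D map and bilinearly upsamples to per-pixel class logits. The class name SETRSketch, the layer sizes, and the ADE20K-style 150-class head are illustrative assumptions only.

# Minimal sketch of the SETR idea: image -> patch sequence -> transformer
# encoder -> simple decoder. Hyperparameters and the naive upsampling head
# are illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SETRSketch(nn.Module):
    def __init__(self, img_size=256, patch_size=16, in_chans=3,
                 embed_dim=768, depth=12, num_heads=12, num_classes=150):
        super().__init__()
        self.grid = img_size // patch_size  # patches per side
        # Linear patch embedding implemented as a strided convolution.
        self.patch_embed = nn.Conv2d(in_chans, embed_dim,
                                     kernel_size=patch_size, stride=patch_size)
        # Learned positional embedding, one vector per patch.
        self.pos_embed = nn.Parameter(
            torch.zeros(1, self.grid * self.grid, embed_dim))
        # Pure transformer encoder: global self-attention in every layer,
        # with no convolutions and no resolution reduction inside the encoder.
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # Simple decoder head: 1x1 conv to class logits, then bilinear upsampling.
        self.classifier = nn.Conv2d(embed_dim, num_classes, kernel_size=1)

    def forward(self, x):
        b, _, h, w = x.shape
        tokens = self.patch_embed(x)                # (B, C, H/16, W/16)
        tokens = tokens.flatten(2).transpose(1, 2)  # (B, N, C) patch sequence
        tokens = self.encoder(tokens + self.pos_embed)
        feat = tokens.transpose(1, 2).reshape(b, -1, self.grid, self.grid)
        logits = self.classifier(feat)              # (B, num_classes, H/16, W/16)
        return F.interpolate(logits, size=(h, w),
                             mode="bilinear", align_corners=False)

model = SETRSketch()
out = model(torch.randn(1, 3, 256, 256))  # (1, 150, 256, 256) per-pixel logits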


Related Research

07/19/2022 · Visual Representation Learning with Transformer: A Sequence-to-Sequence Perspective
Visual representation learning is the key of solving various vision prob...

04/25/2021 · Transformer Meets DCFAM: A Novel Semantic Segmentation Scheme for Fine-Resolution Remote Sensing Images
The fully-convolutional network (FCN) with an encoder-decoder architectu...

06/16/2022 · Simple and Efficient Architectures for Semantic Segmentation
Though the state-of-the-art architectures for semantic segmentation, such as...

03/26/2022 · Semantic Segmentation by Early Region Proxy
Typical vision backbones manipulate structured features. As a compromise...

01/21/2021 · Trans2Seg: Transparent Object Segmentation with Transformer
This work presents a new fine-grained transparent object segmentation da...

09/04/2018 · Deep Smoke Segmentation
Inspired by the recent success of fully convolutional networks (FCN) in ...

05/24/2017 · Dense Transformer Networks
The key idea of current deep learning methods for dense prediction is to...

Code Repositories

setr-pytorch
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.

SETR-pytorch
Implementation of the SETR model. Original paper: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.

SETR
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.