Head-Free Lightweight Semantic Segmentation with Linear Transformer

01/11/2023
by Bo Dong, et al.

Existing semantic segmentation work has focused mainly on designing effective decoders; however, the computational load introduced by the overall structure has long been ignored, which hinders deployment on resource-constrained hardware. In this paper, we propose a head-free lightweight architecture specifically for semantic segmentation, named Adaptive Frequency Transformer. It adopts a parallel architecture that leverages prototype representations as specific learnable local descriptions, which replace the decoder and preserve the rich image semantics on high-resolution features. Although removing the decoder cuts most of the computation, the accuracy of the parallel structure is still limited by low computational resources. Therefore, we employ heterogeneous operators (CNN and Vision Transformer) for pixel embedding and prototype representations to further reduce computational cost. Moreover, it is very difficult to linearize the complexity of the Vision Transformer from the perspective of the spatial domain. Because semantic segmentation is very sensitive to frequency information, we construct a lightweight prototype learning block with an adaptive frequency filter of complexity O(n) to replace standard self-attention with complexity O(n^2). Extensive experiments on widely adopted datasets demonstrate that our model achieves superior accuracy while retaining only 3M parameters. On the ADE20K dataset, our model achieves 41.8 mIoU and 4.6 GFLOPs, which is 4.4 mIoU higher than Segformer with 45% fewer GFLOPs. On the Cityscapes dataset, our model achieves 78.7 mIoU and 34.4 GFLOPs, which is 2.5 mIoU higher than Segformer with 72.5% fewer GFLOPs. Code is available at https://github.com/dongbo811/AFFormer.
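As a quick sanity check on the reported comparison, the Segformer baseline figures implied by the abstract can be backed out with simple arithmetic. The sketch below assumes that the "45" and "72.5" in the abstract are percentage reductions in GFLOPs relative to Segformer; the helper name `implied_baseline` and the labels in `reported` are illustrative, not taken from the paper or its code.

```python
def implied_baseline(miou, gflops, miou_gain, gflops_reduction_pct):
    """Back out a baseline's mIoU and GFLOPs from reported deltas.

    Assumes the reduction is a percentage of the baseline's GFLOPs,
    i.e. gflops = baseline_gflops * (1 - pct / 100).
    """
    baseline_miou = miou - miou_gain
    baseline_gflops = gflops / (1.0 - gflops_reduction_pct / 100.0)
    return baseline_miou, baseline_gflops


# (model mIoU, model GFLOPs, mIoU gain, assumed % GFLOPs reduction)
reported = {
    "ADE20K result": (41.8, 4.6, 4.4, 45.0),
    "second result": (78.7, 34.4, 2.5, 72.5),
}

for name, figures in reported.items():
    base_miou, base_gflops = implied_baseline(*figures)
    print(f"{name}: implied baseline {base_miou:.1f} mIoU, "
          f"{base_gflops:.1f} GFLOPs")
```

Under that reading, the implied baselines come out at roughly 37.4 mIoU with about 8.4 GFLOPs for the first result, and 76.2 mIoU with about 125.1 GFLOPs for the second.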


research
04/06/2022

PP-LiteSeg: A Superior Real-Time Semantic Segmentation Model

Real-world applications have high demands for semantic segmentation meth...
research
02/21/2023

Lightweight Real-time Semantic Segmentation Network with Efficient Transformer and CNN

In the past decade, convolutional neural networks (CNNs) have shown prom...
research
02/24/2023

A Convolutional Vision Transformer for Semantic Segmentation of Side-Scan Sonar Data

Distinguishing among different marine benthic habitat characteristics is...
research
07/11/2022

Dual Vision Transformer

Prior works have proposed several strategies to reduce the computational...
research
11/20/2019

SINet: Extreme Lightweight Portrait Segmentation Networks with Spatial Squeeze Modules and Information Blocking Decoder

Designing a lightweight and robust portrait segmentation algorithm is an...
research
04/21/2020

Multi-view Self-Constructing Graph Convolutional Networks with Adaptive Class Weighting Loss for Semantic Segmentation

We propose a novel architecture called the Multi-view Self-Constructing ...
research
07/14/2023

HEAL-SWIN: A Vision Transformer On The Sphere

High-resolution wide-angle fisheye images are becoming more and more imp...
