SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation

01/30/2023
by Qiang Wan, et al.

Since the introduction of Vision Transformers, the landscape of many computer vision tasks (e.g., semantic segmentation), which had been overwhelmingly dominated by CNNs, has recently been significantly revolutionized. However, the computational cost and memory requirements render these methods unsuitable for mobile devices, especially for the high-resolution per-pixel semantic segmentation task. In this paper, we introduce a new method, squeeze-enhanced Axial TransFormer (SeaFormer), for mobile semantic segmentation. Specifically, we design a generic attention block characterized by the formulation of squeeze Axial and detail enhancement. It can be further used to create a family of backbone architectures with superior cost-effectiveness. Coupled with a light segmentation head, we achieve the best trade-off between segmentation accuracy and latency on ARM-based mobile devices on the ADE20K and Cityscapes datasets. Critically, we beat both the mobile-friendly rivals and Transformer-based counterparts with better performance and lower latency, without bells and whistles. Beyond semantic segmentation, we further apply the proposed SeaFormer architecture to the image classification problem, demonstrating its potential to serve as a versatile mobile-friendly backbone.

research
04/12/2022

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation

Although vision transformers (ViTs) have achieved great success in compu...
research
04/11/2023

PP-MobileSeg: Explore the Fast and Accurate Semantic Segmentation Model on Mobile Devices

The success of transformers in computer vision has led to several attemp...
research
06/07/2021

Multi-Exit Semantic Segmentation Networks

Semantic segmentation arises as the backbone of many vision systems, spa...
research
04/27/2020

Compact retail shelf segmentation for mobile deployment

The recent surge of automation in the retail industries has rapidly incr...
research
11/20/2018

CGNet: A Light-weight Context Guided Network for Semantic Segmentation

The demand of applying semantic segmentation model on mobile devices has...
research
03/24/2023

FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization

The recent amalgamation of transformer and convolutional designs has led...
research
11/18/2021

Dynamically pruning segformer for efficient semantic segmentation

As one of the successful Transformer-based models in computer vision tas...