SPGNet: Semantic Prediction Guidance for Scene Parsing

08/26/2019
by   Bowen Cheng, et al.
16

Multi-scale context module and single-stage encoder-decoder structure are commonly employed for semantic segmentation. The multi-scale context module refers to the operations to aggregate feature responses from a large spatial extent, while the single-stage encoder-decoder structure encodes the high-level semantic information in the encoder path and recovers the boundary information in the decoder path. In contrast, multi-stage encoder-decoder networks have been widely used in human pose estimation and show superior performance than their single-stage counterpart. However, few efforts have been attempted to bring this effective design to semantic segmentation. In this work, we propose a Semantic Prediction Guidance (SPG) module which learns to re-weight the local features through the guidance from pixel-wise semantic prediction. We find that by carefully re-weighting features across stages, a two-stage encoder-decoder network coupled with our proposed SPG module can significantly outperform its one-stage counterpart with similar parameters and computations. Finally, we report experimental results on the semantic segmentation benchmark Cityscapes, in which our SPGNet attains 81.1 annotations.

READ FULL TEXT
research
06/01/2021

Full-Resolution Encoder-Decoder Networks with Multi-Scale Feature Fusion for Human Pose Estimation

To achieve more accurate 2D human pose estimation, we extend the success...
research
06/28/2021

Fractal Pyramid Networks

We propose a new network architecture, the Fractal Pyramid Networks (PFN...
research
01/11/2023

LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention

LiDAR semantic segmentation can provide vehicles with a rich understandi...
research
05/22/2019

Learning Fully Dense Neural Networks for Image Semantic Segmentation

Semantic segmentation is pixel-wise classification which retains critica...
research
11/10/2022

Efficient Unsupervised Video Object Segmentation Network Based on Motion Guidance

Considerable unsupervised video object segmentation algorithms based on ...
research
09/22/2019

Pixel-Level Dense Prediction without Decoder

Pixel-level dense prediction tasks such as keypoint estimation are domin...
research
07/26/2021

Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition

Although text recognition has significantly evolved over the years, stat...

Please sign up or login with your details

Forgot password? Click here to reset