Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation

03/05/2019
by   Zhi Tian, et al.
1

Recent semantic segmentation methods exploit encoder-decoder architectures to produce the desired pixel-wise segmentation prediction. The last layer of the decoders is typically a bilinear upsampling procedure to recover the final pixel-wise prediction. We empirically show that this oversimple and data-independent bilinear upsampling may lead to sub-optimal results. In this work, we propose a data-dependent upsampling (DUpsampling) to replace bilinear, which takes advantages of the redundancy in the label space of semantic segmentation and is able to recover the pixel-wise prediction from low-resolution outputs of CNNs. The main advantage of the new upsampling layer lies in that with a relatively lower-resolution feature map such as 1/16 or 1/32 of the input size, we can achieve even better segmentation accuracy, significantly reducing computation complexity. This is made possible by 1) the new upsampling layer's much improved reconstruction capability; and more importantly 2) the DUpsampling based decoder's flexibility in leveraging almost arbitrary combinations of the CNN encoders' features. Experiments demonstrate that our proposed decoder outperforms the state-of-the-art decoder, with only ∼20% of computation. Finally, without any post-processing, the framework equipped with our proposed decoder achieves new state-of-the-art performance on two datasets: 88.1% mIOU on PASCAL VOC with 30% computation of the previously best model; and 52.5% mIOU on PASCAL Context.

READ FULL TEXT

page 1

page 2

page 7

page 11

page 12

research
11/15/2017

Squeeze-SegNet: A new fast Deep Convolutional Neural Network for Semantic Segmentation

The recent researches in Deep Convolutional Neural Network have focused ...
research
07/18/2017

The Devil is in the Decoder

Many machine vision applications require predictions for every pixel of ...
research
01/03/2020

Segmentation of Cellular Patterns in Confocal Images of Melanocytic Lesions in vivo via a Multiscale Encoder-Decoder Network (MED-Net)

In-vivo optical microscopy is advancing into routine clinical practice f...
research
06/28/2021

Fractal Pyramid Networks

We propose a new network architecture, the Fractal Pyramid Networks (PFN...
research
11/19/2018

OrthoSeg: A Deep Multimodal Convolutional Neural Network for Semantic Segmentation of Orthoimagery

This paper addresses the task of semantic segmentation of orthoimagery u...
research
09/21/2021

CondNet: Conditional Classifier for Scene Segmentation

The fully convolutional network (FCN) has achieved tremendous success in...
research
02/04/2020

DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction

Maps of brain microarchitecture are important for understanding neurolog...

Please sign up or login with your details

Forgot password? Click here to reset