Pixel-Level Dense Prediction without Decoder

09/22/2019
by   Xin Cai, et al.
30

Pixel-level dense prediction tasks such as keypoint estimation are dominated by encoder-decoder structures, where the decoder as a vital component is complex and computationally intensive. In contrast, we propose a fully decoding-free pixel-level dense prediction network called FlatteNet, in which the high dimensional tensor outputted by the backbone network is directly flattened to fit the desired output resolution. The proposed FlatteNet is end-to-end differentiable. By removing the decoder unit, FlatteNet requires much fewer parameters and lower computational complexity. We empirically demonstrate the effectiveness of the proposed network through competitive results in human pose estimation on MPII, semantic segmentation on PASCAL-Context, and object detection on PASCAL VOC. We hope that the proposed FlatteNet can serve as a simple and strong alternative of current mainstream decoder-based pixel-level dense prediction networks.

READ FULL TEXT

page 5

page 7

page 9

research
11/19/2020

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

Contrastive learning methods for unsupervised visual representation lear...
research
06/01/2021

Full-Resolution Encoder-Decoder Networks with Multi-Scale Feature Fusion for Human Pose Estimation

To achieve more accurate 2D human pose estimation, we extend the success...
research
08/26/2019

SPGNet: Semantic Prediction Guidance for Scene Parsing

Multi-scale context module and single-stage encoder-decoder structure ar...
research
04/16/2019

SparseMask: Differentiable Connectivity Learning for Dense Image Prediction

In this paper, we aim at automatically searching an efficient network ar...
research
04/01/2021

Confidence Adaptive Anytime Pixel-Level Recognition

Anytime inference requires a model to make a progression of predictions ...
research
06/20/2021

More than Encoder: Introducing Transformer Decoder to Upsample

General segmentation models downsample images and then upsample to resto...
research
01/11/2019

FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction

The basic principles in designing convolutional neural network (CNN) str...

Please sign up or login with your details

Forgot password? Click here to reset