MoE-SPNet: A Mixture-of-Experts Scene Parsing Network

06/19/2018
by   Huan Fu, et al.
2

Scene parsing is an indispensable component in understanding the semantics within a scene. Traditional methods rely on handcrafted local features and probabilistic graphical models to incorporate local and global cues. Recently, methods based on fully convolutional neural networks have achieved new records on scene parsing. An important strategy common to these methods is the aggregation of hierarchical features yielded by a deep convolutional neural network. However, typical algorithms usually aggregate hierarchical convolutional features via concatenation or linear combination, which cannot sufficiently exploit the diversities of contextual information in multi-scale features and the spatial inhomogeneity of a scene. In this paper, we propose a mixture-of-experts scene parsing network (MoE-SPNet) that incorporates a convolutional mixture-of-experts layer to assess the importance of features from different levels and at different spatial locations. In addition, we propose a variant of mixture-of-experts called the adaptive hierarchical feature aggregation (AHFA) mechanism which can be incorporated into existing scene parsing networks that use skip-connections to fuse features layer-wisely. In the proposed networks, different levels of features at each spatial location are adaptively re-weighted according to the local structure and surrounding contextual information before aggregation. We demonstrate the effectiveness of the proposed methods on two scene parsing datasets including PASCAL VOC 2012 and SceneParse150 based on two kinds of baseline models FCN-8s and DeepLab-ASPP.

READ FULL TEXT

page 17

page 18

page 20

research
11/04/2020

Multi-layer Feature Aggregation for Deep Scene Parsing Models

Scene parsing from images is a fundamental yet challenging problem in vi...
research
11/05/2019

Adaptive Context Network for Scene Parsing

Recent works attempt to improve scene parsing performance by exploring d...
research
05/24/2019

Multi-Scale Dual-Branch Fully Convolutional Network for Hand Parsing

Recently, fully convolutional neural networks (FCNs) have shown signific...
research
08/27/2016

Multi-Path Feedback Recurrent Neural Network for Scene Parsing

In this paper, we consider the scene parsing problem and propose a novel...
research
10/08/2019

Deep Multiphase Level Set for Scene Parsing

Recently, Fully Convolutional Network (FCN) seems to be the go-to archit...
research
08/30/2022

Boosting Night-time Scene Parsing with Learnable Frequency

Night-Time Scene Parsing (NTSP) is essential to many vision applications...
research
07/25/2017

Relative Depth Order Estimation Using Multi-scale Densely Connected Convolutional Networks

We study the problem of estimating the relative depth order of point pai...

Please sign up or login with your details

Forgot password? Click here to reset