CaseNet: Content-Adaptive Scale Interaction Networks for Scene Parsing

04/17/2019
by Xin Jin, et al.

Objects in an image exhibit diverse scales. Adaptive receptive fields are needed to capture a suitable range of context for accurate pixel-level semantic prediction on objects of diverse sizes. Recently, atrous convolution with different dilation rates has been used to generate multi-scale features through several branches, and these features are then fused for prediction. However, there is a lack of explicit interaction among the branches to adaptively make full use of the contexts. In this paper, we propose a Content-Adaptive Scale Interaction Network (CaseNet) to exploit the multi-scale features for scene parsing. We build CaseNet on the classic Atrous Spatial Pyramid Pooling (ASPP) module, followed by the proposed contextual scale interaction (CSI) module and the scale adaptation (SA) module. Specifically, first, for each spatial position, we enable context interaction among different scales through scale-aware non-local operations across the scales, i.e., the CSI module, which facilitates the generation of flexible mixed receptive fields instead of a traditional flat one. Second, the SA module explicitly and softly selects the suitable scale for each spatial position and each channel. Ablation studies demonstrate the effectiveness of the proposed modules. We achieve state-of-the-art performance on three scene parsing benchmarks: Cityscapes, ADE20K, and LIP.
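
To make the two steps in the abstract concrete, here is a minimal sketch of how cross-scale interaction followed by soft scale selection might look at a single spatial position. It is not the authors' implementation: the function names (`scale_interaction`, `scale_adaptation`), the plain dot-product affinity, the residual sum, and the externally supplied `logits` are all assumptions made for illustration. The abstract only says that the CSI module uses scale-aware non-local operations and that the SA module softly selects a scale per position and per channel.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scale_interaction(branches):
    """Hypothetical stand-in for the CSI step at one spatial position.

    branches: list of S feature vectors, one per ASPP branch (scale).
    Each scale attends to every scale via a dot-product affinity
    (an assumption; no learned projections are modeled here), and the
    attended mixture is added back to the original feature.
    """
    out = []
    for query in branches:
        scores = [sum(q * k for q, k in zip(query, key)) for key in branches]
        weights = softmax(scores)
        mixed = [
            sum(w * b[c] for w, b in zip(weights, branches))
            for c in range(len(query))
        ]
        out.append([q + m for q, m in zip(query, mixed)])
    return out


def scale_adaptation(branches, logits):
    """Hypothetical stand-in for the SA step at one spatial position.

    logits[s][c] is an assumed per-scale, per-channel score; in a real
    network it would be predicted from the features. A softmax over the
    scale axis gives a soft selection for each channel.
    """
    fused = []
    for c in range(len(branches[0])):
        weights = softmax([row[c] for row in logits])
        fused.append(sum(w * b[c] for w, b in zip(weights, branches)))
    return fused


# Three scales, two channels, at one position (toy values).
branches = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
uniform_logits = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

mixed = scale_interaction(branches)
fused = scale_adaptation(mixed, uniform_logits)
```

With uniform logits the selection reduces to a plain average over scales; making one scale's logit for a channel much larger than the others drives that channel toward that scale's feature, which is the "soft selection" behaviour the abstract describes.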

research
12/04/2016

Pyramid Scene Parsing Network

Scene parsing is challenging for unrestricted open vocabulary and divers...
research
04/25/2019

Multi-scale Cross-form Pyramid Network for Stereo Matching

Stereo matching plays an indispensable part in autonomous driving, robot...
research
07/07/2019

ASCNet: Adaptive-Scale Convolutional Neural Networks for Multi-Scale Feature Learning

Extracting multi-scale information is key to semantic segmentation. Howe...
research
11/05/2019

Adaptive Context Network for Scene Parsing

Recent works attempt to improve scene parsing performance by exploring d...
research
09/14/2020

GINet: Graph Interaction Network for Scene Parsing

Recently, context reasoning using image regions beyond local convolution...
research
03/14/2021

SaNet: Scale-aware neural Network for Parsing Multiple Spatial Resolution Aerial Images

Assigning the geospatial objects of aerial images with categorical infor...
research
07/18/2020

Feature Pyramid Transformer

Feature interactions across space and scales underpin modern visual reco...
