Malleable 2.5D Convolution: Learning Receptive Fields along the Depth-axis for RGB-D Scene Parsing

07/18/2020
by Yajie Xing, et al.

Depth data provide geometric information that can improve RGB-D scene parsing. Several recent works propose RGB-D convolution operators that construct receptive fields along the depth-axis to handle 3D neighborhood relations between pixels. However, these methods pre-define the depth receptive fields through hyperparameters, so their performance depends on parameter selection. In this paper, we propose a novel operator called malleable 2.5D convolution to learn the receptive field along the depth-axis. A malleable 2.5D convolution has one or more 2D convolution kernels. Our method assigns each pixel to one of the kernels, or to none of them, according to their relative depth differences, and the assignment is formulated in a differentiable form so that it can be learnt by gradient descent. The proposed operator runs on standard 2D feature maps and can be seamlessly incorporated into pre-trained CNNs. We conduct extensive experiments on two challenging RGB-D semantic segmentation datasets, NYUDv2 and Cityscapes, to validate the effectiveness and the generalization ability of our method.
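To make the assignment idea concrete, the following is a minimal NumPy sketch of one possible differentiable assignment. It is an illustration under stated assumptions, not the paper's actual formulation: the function names `soft_assign` and `malleable_conv_at_pixel`, the per-kernel depth offsets `centers`, the `temperature` parameter, and the fixed logit for the "none" slot are all hypothetical choices made here. Each neighbor's depth difference from the center pixel is turned into soft weights over the kernels plus a "none" slot, and the output is the weighted sum of the per-kernel 2D convolutions.

```python
import numpy as np


def soft_assign(depth_diff, centers, temperature=1.0):
    """Softly assign relative depth differences to one of K kernels or to 'none'.

    depth_diff: shape (N,), neighbor depth minus center-pixel depth.
    centers:    shape (K,), hypothetical learnable depth offsets, one per kernel.
    Returns weights of shape (N, K + 1); the last column is the 'none' slot.
    """
    depth_diff = np.asarray(depth_diff, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Negative squared distance to each kernel's depth offset acts as a logit.
    logits = -((depth_diff[:, None] - centers[None, :]) ** 2) / temperature
    # Assumed constant logit for 'none': neighbors far from every offset land here.
    none_logit = np.full((depth_diff.shape[0], 1), -1.0)
    logits = np.concatenate([logits, none_logit], axis=1)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def malleable_conv_at_pixel(feat_patch, depth_patch, kernels, centers, temperature=1.0):
    """Compute the output vector at the center of a single k x k patch.

    feat_patch:  (k, k, C_in) features on a standard 2D feature map.
    depth_patch: (k, k) depth values aligned with the features.
    kernels:     (K, k, k, C_in, C_out), one 2D kernel per depth slot.
    """
    k = depth_patch.shape[0]
    center_depth = depth_patch[k // 2, k // 2]
    diff = (depth_patch - center_depth).reshape(-1)
    # Drop the 'none' column: pixels assigned to it contribute nothing.
    w = soft_assign(diff, centers, temperature)[:, :-1]  # (k*k, K)
    w = w.T.reshape(len(centers), k, k)                  # (K, k, k)
    # Sum over kernels (n), spatial positions (i, j), and input channels (c).
    return np.einsum("nij,ijc,nijco->o", w, feat_patch, kernels)
```

Because the weights come from a softmax, they are smooth in `centers`, so in an autodiff framework the offsets could be trained by gradient descent together with the kernels. A neighbor whose depth is far from every offset receives almost all of its weight in the "none" slot and is effectively ignored.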


research
10/03/2019

3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation

A key challenge for RGB-D segmentation is how to effectively incorporate...
research
03/19/2018

Depth-aware CNN for RGB-D Segmentation

Convolutional neural networks (CNN) are limited by the lack of capabilit...
research
06/08/2022

Depth-Adapted CNNs for RGB-D Semantic Segmentation

Recent RGB-D semantic segmentation has motivated research interest thank...
research
04/05/2020

Anisotropic Convolutional Networks for 3D Semantic Scene Completion

As a voxel-wise labeling task, semantic scene completion (SSC) tries to ...
research
10/07/2019

Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation

Convolutional networks are not aware of an object's geometric variations...
research
08/26/2019

See More Than Once -- Kernel-Sharing Atrous Convolution for Semantic Segmentation

The state-of-the-art semantic segmentation solutions usually leverage di...
research
06/17/2014

Replicating Kernels with a Short Stride Allows Sparse Reconstructions with Fewer Independent Kernels

In sparse coding it is common to tile an image into nonoverlapping patch...
