3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation

10/03/2019
by   Yunlu Chen, et al.

A key challenge for RGB-D segmentation is how to effectively incorporate 3D geometric information from the depth channel into 2D appearance features. We propose to model the effective receptive field of a 2D convolution based on the scale and locality of the 3D neighborhood. Standard convolutions are local in image space (u, v), often with a fixed receptive field of 3x3 pixels. We instead define convolutions that are local with respect to the corresponding point in 3D real-world space (x, y, z): the depth channel adapts the receptive field of the convolution, so that the resulting filters are invariant to scale and focus on a certain range of depth. We introduce 3D Neighborhood Convolution (3DN-Conv), a convolutional operator around 3D neighborhoods. Further, by using estimated depth, we can apply our RGB-D semantic segmentation model to RGB-only input. Experimental results validate that our proposed 3DN-Conv operator improves semantic segmentation, using either ground-truth depth (RGB-D) or estimated depth (RGB).
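
To make the idea of a depth-adapted receptive field concrete, here is a minimal NumPy sketch. It is not the 3DN-Conv operator from the paper, whose exact formulation is not given in the abstract. It is an illustration under stated assumptions: a pinhole camera, a single-channel feature map, a 3x3 filter whose tap spacing in pixels shrinks with depth, and a hard depth threshold that drops taps lying outside the 3D neighborhood. The function name `depth_aware_conv3x3` and the parameters `focal`, `size_m`, and `depth_tol` are hypothetical.

```python
import numpy as np

def depth_aware_conv3x3(feat, depth, weight, focal=500.0, size_m=0.1, depth_tol=0.2):
    """Illustrative depth-adaptive 3x3 convolution (an assumption-laden sketch,
    not the paper's exact operator).

    feat:      (H, W) single-channel feature map
    depth:     (H, W) depth in metres, strictly positive
    weight:    (3, 3) filter
    focal:     hypothetical focal length in pixels
    size_m:    hypothetical real-world spacing between filter taps, in metres
    depth_tol: taps whose depth differs from the centre by more than this are skipped
    """
    H, W = feat.shape
    out = np.zeros((H, W), dtype=np.float64)
    for v in range(H):
        for u in range(W):
            z = depth[v, u]
            # Pinhole projection: a fixed metric spacing covers fewer pixels far away,
            # so the tap spacing (dilation) shrinks as depth grows.
            step = max(1, int(round(focal * size_m / z)))
            acc = 0.0
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    vv, uu = v + i * step, u + j * step
                    if not (0 <= vv < H and 0 <= uu < W):
                        continue  # behaves like zero padding outside the image
                    if abs(depth[vv, uu] - z) > depth_tol:
                        continue  # tap lies outside the 3D neighborhood of the centre
                    acc += weight[i + 1, j + 1] * feat[vv, uu]
            out[v, u] = acc
    return out
```

Under these assumptions, nearby surfaces are sampled with a wider pixel spacing than distant ones, which is one simple way to approximate scale invariance, and taps that fall across a depth discontinuity do not contribute to the response. A practical version would be vectorized, handle multiple channels, and likely use a soft rather than hard depth weighting.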


research
07/18/2020

Malleable 2.5D Convolution: Learning Receptive Fields along the Depth-axis for RGB-D Scene Parsing

Depth data provide geometric information that can bring progress in RGB-...
research
08/05/2017

Depth Adaptive Deep Neural Network for Semantic Segmentation

In this work, we present the depth-adaptive deep neural network using a ...
research
12/04/2018

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

We tackle the problem of using 3D information in convolutional neural ne...
research
03/13/2020

Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation

Online semantic 3D segmentation in company with real-time RGB-D reconstr...
research
12/02/2021

Object-aware Monocular Depth Prediction with Instance Convolutions

With the advent of deep learning, estimating depth from a single RGB ima...
research
05/20/2017

Recurrent Scene Parsing with Perspective Understanding in the Loop

Objects may appear at arbitrary scales in perspective images of a scene,...
research
04/05/2020

Anisotropic Convolutional Networks for 3D Semantic Scene Completion

As a voxel-wise labeling task, semantic scene completion (SSC) tries to ...
