LSTM-CF: Unifying Context Modeling and Fusion with LSTMs for RGB-D Scene Labeling

04/18/2016
by   Zhen Li, et al.
0

Semantic labeling of RGB-D scenes is crucial to many intelligent applications including perceptual robotics. It generates pixelwise and fine-grained label maps from simultaneously sensed photometric (RGB) and depth channels. This paper addresses this problem by i) developing a novel Long Short-Term Memorized Context Fusion (LSTM-CF) Model that captures and fuses contextual information from multiple channels of photometric and depth data, and ii) incorporating this model into deep convolutional neural networks (CNNs) for end-to-end training. Specifically, contexts in photometric and depth channels are, respectively, captured by stacking several convolutional layers and a long short-term memory layer; the memory layer encodes both short-range and long-range spatial dependencies in an image along the vertical direction. Another long short-term memorized fusion layer is set up to integrate the contexts along the vertical direction from different channels, and perform bi-directional propagation of the fused vertical contexts along the horizontal direction to obtain true 2D global contexts. At last, the fused contextual representation is concatenated with the convolutional features extracted from the photometric channels in order to improve the accuracy of fine-scale semantic labeling. Our proposed model has set a new state of the art, i.e., 48.1 improvement) on the large-scale SUNRGBD dataset and the NYUDv2dataset, respectively.

READ FULL TEXT

page 11

page 13

page 14

research
11/14/2015

Semantic Object Parsing with Local-Global Long Short-Term Memory

Semantic object parsing is a fundamental task for understanding objects ...
research
04/07/2016

Geometric Scene Parsing with Hierarchical LSTM

This paper addresses the problem of geometric scene parsing, i.e. simult...
research
09/17/2018

Learning of Multi-Context Models for Autonomous Underwater Vehicles

Multi-context model learning is crucial for marine robotics where severa...
research
07/08/2016

Multi-level Contextual RNNs with Attention Model for Scene Labeling

Context in image is crucial for scene labeling while existing methods on...
research
10/26/2020

A Joint Convolutional and Spatial Quad-Directional LSTM Network for Phase Unwrapping

Phase unwrapping is a classical ill-posed problem which aims to recover ...
research
03/28/2016

Deep Embedding for Spatial Role Labeling

This paper introduces the visually informed embedding of word (VIEW), a ...
research
06/04/2018

CFCM: Segmentation via Coarse to Fine Context Memory

Recent neural-network-based architectures for image segmentation make ex...

Please sign up or login with your details

Forgot password? Click here to reset