Semantic Object Parsing with Local-Global Long Short-Term Memory

11/14/2015
by Xiaodan Liang, et al.

Semantic object parsing is a fundamental computer vision task for understanding objects in detail, and incorporating multi-level contextual information is critical for achieving such fine-grained, pixel-level recognition. Prior methods often exploit contextual information only by post-processing the predicted confidence maps. In this work, we propose a novel deep Local-Global Long Short-Term Memory (LG-LSTM) architecture that seamlessly incorporates short-distance and long-distance spatial dependencies into feature learning over all pixel positions. In each LG-LSTM layer, local guidance from neighboring positions and global guidance from the whole image are imposed on each position to better exploit complex local and global contextual information. Individual LSTMs for distinct spatial dimensions are also used to capture the various spatial layouts of semantic parts in the images, yielding distinct hidden and memory cells at each position for each dimension. In our parsing approach, several LG-LSTM layers are stacked and appended to the intermediate convolutional layers to directly enhance visual features, allowing the network parameters to be learned end to end. The long chains of sequential computation in the stacked LG-LSTM layers also let each pixel sense a much larger region during inference, because dependencies from all positions along all dimensions are memorized from previous layers. Comprehensive evaluations on three public datasets demonstrate that LG-LSTM significantly outperforms other state-of-the-art methods.
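The local and global guidance described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the neighborhood size, how the global guidance is computed, or the gate layout, so the 8-neighbor window, the 3x3 grid max-pooling, the single shared LSTM (rather than one LSTM per spatial dimension), and all names (`gather_local`, `gather_global`, `lg_lstm_step`, `weights`) are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gather_local(h):
    """Concatenate the hidden states of the 8 neighbors of every position.

    Assumption: an 8-neighborhood with zero padding at the image border.
    h has shape (H, W, d); the result has shape (H, W, 8 * d).
    """
    H, W, _ = h.shape
    padded = np.pad(h, ((1, 1), (1, 1), (0, 0)))  # zero padding
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dy, dx) != (0, 0)]
    neighbors = [padded[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
                 for dy, dx in offsets]
    return np.concatenate(neighbors, axis=-1)

def gather_global(h, grid=3):
    """Summarize the whole map and hand the same summary to every position.

    Assumption: max-pooling over a grid x grid partition of the map.
    Requires H >= grid and W >= grid. Result shape: (H, W, grid * grid * d).
    """
    H, W, _ = h.shape
    ys = np.linspace(0, H, grid + 1).astype(int)
    xs = np.linspace(0, W, grid + 1).astype(int)
    pooled = [h[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max(axis=(0, 1))
              for i in range(grid) for j in range(grid)]
    summary = np.concatenate(pooled)
    return np.broadcast_to(summary, (H, W, summary.size))

def lg_lstm_step(x, h, c, weights, bias):
    """One simplified layer update for all positions at once.

    x: input features (H, W, dx); h, c: previous hidden/memory cells (H, W, d).
    weights: (dx + 8*d + 9*d, 4*d); bias: (4*d,). A single shared LSTM is used
    here, which is a simplification of the per-dimension LSTMs in the abstract.
    """
    z = np.concatenate([x, gather_local(h), gather_global(h)], axis=-1)
    gates = z @ weights + bias
    i, f, o, g = np.split(gates, 4, axis=-1)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update memory cells
    h_new = sigmoid(o) * np.tanh(c_new)               # update hidden cells
    return h_new, c_new
```

Calling `lg_lstm_step` repeatedly, feeding each layer's `h_new` and `c_new` into the next, mimics the stacking idea: after each layer, a position's hidden state depends on a progressively larger region of the map.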


research
07/28/2016

A Siamese Long Short-Term Memory Architecture for Human Re-Identification

Matching pedestrians across multiple camera views known as human re-iden...
research
04/18/2016

LSTM-CF: Unifying Context Modeling and Fusion with LSTMs for RGB-D Scene Labeling

Semantic labeling of RGB-D scenes is crucial to many intelligent applica...
research
03/23/2016

Semantic Object Parsing with Graph LSTM

By taking the semantic object parsing task as an exemplar application sc...
research
06/10/2017

Direct detection of pixel-level myocardial infarction areas via a deep-learning algorithm

Accurate detection of the myocardial infarction (MI) area is crucial for...
research
03/08/2017

Interpretable Structure-Evolving LSTM

This paper develops a general framework for learning interpretable data ...
research
01/29/2023

Deep Learning for Human Parsing: A Survey

Human parsing is a key topic in image processing with many applications,...
research
12/14/2016

Neural Emoji Recommendation in Dialogue Systems

Emoji is an essential component in dialogues which has been broadly util...
