Recurrent Convolutional Neural Networks for Scene Parsing

06/12/2013
by   Pedro H. O. Pinheiro, et al.
0

Scene parsing is a technique that consist on giving a label to all pixels in an image according to the class they belong to. To ensure a good visual coherence and a high class accuracy, it is essential for a scene parser to capture image long range dependencies. In a feed-forward architecture, this can be simply achieved by considering a sufficiently large input context patch, around each pixel to be labeled. We propose an approach consisting of a recurrent convolutional neural network which allows us to consider a large input context, while limiting the capacity of the model. Contrary to most standard approaches, our method does not rely on any segmentation methods, nor any task-specific features. The system is trained in an end-to-end manner over raw pixels, and models complex spatial dependencies with low inference cost. As the context size increases with the built-in recurrence, the system identifies and corrects its own errors. Our approach yields state-of-the-art performance on both the Stanford Background Dataset and the SIFT Flow Dataset, while remaining very fast at test time.

READ FULL TEXT

page 1

page 8

page 9

page 14

research
11/09/2018

Scene Parsing via Dense Recurrent Neural Networks with Attentional Selection

Recurrent neural networks (RNNs) have shown the ability to improve scene...
research
02/10/2012

Scene Parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers

Scene parsing, or semantic segmentation, consists in labeling each pixel...
research
11/15/2014

Deep Deconvolutional Networks for Scene Parsing

Scene parsing is an important and challenging prob- lem in computer visi...
research
09/23/2017

Semi-Supervised Hierarchical Semantic Object Parsing

Models based on Convolutional Neural Networks (CNNs) have been proven ve...
research
09/14/2019

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations

Learning representations that accurately capture long-range dependencies...
research
11/28/2016

Social Scene Understanding: End-to-End Multi-Person Action Localization and Collective Activity Recognition

We present a unified framework for understanding human social behaviors ...
research
04/20/2016

Scene Parsing with Integration of Parametric and Non-parametric Models

We adopt Convolutional Neural Networks (CNNs) to be our parametric model...

Please sign up or login with your details

Forgot password? Click here to reset