Depth CNNs for RGB-D scene recognition: learning from scratch better than transferring from RGB-CNNs

01/21/2018
by Xinhang Song, et al.

Scene recognition with RGB images has been extensively studied and has reached remarkable accuracy, thanks to convolutional neural networks (CNNs) and large scene datasets. In contrast, current RGB-D scene data is much more limited, so RGB-D methods often leverage large RGB datasets by transferring pretrained RGB CNN models and fine-tuning them on the target RGB-D dataset. However, we show that this approach hardly reaches the bottom layers, which are key to learning modality-specific features. Instead, we focus on the bottom layers and propose an alternative strategy for learning depth features that combines local weakly supervised training from patches with subsequent global fine-tuning on images. This strategy is capable of learning very discriminative depth-specific features from limited depth images, without resorting to Places-CNN. In addition, we propose a modified CNN architecture to better match the complexity of the model to the amount of data available. For RGB-D scene recognition, depth and RGB features are combined by projecting them into a common space and further learning a multilayer classifier, which is jointly optimized in an end-to-end network. Our framework achieves state-of-the-art accuracy on NYU2 and SUN RGB-D with both depth-only and combined RGB-D data.
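The fusion step described in the abstract (projecting depth and RGB features into a common space, then applying a multilayer classifier) might look roughly like the forward pass below. This is only an illustrative sketch, not the authors' implementation: the function names, the layer sizes, the ReLU activations, and the choice to concatenate the two projected vectors are all assumptions made for the example, and the weights are random rather than trained.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map; `weights` holds one row of coefficients per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases (hypothetical initialization)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def make_fusion_head(rng, rgb_dim, depth_dim, common_dim, hidden_dim, num_classes):
    """Hypothetical fusion head: one projection per modality plus a small classifier."""
    return {
        "rgb_proj": init_layer(rng, rgb_dim, common_dim),
        "depth_proj": init_layer(rng, depth_dim, common_dim),
        "hidden": init_layer(rng, 2 * common_dim, hidden_dim),
        "out": init_layer(rng, hidden_dim, num_classes),
    }


def fuse_and_classify(head, rgb_feat, depth_feat):
    # Project each modality into a space of the same dimensionality.
    rgb_common = relu(linear(rgb_feat, *head["rgb_proj"]))
    depth_common = relu(linear(depth_feat, *head["depth_proj"]))
    # Assumption: the projected vectors are concatenated before classification.
    joint = rgb_common + depth_common
    hidden = relu(linear(joint, *head["hidden"]))
    return softmax(linear(hidden, *head["out"]))


rng = random.Random(0)
head = make_fusion_head(rng, rgb_dim=8, depth_dim=6, common_dim=4, hidden_dim=5, num_classes=3)
probs = fuse_and_classify(head, [0.5] * 8, [0.2] * 6)
print(len(probs), round(sum(probs), 6))
```

In an end-to-end setting, the two feature vectors would come from the RGB and depth CNN branches, and the projections and classifier would be trained jointly with them by backpropagation. The sketch covers only the forward computation of the fusion head.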


research
09/17/2018

Learning Effective RGB-D Representations for Scene Recognition

Deep convolutional networks (CNN) can achieve impressive results on RGB ...
research
11/01/2019

Centroid-Based Scene Classification (CBSC): Using Deep Features and Clustering for RGB-D Indoor Scene Classification

This paper contributes a novel method for RGB-D indoor scene classificat...
research
07/04/2023

Consistent Multimodal Generation via A Unified GAN Framework

We investigate how to generate multimodal image outputs, such as RGB, de...
research
07/24/2015

Multimodal Deep Learning for Robust RGB-D Object Recognition

Robust object recognition is a crucial ingredient of many, if not all, r...
research
03/26/2021

Translate to Adapt: RGB-D Scene Recognition across Domains

Scene classification is one of the basic problems in computer vision res...
research
06/03/2015

Understanding deep features with computer-generated imagery

We introduce an approach for analyzing the variation of features generat...
research
09/30/2016

A deep representation for depth images from synthetic data

Convolutional Neural Networks (CNNs) trained on large scale RGB database...
