What Synthesis is Missing: Depth Adaptation Integrated with Weak Supervision for Indoor Scene Parsing

03/23/2019
by Keng-Chi Liu, et al.

Scene parsing is a crucial step in enabling autonomous systems to understand and interact with their surroundings. Supervised deep learning methods have made great progress on scene parsing, but they come at the cost of laborious manual pixel-level annotation. To reduce this effort, both synthetic data and weak supervision have been investigated. Nonetheless, synthetically generated data still suffers from severe domain shift, while weak labels are often imprecise. Moreover, most existing work on weakly supervised scene parsing is limited to salient foreground objects. The aim of this work is hence twofold: exploit synthetic data where feasible and integrate weak supervision where necessary. More concretely, we address this goal by using depth as the transfer domain, because its synthetic-to-real discrepancy is much lower than that of color. At the same time, we perform weak localization from easily obtainable image-level labels and integrate both cues using a novel contour-based scheme. Our approach is implemented as a teacher-student learning framework that solves the transfer learning problem by generating a pseudo ground truth. Using only depth-based adaptation, this approach already outperforms previous transfer learning approaches on SUN RGB-D, a popular indoor scene parsing dataset. Compared to the previous state of the art in transfer learning, our proposed two-stage integration more than halves the gap to fully supervised methods.
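As a rough illustration of the pseudo ground truth idea, the sketch below turns a teacher's per-pixel class probabilities into a label map for a student, using image-level labels to rule out absent classes. It is not the authors' method: the contour-based integration is not reproduced, and the function name `pseudo_ground_truth`, the `IGNORE` label, and the `min_conf` threshold are all assumptions made for this example.

```python
IGNORE = -1  # hypothetical label for pixels excluded from the student's loss


def pseudo_ground_truth(teacher_probs, image_labels, min_conf=0.6):
    """Turn teacher class probabilities into a pseudo label map.

    teacher_probs: H x W x C nested lists of per-pixel class probabilities.
    image_labels: set of class indices known to be present in the image.
    min_conf: assumed confidence threshold below which a pixel is ignored.
    """
    label_map = []
    for row in teacher_probs:
        out_row = []
        for probs in row:
            # Restrict the argmax to classes allowed by the image-level labels.
            allowed = [(p, c) for c, p in enumerate(probs) if c in image_labels]
            if not allowed:
                out_row.append(IGNORE)
                continue
            best_p, best_c = max(allowed)
            # Keep only confident predictions; the rest are left unlabeled.
            out_row.append(best_c if best_p >= min_conf else IGNORE)
        label_map.append(out_row)
    return label_map


# Toy 1 x 3 "image" with three classes; class 2 is absent per the image-level labels.
probs = [[[0.1, 0.2, 0.7], [0.8, 0.1, 0.1], [0.4, 0.35, 0.25]]]
print(pseudo_ground_truth(probs, image_labels={0, 1}))  # [[-1, 0, -1]]
```

In this toy case the first pixel's strongest class is ruled out by the image-level labels and the remaining candidates are not confident enough, so the pixel is ignored instead of being mislabeled.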

