RGB-based Semantic Segmentation Using Self-Supervised Depth Pre-Training

02/06/2020
by Jean Lahoud, et al.

Although well-known large-scale datasets, such as ImageNet, have driven image understanding forward, most of these datasets require extensive manual annotation and are thus not easily scalable. This limits the advancement of image understanding techniques. The impact of these large-scale datasets can be observed in almost every vision task and technique in the form of pre-training for initialization. In this work, we propose an easily scalable, self-supervised technique that can be used to pre-train any RGB semantic segmentation method. In particular, our pre-training approach makes use of automatically generated labels that can be obtained using depth sensors. These labels, denoted HN-labels, represent different height and normal patches, which allow mining of local semantic information that is useful for RGB semantic segmentation. We show how our proposed self-supervised pre-training with HN-labels can replace ImageNet pre-training while using 25x fewer images and without requiring any manual labeling. Because we pre-train a semantic segmentation network with our HN-labels, the pre-training task resembles our final task more closely than a less related task does, e.g., classification with ImageNet. We evaluate on two datasets (NYUv2 and CamVid) and show that this similarity in tasks is advantageous not only in speeding up the pre-training process, but also in achieving better final semantic segmentation accuracy than ImageNet pre-training.
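The abstract does not spell out how HN-labels are computed, so the following is only a rough sketch of the general idea: deriving per-pixel labels from a depth map by quantizing height and surface-normal orientation. The function name `hn_labels_sketch`, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the bin counts, the uniform binning, and the assumption that the camera's y-axis points straight down are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def hn_labels_sketch(depth, fx, fy, cx, cy, n_height_bins=5, n_normal_bins=3):
    """Toy height/normal labels from a depth map (illustrative assumptions only)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Back-project pixels to camera-frame 3D points with a pinhole model.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1)  # shape (h, w, 3)

    # Approximate surface normals from finite differences along image axes.
    du = np.gradient(points, axis=1)
    dv = np.gradient(points, axis=0)
    normals = np.cross(du, dv)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True) + 1e-8

    # Assumption: the camera y-axis points down and is aligned with gravity.
    up = np.array([0.0, -1.0, 0.0])
    height = -y
    height = height - height.min()  # height above the lowest visible point

    # Quantize height into uniform bins over the observed range.
    height_edges = np.linspace(0.0, height.max() + 1e-8, n_height_bins + 1)[1:-1]
    height_bin = np.digitize(height, height_edges)

    # Quantize normals by how closely they align with the up direction.
    alignment = np.abs(normals @ up)  # 0 = vertical surface, 1 = horizontal
    normal_edges = np.linspace(0.0, 1.0, n_normal_bins + 1)[1:-1]
    normal_bin = np.digitize(alignment, normal_edges)

    # Combine both bin indices into a single integer label per pixel.
    return height_bin * n_normal_bins + normal_bin

# Example: a flat wall 2 m in front of the camera.
depth = np.full((4, 6), 2.0)
labels = hn_labels_sketch(depth, fx=500.0, fy=500.0, cx=3.0, cy=2.0)
```

A label map produced this way could serve as the target for a standard segmentation loss during pre-training, with the network seeing only the RGB image as input. The actual patch definitions used for HN-labels are given in the full text.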

research
02/28/2023

Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors

Current popular backbones in computer vision, such as Vision Transformer...
research
03/26/2022

Does Monocular Depth Estimation Provide Better Pre-training than Classification for Semantic Segmentation?

Training a deep neural network for semantic segmentation is labor-intens...
research
10/04/2022

Self-supervised Pre-training for Semantic Segmentation in an Indoor Scene

The ability to endow maps of indoor scenes with semantic information is ...
research
12/02/2017

Mix-and-Match Tuning for Self-Supervised Semantic Segmentation

Deep convolutional networks for semantic image segmentation typically re...
research
02/14/2022

COLA: COarse LAbel pre-training for 3D semantic segmentation of sparse LiDAR datasets

Transfer learning is a proven technique in 2D computer vision to leverag...
research
06/05/2021

Points2Polygons: Context-Based Segmentation from Weak Labels Using Adversarial Networks

In applied image segmentation tasks, the ability to provide numerous and...
