Unsupervised Object-Level Representation Learning from Scene Images

06/22/2021
by Jiahao Xie, et al.

Contrastive self-supervised learning has largely narrowed the gap to supervised pre-training on ImageNet. However, its success relies heavily on the object-centric prior of ImageNet, i.e., that different augmented views of the same image correspond to the same object. This heavily curated constraint no longer holds when pre-training on more complex scene images that contain many objects. To overcome this limitation, we introduce Object-level Representation Learning (ORL), a new self-supervised learning framework for scene images. Our key insight is to leverage image-level self-supervised pre-training as a prior for discovering object-level semantic correspondence, thereby enabling object-level representation learning from scene images. Extensive experiments on COCO show that ORL significantly improves the performance of self-supervised learning on scene images, even surpassing supervised ImageNet pre-training on several downstream tasks. Furthermore, ORL's downstream performance improves as more unlabeled scene images become available, demonstrating its potential for harnessing unlabeled data in the wild. We hope our approach can motivate future research on more general-purpose unsupervised representation learning from scene data. Project page: https://www.mmlab-ntu.com/project/orl/.

research
03/14/2022

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

Self-supervised learning (SSL) holds promise in leveraging large amounts...
research
02/22/2023

Saliency Guided Contrastive Learning on Scene Images

Self-supervised learning holds promise in leveraging large numbers of un...
research
07/30/2021

Object-aware Contrastive Learning for Debiased Scene Representation

Contrastive self-supervised learning has shown impressive results in lea...
research
05/17/2021

Divide and Contrast: Self-supervised Learning from Uncurated Data

Self-supervised learning holds promise in leveraging large amounts of un...
research
01/14/2022

Boundary-aware Self-supervised Learning for Video Scene Segmentation

Self-supervised learning has drawn attention through its effectiveness i...
research
01/02/2023

ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

Driven by improved architectures and better representation learning fram...
research
03/16/2023

Self-Supervised Visual Representation Learning on Food Images

Food image analysis is the groundwork for image-based dietary assessment...
