UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

03/14/2022
by   Zhaowen Li, et al.

Self-supervised learning (SSL) holds promise in leveraging large amounts of unlabeled data. However, the success of popular SSL methods has been limited to single-centric-object images like those in ImageNet, and these methods ignore the correlation between the scene and its instances, as well as the semantic differences among instances in the scene. To address these problems, we propose Unified Self-supervised Visual Pre-training (UniVIP), a novel self-supervised framework for learning versatile visual representations on either single-centric-object or non-iconic datasets. The framework takes into account representation learning at three levels: 1) the similarity of scene-scene, 2) the correlation of scene-instance, 3) the discrimination of instance-instance. During learning, we adopt the optimal transport algorithm to automatically measure the discrimination of instances. Extensive experiments show that UniVIP pre-trained on non-iconic COCO achieves state-of-the-art transfer performance on a variety of downstream tasks, such as image classification, semi-supervised learning, object detection and segmentation. Furthermore, our method can also exploit single-centric-object datasets such as ImageNet: it outperforms BYOL by 2.5% with the same pre-training epochs in linear probing and surpasses current self-supervised object detection methods on the COCO dataset, demonstrating its universality and potential.
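To make the optimal transport step concrete, the following is a minimal sketch of how instance features from two views could be softly matched with entropy-regularized optimal transport. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which solver is used, so the Sinkhorn iteration, the cosine-distance cost, the uniform marginals, and the names `sinkhorn_plan`, `instance_matching`, `eps`, and `n_iters` are all hypothetical choices made here.

```python
import numpy as np


def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport between two uniform marginals.

    Assumption: Sinkhorn iterations stand in for "the optimal transport
    algorithm"; the abstract does not name a specific solver.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform mass over instances in view A
    b = np.full(m, 1.0 / m)  # uniform mass over instances in view B
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)    # rescale rows toward marginal a
        v = b / (kernel.T @ u)  # rescale columns toward marginal b
    return u[:, None] * kernel * v[None, :]


def instance_matching(feats_a, feats_b, eps=0.1):
    """Soft matching of instance features using a cosine-distance cost."""
    fa = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    fb = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    cost = 1.0 - fa @ fb.T  # hypothetical cost: 1 - cosine similarity
    return sinkhorn_plan(cost, eps=eps)


# Toy data: four instance embeddings and a slightly perturbed second view.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(4, 16))
feats_b = feats_a + 0.05 * rng.normal(size=(4, 16))

plan = instance_matching(feats_a, feats_b)
best_match = plan.argmax(axis=1)  # most likely partner for each instance
```

In this toy setup each instance should be matched to its own perturbed copy, and the off-diagonal mass in `plan` gives a soft measure of how easily two different instances are confused. A smaller `eps` makes the plan closer to a hard assignment, at the cost of slower and less stable convergence.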


