Object-Aware Cropping for Self-Supervised Learning

by Shlok Mishra, et al.

A core component of the recent success of self-supervised learning is cropping data augmentation, which selects sub-regions of an image to be used as positive views in the self-supervised loss. The underlying assumption is that randomly cropped and resized regions of a given image share information about the objects of interest, which the learned representation will capture. This assumption is mostly satisfied in datasets such as ImageNet, where there is a large, centered object that is highly likely to be present in random crops of the full image. However, in other datasets such as OpenImages or COCO, which are more representative of real-world uncurated data, there are typically multiple small objects in an image. In this work, we show that self-supervised learning based on the usual random cropping performs poorly on such datasets. We propose replacing one or both of the random crops with crops obtained from an object proposal algorithm. This encourages the model to learn both object-level and scene-level semantic representations. Using this approach, which we call object-aware cropping, results in significant improvements over scene cropping on classification and object detection benchmarks. For example, on OpenImages, our approach achieves an improvement of 8.8% mAP over random scene-level cropping using MoCo-v2 based pre-training. We also show significant improvements on COCO and PASCAL-VOC object detection and segmentation tasks over the state-of-the-art self-supervised learning approaches. Our approach is efficient, simple and general, and can be used in most existing contrastive and non-contrastive self-supervised learning frameworks.
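To make the cropping idea concrete, here is a minimal sketch of how a pair of positive views might be chosen when one random crop is replaced by an object proposal. It is an illustration under stated assumptions, not the authors' implementation: the function names `random_crop_box` and `object_aware_pair`, the `scale` range, and the fixed aspect ratio are all hypothetical, and the proposal boxes are assumed to come from some external object proposal method that is not shown.

```python
import random


def random_crop_box(width, height, rng, scale=(0.2, 1.0)):
    """Sample a random scene-level crop as (left, top, right, bottom).

    Assumption of this sketch: the crop keeps the image aspect ratio and
    covers a fraction of the image area drawn uniformly from `scale`.
    """
    area_frac = rng.uniform(*scale)
    side = area_frac ** 0.5
    w = max(1, int(width * side))
    h = max(1, int(height * side))
    left = rng.randint(0, width - w)
    top = rng.randint(0, height - h)
    return (left, top, left + w, top + h)


def object_aware_pair(width, height, proposals, rng):
    """Return two crop boxes to use as positive views of one image.

    `proposals` is a list of (left, top, right, bottom) boxes from any
    object proposal method (hypothetical input). One view is an object
    crop and the other a random scene crop; with no proposals, both
    views fall back to random crops.
    """
    scene = random_crop_box(width, height, rng)
    if not proposals:
        return random_crop_box(width, height, rng), scene
    left, top, right, bottom = rng.choice(proposals)
    # Clip the chosen proposal to the image bounds.
    obj = (max(0, left), max(0, top), min(width, right), min(height, bottom))
    return obj, scene


rng = random.Random(0)
obj_box, scene_box = object_aware_pair(640, 480, [(100, 120, 300, 360)], rng)
```

Both boxes would then be cropped, resized, and passed through the usual augmentation pipeline of the chosen self-supervised framework. This sketch covers only the case where one of the two random crops is replaced; replacing both would mean drawing the second view from the proposals as well.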




