Correlational Image Modeling for Self-Supervised Visual Pre-Training

03/22/2023
by Wei Li, et al.

We introduce Correlational Image Modeling (CIM), a novel and surprisingly effective approach to self-supervised visual pre-training. CIM performs a simple pretext task: we randomly crop image regions (exemplars) from an input image (context) and predict correlation maps between the exemplars and the context. Three key designs make correlational image modeling a nontrivial and meaningful self-supervisory task. First, to generate useful exemplar-context pairs, we crop image regions with various scales, shapes, rotations, and transformations. Second, we employ a bootstrap learning framework that involves online and target encoders. During pre-training, the former takes exemplars as inputs while the latter encodes the context. Third, we model the output correlation maps via a simple cross-attention block, within which the context serves as queries and the exemplars provide keys and values. We show that CIM performs on par with or better than the current state of the art on self-supervised and transfer benchmarks.
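The pretext task described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names `crop_exemplar` and `cross_attention`, the use of raw pixels as tokens in place of encoder outputs, the random projection weights, and the binary mask used as the target correlation map are all assumptions made for illustration. Only axis-aligned rectangular crops are shown; the varied shapes, rotations, and transformations mentioned in the abstract are omitted.

```python
import numpy as np


def crop_exemplar(context, top, left, height, width):
    """Crop a rectangular exemplar and build a binary map marking where it came from.

    The binary map is an assumed stand-in for the target correlation map.
    """
    exemplar = context[top:top + height, left:left + width]
    target = np.zeros(context.shape[:2], dtype=np.float32)
    target[top:top + height, left:left + width] = 1.0
    return exemplar, target


def cross_attention(context_tokens, exemplar_tokens, w_q, w_k, w_v):
    """Single-head cross-attention: context gives queries, exemplar gives keys and values."""
    q = context_tokens @ w_q                      # (n_ctx, dim)
    k = exemplar_tokens @ w_k                     # (n_ex, dim)
    v = exemplar_tokens @ w_v                     # (n_ex, dim)
    scores = q @ k.T / np.sqrt(q.shape[-1])       # (n_ctx, n_ex)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)

# A toy 8x8 RGB "context" image and one exemplar cropped from it.
context = rng.random((8, 8, 3)).astype(np.float32)
exemplar, target = crop_exemplar(context, top=2, left=3, height=3, width=4)

# Raw pixels stand in for encoder outputs (assumption); one token per pixel.
dim = 16
ctx_tokens = context.reshape(-1, 3)
ex_tokens = exemplar.reshape(-1, 3)
w_q, w_k, w_v = (rng.standard_normal((3, dim)) for _ in range(3))
attended, weights = cross_attention(ctx_tokens, ex_tokens, w_q, w_k, w_v)

# A hypothetical linear head maps each attended context token to one logit,
# giving a predicted correlation map with the same spatial size as the context.
w_out = rng.standard_normal((dim, 1))
pred_map = (attended @ w_out).reshape(8, 8)
```

In a training setup along these lines, `pred_map` would be compared against `target` with a per-pixel loss, and the projections would be learned rather than random.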


research
02/07/2022

Corrupted Image Modeling for Self-Supervised Visual Pre-Training

We introduce Corrupted Image Modeling (CIM) for self-supervised visual p...
research
04/18/2020

Self-Supervised Representation Learning on Document Images

This work analyses the impact of self-supervised pre-training on documen...
research
04/25/2021

How Well Self-Supervised Pre-Training Performs with Streaming Data?

The common self-supervised pre-training practice requires collecting mas...
research
04/25/2023

LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization

We present a simple yet effective self-supervised pre-training method fo...
research
06/15/2022

Masked Frequency Modeling for Self-Supervised Visual Pre-Training

We present Masked Frequency Modeling (MFM), a unified frequency-domain-b...
research
05/08/2023

Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding

Masked signal modeling has greatly advanced self-supervised pre-training...
research
11/23/2022

Self-Supervised Learning based on Heat Equation

This paper presents a new perspective of self-supervised learning based ...
