Location-Aware Self-Supervised Transformers

12/05/2022
by Mathilde Caron, et al.

Pixel-level labels are particularly expensive to acquire. Hence, pretraining is a critical step in improving models on a task like semantic segmentation. However, prominent algorithms for pretraining neural networks use image-level objectives, e.g. image classification, image-text alignment à la CLIP, or self-supervised contrastive learning. These objectives do not model spatial information, which might be suboptimal when finetuning on downstream tasks that require spatial reasoning. In this work, we propose to pretrain networks for semantic segmentation by predicting the relative location of image parts. We formulate this task as a classification problem where each patch in a query view has to predict its position relative to another reference view. We control the difficulty of the task by masking a subset of the reference patch features visible to the query. Our experiments show that this location-aware (LOCA) self-supervised pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
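The abstract reduces location prediction to per-patch classification: each query-view patch outputs a distribution over cells of the reference view's patch grid, conditioned only on the reference features left visible. The following is a minimal PyTorch sketch of such an objective, reconstructed from the abstract alone; the names (LocaHead, loca_loss), the cross-attention conditioning, and all shapes are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a LOCA-style relative-location objective (assumed
# reconstruction from the abstract, not the paper's code). We assume a
# ViT-style encoder has already produced patch features for a query view
# and a reference view.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocaHead(nn.Module):
    """Classify each query patch into a cell of the reference patch grid."""

    def __init__(self, dim: int, num_positions: int, num_heads: int = 4):
        super().__init__()
        # Query patches gather context from the reference view; masking
        # reference patches (via key_padding_mask) controls task difficulty.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_positions)

    def forward(self, query_feats, ref_feats, ref_visible):
        # query_feats: (B, Nq, D); ref_feats: (B, Nr, D)
        # ref_visible: (B, Nr) bool, True where a reference patch is visible.
        # key_padding_mask is True at positions the attention should ignore.
        ctx, _ = self.cross_attn(
            query=query_feats, key=ref_feats, value=ref_feats,
            key_padding_mask=~ref_visible,
        )
        return self.classifier(ctx)  # (B, Nq, num_positions) logits


def loca_loss(logits, target_cells):
    # target_cells: (B, Nq) long, the true reference-grid cell of each
    # query patch (derivable from the crop geometry of the two views).
    return F.cross_entropy(logits.flatten(0, 1), target_cells.flatten())


# Illustrative usage with made-up sizes: a 7x7 query grid, a 14x14
# reference grid, and roughly half of the reference patches masked out.
B, Nq, Nr, D = 8, 49, 196, 384
head = LocaHead(dim=D, num_positions=Nr)
query_feats = torch.randn(B, Nq, D)
ref_feats = torch.randn(B, Nr, D)
ref_visible = torch.rand(B, Nr) > 0.5
target_cells = torch.randint(0, Nr, (B, Nq))
loss = loca_loss(head(query_feats, ref_feats, ref_visible), target_cells)
```

Casting location prediction as classification over a fixed grid keeps the loss a plain cross-entropy; hiding more reference patches gives the query less context and makes the prediction harder, which matches the difficulty-control knob the abstract describes.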


Related research

CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation (03/22/2022)
Recent advances in self-supervised contrastive learning yield good image...

From Patches to Objects: Exploiting Spatial Reasoning for Better Visual Representations (05/21/2023)
As the field of deep learning steadily transitions from the realm of aca...

3D Self-Supervised Methods for Medical Imaging (06/06/2020)
Self-supervised learning methods have witnessed a recent surge of intere...

Context-self contrastive pretraining for crop type semantic segmentation (04/09/2021)
In this paper we propose a fully-supervised pretraining scheme based on ...

Context Autoencoder for Self-Supervised Representation Learning (02/07/2022)
We present a novel masked image modeling (MIM) approach, context autoenc...

Selfie: Self-supervised Pretraining for Image Embedding (06/07/2019)
We introduce a pretraining technique called Selfie, which stands for SEL...

Adversarial Pretraining of Self-Supervised Deep Networks: Past, Present and Future (10/23/2022)
In this paper, we review adversarial pretraining of self-supervised deep...
