A surprisingly simple technique to control the pretraining bias for better transfer: Expand or Narrow your representation

04/11/2023
by Florian Bordes, et al.

Self-Supervised Learning (SSL) models rely on a pretext task to learn representations. Because this pretext task differs from the downstream tasks used to evaluate the performance of these models, there is an inherent misalignment, or pretraining bias. A commonly used trick in SSL, shown to make deep networks more robust to such bias, is to add a small projector (usually a 2- or 3-layer multi-layer perceptron) on top of a backbone network during training. In contrast to previous work that studied the impact of the projector architecture, here we focus on a simpler, yet overlooked, lever for controlling the information in the backbone representation. We show that merely changing its dimensionality, by resizing only the backbone's very last block, is a remarkably effective technique to mitigate the pretraining bias. It significantly improves downstream transfer performance for both self-supervised and supervised pretrained models.
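Below is a minimal PyTorch sketch (not taken from the paper) of the setup the abstract describes: a backbone whose output representation is expanded or narrowed, with a small MLP projector attached only during pretraining. As a simplifying assumption, the sketch approximates the paper's resizing of the backbone's last block with a single linear layer that sets the representation width; the ResNet-50 backbone, layer widths, and module names are illustrative choices, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50


class BackboneWithProjector(nn.Module):
    """Illustrative pretraining model: backbone -> expanded/narrowed
    representation -> small MLP projector. The representation `r` is reused
    for downstream transfer; the projector output `z` only feeds the
    pretraining loss and is discarded afterwards."""

    def __init__(self, backbone_dim=2048, representation_dim=8192, projector_dim=256):
        super().__init__()
        resnet = resnet50()
        # Keep everything up to and including global average pooling.
        self.encoder = nn.Sequential(*list(resnet.children())[:-1])
        # Expand (or narrow) the representation. NOTE: this linear layer is an
        # assumption of the sketch; the paper changes the width of the
        # backbone's very last block directly.
        self.resize = nn.Linear(backbone_dim, representation_dim)
        # Small 3-layer MLP projector, as commonly used in SSL pretraining.
        self.projector = nn.Sequential(
            nn.Linear(representation_dim, 2048),
            nn.BatchNorm1d(2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, 2048),
            nn.BatchNorm1d(2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, projector_dim),
        )

    def forward(self, x):
        h = self.encoder(x).flatten(1)  # backbone features, shape (N, 2048)
        r = self.resize(h)              # representation used for transfer
        z = self.projector(r)           # embedding fed to the pretraining loss
        return r, z


# Usage sketch: pretrain with the loss applied to z, then keep only the
# encoder and resize layer and evaluate/fine-tune r on downstream tasks.
model = BackboneWithProjector(representation_dim=8192)  # "expand"
r, z = model(torch.randn(4, 3, 224, 224))
print(r.shape, z.shape)  # torch.Size([4, 8192]) torch.Size([4, 256])
```

Narrowing the representation is the same sketch with a smaller `representation_dim` (e.g. 512); which direction helps is the empirical question the paper studies.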

