Self-Supervised Pretraining Improves Self-Supervised Pretraining

03/23/2021
by Colorado J. Reed, et al.

While self-supervised pretraining has proven beneficial for many computer vision tasks, it requires expensive and lengthy computation, large amounts of data, and is sensitive to data augmentation. Prior work demonstrates that models pretrained on datasets dissimilar to their target data, such as chest X-ray models trained on ImageNet, underperform models trained from scratch. Users who lack the resources to pretrain must use existing models with lower performance. This paper explores Hierarchical PreTraining (HPT), which decreases convergence time and improves accuracy by initializing the pretraining process with an existing pretrained model. Through experimentation on 16 diverse vision datasets, we show HPT converges up to 80x faster, improves accuracy across tasks, and is more robust to changes in the image augmentation policy or the amount of pretraining data. Taken together, HPT provides a simple framework for obtaining better pretrained representations with fewer computational resources.
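The abstract describes HPT only at a high level: start self-supervised pretraining from an existing pretrained model rather than from random weights, then continue pretraining on the target data. The sketch below illustrates that idea under stated assumptions; the ImageNet initialization, the projection-head sizes, the InfoNCE-style contrastive loss, and the `target_loader` are illustrative choices, not details taken from the paper.

# Minimal sketch (PyTorch) of the HPT idea: initialize the backbone with an
# existing pretrained model, then continue self-supervised pretraining on the
# target dataset. The SSL objective and heads below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

# 1) Start from an existing pretrained model instead of random weights.
backbone = torchvision.models.resnet50(weights="IMAGENET1K_V2")
backbone.fc = nn.Identity()  # keep only the feature extractor

# 2) Small projection head for the self-supervised objective (assumed sizes).
projector = nn.Sequential(nn.Linear(2048, 512), nn.ReLU(inplace=True), nn.Linear(512, 128))

def info_nce(z1, z2, temperature=0.1):
    """Contrastive loss between two augmented views of the same batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # pairwise similarities
    targets = torch.arange(z1.size(0))   # matching views are the positives
    return F.cross_entropy(logits, targets)

params = list(backbone.parameters()) + list(projector.parameters())
optimizer = torch.optim.SGD(params, lr=0.05, momentum=0.9, weight_decay=1e-4)

# 3) Continue self-supervised pretraining on the *target* dataset.
#    `target_loader` (assumed) yields two augmented views of each image.
# for view1, view2 in target_loader:
#     loss = info_nce(projector(backbone(view1)), projector(backbone(view2)))
#     optimizer.zero_grad(); loss.backward(); optimizer.step()

Because the backbone already encodes generic visual features, the continued pretraining stage needs far fewer iterations on the target data, which is the source of the convergence speedup the abstract reports.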


research
11/23/2022
Can we Adopt Self-supervised Pretraining for Chest X-Rays?
Chest radiograph (or Chest X-Ray, CXR) is a popular medical imaging moda...

research
02/02/2023
Energy-Inspired Self-Supervised Pretraining for Vision Models
Motivated by the fact that forward and backward passes of a deep network...

research
10/26/2021
TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining
We introduce a block-online variant of the temporal feature-wise linear ...

research
04/14/2023
DINOv2: Learning Robust Visual Features without Supervision
The recent breakthroughs in natural language processing for model pretra...

research
08/16/2023
Is Self-Supervised Pretraining Good for Extrapolation in Molecular Property Prediction?
The prediction of material properties plays a crucial role in the develo...

research
12/02/2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Pretraining has been shown to scale well with compute, data size and dat...

research
08/11/2022
MILAN: Masked Image Pretraining on Language Assisted Representation
Self-attention based transformer models have been dominating many comput...
