The Role of Pre-training Data in Transfer Learning

02/27/2023
by Rahim Entezari et al.

The transfer-learning paradigm of model pre-training followed by fine-tuning produces high-accuracy models. While most studies recommend scaling up the pre-training dataset to benefit most from transfer learning, an open question remains: which data and method should be used for pre-training? We investigate the impact of the pre-training data distribution on few-shot and full fine-tuning performance using three pre-training methods (supervised, contrastive language-image, and contrastive image-image), seven pre-training datasets, and nine downstream datasets. Through extensive controlled experiments, we find that the choice of pre-training data source is essential for few-shot transfer, but that its role diminishes as more data becomes available for fine-tuning. We also explore the role of data curation and examine the trade-offs between label noise and pre-training dataset size: using 2000× more pre-training data from LAION can match the performance of supervised ImageNet pre-training. Finally, we compare pre-training methods, contrasting language-image against image-image contrastive learning, and find that the latter leads to better downstream accuracy.
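The few-shot transfer protocol the abstract refers to is typically evaluated by freezing the pre-trained encoder, keeping only k labelled examples per class, and fitting a lightweight probe on the frozen features. The sketch below illustrates one common variant, a nearest-class-mean probe; the function name, the toy Gaussian "features", and the choice of probe are illustrative assumptions, not the paper's actual evaluation code.

```python
import numpy as np

rng = np.random.default_rng(0)

def k_shot_accuracy(train_feats, train_labels, test_feats, test_labels, k):
    """Nearest-class-mean probe: average k support features per class,
    then classify each test feature by its closest class prototype."""
    classes = np.unique(train_labels)
    protos = []
    for c in classes:
        cls_idx = np.flatnonzero(train_labels == c)
        support = rng.choice(cls_idx, size=k, replace=False)  # k-shot sample
        protos.append(train_feats[support].mean(axis=0))
    protos = np.stack(protos)                                  # (n_classes, d)
    # Euclidean distance from every test feature to every prototype.
    dists = np.linalg.norm(test_feats[:, None] - protos[None], axis=-1)
    preds = classes[dists.argmin(axis=1)]
    return float((preds == test_labels).mean())

# Toy stand-in for "frozen encoder features": two well-separated blobs.
n, d = 200, 16
labels = np.tile([0, 1], n // 2)
feats = rng.normal(size=(n, d)) + labels[:, None] * 3.0
acc = k_shot_accuracy(feats[:100], labels[:100], feats[100:], labels[100:], k=5)
```

With well-separated features the probe is near-perfect even at k=5; the paper's point is that how separable those frozen features are depends strongly on the pre-training data source when k is small.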


Related research

- FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization (05/16/2022)
  We present FactPEGASUS, an abstractive summarization model that addresse...
- Curricular Transfer Learning for Sentence Encoded Tasks (08/03/2023)
  Fine-tuning language models in a downstream task is the standard approac...
- Principled and Efficient Transfer Learning of Deep Models via Neural Collapse (12/23/2022)
  With the ever-growing model size and the limited availability of labeled...
- Examining the Effect of Pre-training on Time Series Classification (09/11/2023)
  Although the pre-training followed by fine-tuning paradigm is used exten...
- Domain Adaptive Transfer Learning with Specialist Models (11/16/2018)
  Transfer learning is a widely used method to build high performing compu...
- A Framework using Contrastive Learning for Classification with Noisy Labels (04/19/2021)
  We propose a framework using contrastive learning as a pre-training task...
- Scaling Laws for Transfer (02/02/2021)
  We study empirical scaling laws for transfer learning between distributi...
