Insights into Pre-training via Simpler Synthetic Tasks

06/21/2022
by Yuhuai Wu, et al.

Pre-training produces representations that are effective for a wide range of downstream tasks, but it is still unclear what properties of pre-training are necessary for effective gains. Notably, recent work shows that even pre-training on synthetic tasks can achieve significant gains in downstream tasks. In this work, we perform three experiments that iteratively simplify pre-training and show that the simplifications still retain much of the gains of pre-training. First, building on prior work, we perform a systematic evaluation of three existing synthetic pre-training methods on six downstream tasks. We find that the best synthetic pre-training method, LIME, attains an average of 67% of the benefits of natural pre-training. Second, to our surprise, we find that pre-training on a simple and generic synthetic task defined by the Set function achieves 65% of the benefits, almost matching LIME. Third, we find that 39% of the benefits can be attained by using merely the parameter statistics of synthetic pre-training. We release the source code at https://github.com/felixzli/synthetic_pretraining.
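To make the "Set function" task concrete, below is a minimal sketch of how such synthetic data could be generated: the model sees a random token sequence and must output its unique tokens. The vocabulary size, sequence length, and first-occurrence ordering of the target are illustrative assumptions, not details taken from the paper; consult the released code for the exact task specification.

```python
# Minimal sketch of a Set-style synthetic pre-training task (assumptions noted below).
import random

VOCAB_SIZE = 1000      # assumed vocabulary size, for illustration only
MAX_INPUT_LEN = 30     # assumed maximum input length

def make_set_example(rng: random.Random) -> tuple[list[int], list[int]]:
    """Sample one (input, target) pair: target is the set of input tokens."""
    length = rng.randint(2, MAX_INPUT_LEN)
    inputs = [rng.randrange(VOCAB_SIZE) for _ in range(length)]
    # Target: unique input tokens in first-occurrence order (an assumption;
    # other orderings, e.g. sorted, would also define a valid "set" task).
    seen: set[int] = set()
    target: list[int] = []
    for tok in inputs:
        if tok not in seen:
            seen.add(tok)
            target.append(tok)
    return inputs, target

if __name__ == "__main__":
    rng = random.Random(0)
    src, tgt = make_set_example(rng)
    print("input :", src)
    print("target:", tgt)
```

A sequence-to-sequence model pre-trained on pairs like these would then be fine-tuned on the downstream tasks, which is how the reported percentages of natural pre-training's benefits are measured.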


