Aligning the Pretraining and Finetuning Objectives of Language Models

02/05/2020
by   Nuo Wang Pierse, et al.
0

We demonstrate that explicitly aligning the pretraining objectives to the finetuning objectives in language model training significantly improves the finetuning task performance and reduces the minimum amount of finetuning examples required. The performance margin gained from objective alignment allows us to build language models with smaller sizes for tasks with less available training data. We provide empirical evidence of these claims by applying objective alignment to concept-of-interest tagging and acronym detection tasks. We found that, with objective alignment, our 768 by 3 and 512 by 3 transformer language models can reach accuracy of 83.9 concept-of-interest tagging and 73.8 200 finetuning examples per task, outperforming the 768 by 3 model pretrained without objective alignment by +4.8 small language models in the presence of hundreds of training examples or less "Few Example learning". In practice, Few Example Learning enabled by objective alignment not only saves human labeling costs, but also makes it possible to leverage language models in more real-time applications.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/13/2022

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals

We present an efficient method of pretraining large-scale autoencoding l...
research
02/16/2023

Pretraining Language Models with Human Preferences

Language models (LMs) are pretrained to imitate internet text, including...
research
05/12/2021

Improving Code Autocompletion with Transfer Learning

Software language models have achieved promising results predicting code...
research
06/15/2020

To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks

Pretraining NLP models with variants of Masked Language Model (MLM) obje...
research
11/03/2021

An Explanation of In-context Learning as Implicit Bayesian Inference

Large pretrained language models such as GPT-3 have the surprising abili...
research
10/06/2020

Does the Objective Matter? Comparing Training Objectives for Pronoun Resolution

Hard cases of pronoun resolution have been used as a long-standing bench...
research
10/09/2021

The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design

Pretraining Neural Language Models (NLMs) over a large corpus involves c...

Please sign up or login with your details

Forgot password? Click here to reset