Pre-Training to Learn in Context

05/16/2023
by Yuxian Gu, et al.

In-context learning, where pre-trained language models learn to perform tasks from the task examples and instructions given in their contexts, has attracted much attention in the NLP community. However, this ability is not fully exploited because language models are not explicitly trained to learn in context. To this end, we propose PICL (Pre-training for In-Context Learning), a framework that enhances language models' in-context learning ability by pre-training on a large collection of "intrinsic tasks" found in a general plain-text corpus, using the simple language modeling objective. PICL encourages the model to infer and perform tasks by conditioning on their contexts while maintaining the task generalization of pre-trained models. We evaluate the in-context learning performance of the model trained with PICL on seven widely used text classification datasets and the Super-NaturalInstructions benchmark, which contains 100+ NLP tasks formulated as text generation. Our experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x as many parameters. The code is publicly available at https://github.com/thu-coai/PICL.
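The sketch below illustrates the core idea in a hedged, simplified form: instances of the same "intrinsic task" gathered from plain text are concatenated into one sequence and trained with the standard language modeling loss, so later instances are predicted conditioned on earlier ones as in-context demonstrations. This is not the authors' released code; the helper `build_picl_sequence`, the toy paragraph group, and the GPT-2 backbone are illustrative assumptions.

```python
# Minimal sketch of PICL-style pre-training (illustrative, not the authors' implementation).
# Assumption: instances of one "intrinsic task" have already been gathered from a
# plain-text corpus and grouped together, as the PICL framework describes.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Toy group of paragraphs assumed to share the same intrinsic task
# (sentiment-like statements); real PICL collects such groups automatically.
intrinsic_task_group = [
    "The movie was a complete waste of time. Terrible.",
    "I loved every minute of this film. Wonderful.",
    "The plot made no sense and the acting was flat. Bad.",
]

def build_picl_sequence(paragraphs, max_length=512):
    """Concatenate instances of one intrinsic task into a single training sequence."""
    text = tokenizer.eos_token.join(paragraphs)
    return tokenizer(text, truncation=True, max_length=max_length, return_tensors="pt")

batch = build_picl_sequence(intrinsic_task_group)

# Standard language modeling objective over the concatenated sequence: the loss on
# later instances implicitly trains the model to use earlier instances as
# in-context demonstrations.
outputs = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                labels=batch["input_ids"])
outputs.loss.backward()
optimizer.step()
```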


