How Can We Accelerate Progress Towards Human-like Linguistic Generalization?

05/03/2020 · by Tal Linzen

This position paper describes and critiques the Pretraining-Agnostic Identically Distributed (PAID) evaluation paradigm, which has become a central tool for measuring progress in natural language understanding. This paradigm consists of three stages: (1) pre-training of a word prediction model on a corpus of arbitrary size; (2) fine-tuning (transfer learning) on a training set representing a classification task; (3) evaluation on a test set drawn from the same distribution as that training set. This paradigm favors simple, low-bias architectures, which, first, can be scaled to process vast amounts of data, and second, can capture the fine-grained statistical properties of a particular data set, regardless of whether those properties are likely to generalize to examples of the task outside the data set. This contrasts with humans, who learn language from several orders of magnitude less data than the systems favored by this evaluation paradigm, and generalize to new tasks in a consistent way. We advocate for supplementing or replacing PAID with paradigms that reward architectures that generalize as quickly and robustly as humans.
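As a concrete illustration, the three PAID stages might look like the following sketch, assuming the HuggingFace `transformers` and `datasets` libraries and an SST-2-style sentiment classification task; the checkpoint, dataset, and hyperparameter choices here are illustrative assumptions, not prescribed by the paper.

```python
# Sketch of the PAID evaluation pipeline (assumed libraries: transformers, datasets).
from datasets import load_dataset
from transformers import (AutoTokenizer,
                          AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Stage 1: start from a word-prediction model pretrained on an arbitrarily
# large corpus (here, an off-the-shelf BERT checkpoint).
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Stage 2: fine-tune on the training split of a classification task (SST-2).
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3),
    train_dataset=dataset["train"],
    # Stage 3: evaluate on a split drawn from the same distribution
    # as the training data.
    eval_dataset=dataset["validation"],
)

trainer.train()
print(trainer.evaluate())  # metrics on the in-distribution evaluation split
```

Note that nothing in this pipeline penalizes the model for exploiting dataset-specific statistical quirks, which is exactly the property the paper critiques.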


Related research

08/21/2019
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
Crowdsourcing has been the prevalent paradigm for creating natural langu...

01/31/2019
Learning and Evaluating General Linguistic Intelligence
We define general linguistic intelligence as the ability to reuse previo...

12/27/2020
MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining
One of the biggest challenges that prohibit the use of many current NLP ...

04/10/2020
Training few-shot classification via the perspective of minibatch and pretraining
Few-shot classification is a challenging task which aims to formulate th...

05/03/2018
The Fine Line between Linguistic Generalization and Failure in Seq2Seq-Attention Models
Seq2Seq based neural architectures have become the go-to architecture to...

10/11/2020
Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)
One reason pretraining on self-supervised linguistic tasks is effective ...

03/21/2020
Prob2Vec: Mathematical Semantic Embedding for Problem Retrieval in Adaptive Tutoring
We propose a new application of embedding techniques for problem retriev...
