Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models

12/21/2022
by Najoung Kim, et al.

Human linguistic capacity is often characterized by compositionality and the generalization it enables – human learners can produce and comprehend novel complex expressions by composing known parts. Several benchmarks exploit distributional control across training and test to gauge compositional generalization, where certain lexical items only occur in limited contexts during training. While recent work using these benchmarks suggests that pretrained models achieve impressive generalization performance, we argue that exposure to pretraining data may break the aforementioned distributional control. Using the COGS benchmark of Kim and Linzen (2020), we test two modified evaluation setups that control for this issue: (1) substituting context-controlled lexical items with novel character sequences, and (2) substituting them with special tokens represented by novel embeddings. We find that both of these setups lead to lower generalization performance in T5 (Raffel et al., 2020), suggesting that previously reported results have been overestimated due to uncontrolled lexical exposure during pretraining. The performance degradation is more extreme with novel embeddings, and the degradation increases with the amount of pretraining data, highlighting an interesting case of inverse scaling.
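To make the two evaluation setups concrete, here is a minimal sketch of how they could be implemented with the Hugging Face transformers library. This is not the authors' released code; the example sentence, the novel character string "wugshog", and the placeholder token "<new_noun_0>" are illustrative assumptions rather than items taken from the paper or from COGS.

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-base")
    model = T5ForConditionalGeneration.from_pretrained("t5-base")

    example = "A hedgehog ate the cake."  # illustrative sentence, not from COGS

    # Setup (1): swap the context-controlled lexical item for a novel character
    # sequence that is unlikely to have occurred in the pretraining corpus.
    novel_string_input = example.replace("hedgehog", "wugshog")

    # Setup (2): swap it for a special token backed by a newly initialized
    # embedding, so the model cannot rely on any pretrained representation.
    tokenizer.add_tokens(["<new_noun_0>"])
    model.resize_token_embeddings(len(tokenizer))  # appends a randomly initialized row
    novel_embedding_input = example.replace("hedgehog", "<new_noun_0>")

    batch = tokenizer(
        [novel_string_input, novel_embedding_input],
        return_tensors="pt",
        padding=True,
    )

In setup (2), the appended embedding row carries no pretrained information, so any generalization must come from how the token is used in the controlled training split rather than from lexical exposure during pretraining.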


Related research

Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust? (09/06/2022)
The evaluation of recent embedding-based evaluation metrics for text gen...

Compositional Generalization Requires Compositional Parsers (02/24/2022)
A rapidly growing body of research on compositional generalization inves...

Our Evaluation Metric Needs an Update to Encourage Generalization (07/14/2020)
Models that surpass human performance on several popular benchmarks disp...

Compositional generalization in semantic parsing with pretrained transformers (09/30/2021)
Large-scale pretraining instills large amounts of knowledge in deep neur...

Contextualized Embeddings in Named-Entity Recognition: An Empirical Study on Generalization (01/22/2020)
Contextualized embeddings use unsupervised language model pretraining to...

BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks (08/13/2019)
Biomedical Named Entity Recognition (NER) is a challenging problem in bi...

Pretraining on the Test Set Is All You Need (09/13/2023)
Inspired by recent work demonstrating the promise of smaller Transformer...
