COGS: A Compositional Generalization Challenge Based on Semantic Interpretation

10/12/2020
by Najoung Kim, et al.

Natural language is characterized by compositionality: the meaning of a complex expression is constructed from the meanings of its constituent parts. To facilitate the evaluation of the compositional abilities of language processing architectures, we introduce COGS, a semantic parsing dataset based on a fragment of English. The evaluation portion of COGS contains multiple systematic gaps that can only be addressed by compositional generalization; these include new combinations of familiar syntactic structures, or new combinations of familiar words and familiar structures. In experiments with Transformers and LSTMs, we found that in-distribution accuracy on the COGS test set was near-perfect (96–99%), but generalization accuracy was substantially lower (16–35%) and showed high sensitivity to random seed. These findings indicate that contemporary standard NLP models are limited in their compositional generalization capacity, and position COGS as a good way to measure progress.
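The contrast between in-distribution and generalization accuracy can be illustrated with a minimal sketch. The abstract does not say how accuracy is scored, so exact string match between predicted and gold output is an assumption here. The function name `exact_match_accuracy` and the toy prediction pairs are hypothetical placeholders, not COGS data or reported results.

```python
from typing import Iterable, Tuple


def exact_match_accuracy(pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (predicted, gold) pairs that match exactly.

    Assumption: whitespace is normalized before comparison; the source
    does not specify the scoring procedure.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(" ".join(p.split()) == " ".join(g.split()) for p, g in pairs)
    return hits / len(pairs)


# Hypothetical toy (predicted, gold) pairs; placeholders, not COGS examples.
in_distribution_pairs = [("a", "a"), ("b", "b"), ("c", "c"), ("d", "x")]
generalization_pairs = [("a", "a"), ("b", "y"), ("c", "z"), ("d", "w")]

# Score each evaluation split separately so the gap between them is visible.
scores = {
    "in_distribution": exact_match_accuracy(in_distribution_pairs),
    "generalization": exact_match_accuracy(generalization_pairs),
}
print(scores)
```

Reporting the two splits separately, rather than one pooled number, is what makes a generalization gap of the kind described above visible.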

