Analyzing Compositionality-Sensitivity of NLI Models

11/16/2018
by   Yixin Nie, et al.

Success in natural language inference (NLI) should require a model to understand both lexical and compositional semantics. However, through adversarial evaluation, we find that several state-of-the-art models with diverse architectures over-rely on the former and fail to use the latter. Further, this lack of compositionality awareness is not reflected in standard evaluation on current datasets. We show that removing RNNs from existing models, or shuffling input words during training, does not induce a large performance loss despite the explicit removal of compositional information. Therefore, we propose a compositionality-sensitivity testing setup that analyzes models on natural examples from existing datasets that cannot be solved via lexical features alone (i.e., examples on which a bag-of-words model gives high probability to a wrong label), hence revealing the models' actual compositionality awareness. We show that this setup not only highlights the limited compositional ability of current NLI models, but also differentiates model performance based on design, e.g., separating shallow bag-of-words models from deeper, linguistically grounded tree-based models. Our evaluation setup is an important analysis tool: it complements existing adversarial and linguistically driven diagnostic evaluations, and exposes opportunities for future work on evaluating models' compositional understanding.
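The selection criterion described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact procedure: the function name, the probability threshold, and the toy predictions are all assumptions, and the abstract does not specify how the bag-of-words model's confidence cutoff is chosen.

```python
def is_compositionality_sensitive(bow_probs, gold_label, threshold=0.8):
    """Return True if a bag-of-words model assigns high probability to a
    *wrong* label, i.e., the example cannot be solved by lexical features
    alone and so qualifies for the compositionality-sensitivity test set.
    (Illustrative criterion; the threshold value is an assumption.)"""
    wrong_label_probs = [p for label, p in bow_probs.items()
                         if label != gold_label]
    return max(wrong_label_probs) >= threshold

# Toy label distributions from a hypothetical bag-of-words NLI model.
confidently_wrong = {"entailment": 0.05, "neutral": 0.10, "contradiction": 0.85}
lexically_solvable = {"entailment": 0.70, "neutral": 0.20, "contradiction": 0.10}

print(is_compositionality_sensitive(confidently_wrong, "entailment"))   # True
print(is_compositionality_sensitive(lexically_solvable, "entailment"))  # False
```

Filtering an existing dataset with a check like this yields evaluation examples where word identity alone misleads, so any model that still answers correctly must be using information beyond the bag of words.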


