Stress Test Evaluation for Natural Language Inference

06/02/2018
by   Aakanksha Naik, et al.

Natural language inference (NLI) is the task of determining whether a natural language hypothesis can be inferred from a given premise in a justifiable manner. NLI was proposed as a benchmark task for natural language understanding. Existing models perform well on standard NLI datasets, achieving impressive results across different genres of text. However, the extent to which these models understand the semantic content of sentences is unclear. In this work, we propose an evaluation methodology consisting of automatically constructed "stress tests" that allow us to examine whether systems have the ability to make real inferential decisions. Our evaluation of six sentence-encoder models on these stress tests reveals strengths and weaknesses of these models with respect to challenging linguistic phenomena, and suggests important directions for future work in this area.

research
03/06/2018

Annotation Artifacts in Natural Language Inference Data

Large-scale datasets for natural language inference are created by prese...
research
10/20/2020

ConjNLI: Natural Language Inference Over Conjunctive Sentences

Reasoning about conjuncts in conjunctive sentences is important for a de...
research
05/29/2020

Beyond Leaderboards: A survey of methods for revealing weaknesses in Natural Language Inference data and models

Recent years have seen a growing number of publications that analyse Nat...
research
05/11/2018

Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness

Natural Language Inference is a challenging task that has received subst...
research
01/14/2021

SICK-NL: A Dataset for Dutch Natural Language Inference

We present SICK-NL (read: signal), a dataset targeting Natural Language ...
research
10/30/2018

Stress-Testing Neural Models of Natural Language Inference with Multiply-Quantified Sentences

Standard evaluations of deep learning models for semantics using natural...
research
02/09/2021

Statistically Profiling Biases in Natural Language Reasoning Datasets and Models

Recent work has indicated that many natural language understanding and r...
