ANLIzing the Adversarial Natural Language Inference Dataset

10/24/2020
by Adina Williams, et al.

We perform an in-depth error analysis of Adversarial NLI (ANLI), a recently introduced large-scale human-and-model-in-the-loop natural language inference dataset collected over multiple rounds. We propose a fine-grained annotation scheme for the different aspects of inference that are responsible for the gold classification labels, and use it to hand-code all three of the ANLI development sets. We use these annotations to answer a variety of interesting questions: which inference types are most common, which models have the highest performance on each reasoning type, and which types are the most challenging for state-of-the-art models? We hope that our annotations will enable more fine-grained evaluation of models trained on ANLI, provide us with a deeper understanding of where models fail and succeed, and help us determine how to train better models in the future.
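The per-type evaluation described above amounts to grouping dev-set examples by their annotated inference type and computing each model's accuracy within each group. A minimal sketch of that bookkeeping follows; the annotation tags, dictionary keys, and toy examples here are illustrative assumptions, not the paper's actual annotation scheme or data format:

```python
from collections import defaultdict

def accuracy_by_type(examples):
    """Compute per-inference-type accuracy from annotated predictions.

    Each example is a dict with hypothetical keys:
      'types'     - list of fine-grained annotation tags
                    (one example may carry several tags)
      'gold'      - gold NLI label ('entailment' / 'neutral' / 'contradiction')
      'predicted' - the model's predicted label
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        hit = ex["predicted"] == ex["gold"]
        for tag in ex["types"]:
            total[tag] += 1
            correct[tag] += hit  # bool counts as 0/1
    return {tag: correct[tag] / total[tag] for tag in total}

# Toy illustration with made-up examples and tags:
dev = [
    {"types": ["numerical"], "gold": "contradiction", "predicted": "contradiction"},
    {"types": ["numerical", "coreference"], "gold": "entailment", "predicted": "neutral"},
    {"types": ["coreference"], "gold": "neutral", "predicted": "neutral"},
]
print(accuracy_by_type(dev))  # -> {'numerical': 0.5, 'coreference': 0.5}
```

Because one example can carry multiple tags, it contributes to the denominator of every tag it bears, which is why per-type counts need not sum to the dev-set size.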
