CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation

11/01/2022
by   Abhilasha Ravichander, et al.

The full power of human language-based communication cannot be realized without negation. All human languages have some form of negation. Despite this, negation remains a challenging phenomenon for current natural language understanding systems. To facilitate the future development of models that can process negation effectively, we present CONDAQA, the first English reading comprehension dataset which requires reasoning about the implications of negated statements in paragraphs. We collect paragraphs with diverse negation cues, then have crowdworkers ask questions about the implications of the negated statement in the passage. We also have workers make three kinds of edits to the passage – paraphrasing the negated statement, changing the scope of the negation, and reversing the negation – resulting in clusters of question-answer pairs that are difficult for models to answer with spurious shortcuts. CONDAQA features 14,182 question-answer pairs with over 200 unique negation cues and is challenging for current state-of-the-art models. The best performing model on CONDAQA (UnifiedQA-v2-3b) achieves only 42% on our consistency metric, well below human performance, which is 81%. We release our dataset, along with fully-finetuned, few-shot, and zero-shot evaluations, to facilitate the development of future NLP methods that work on negated language.
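The consistency metric described above groups each original question with the variants produced by the three passage edits, and credits a model only when it answers every question in a cluster correctly. Below is a minimal sketch of such a cluster-level metric; the field names ('id', 'cluster_id', 'answer') and the exact-match comparison are illustrative assumptions, not CONDAQA's released format or official scorer.

    from collections import defaultdict

    def consistency(predictions, examples):
        """Fraction of question clusters answered entirely correctly.

        predictions: dict mapping example id -> predicted answer string
        examples: iterable of dicts with (hypothetical) fields
                  'id', 'cluster_id', and 'answer'
        """
        clusters = defaultdict(list)
        for ex in examples:
            # An example is correct if the prediction matches the gold answer
            # (a real scorer would typically normalize answers more carefully).
            correct = (predictions[ex["id"]].strip().lower()
                       == ex["answer"].strip().lower())
            clusters[ex["cluster_id"]].append(correct)

        # A cluster counts only if every variant (original passage, paraphrase,
        # scope edit, and negation reversal) is answered correctly.
        solved = sum(all(results) for results in clusters.values())
        return solved / len(clusters)

    # Example: two clusters; one fully correct, one with a single miss -> 0.5
    examples = [
        {"id": 1, "cluster_id": "a", "answer": "yes"},
        {"id": 2, "cluster_id": "a", "answer": "no"},
        {"id": 3, "cluster_id": "b", "answer": "yes"},
        {"id": 4, "cluster_id": "b", "answer": "yes"},
    ]
    predictions = {1: "yes", 2: "no", 3: "yes", 4: "no"}
    print(consistency(predictions, examples))  # 0.5

Because a model must get every perturbed variant right to receive credit, this metric is far stricter than per-question accuracy, which is why the 42% vs. 81% gap is informative.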


Related Research

04/02/2020
R3: A Reading Comprehension Benchmark Requiring Reasoning Processes
Existing question answering systems can only predict answers without exp...

05/09/2017
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
We present TriviaQA, a challenging reading comprehension dataset contain...

08/16/2019
Reasoning Over Paragraph Effects in Situations
A key component of successfully reading a passage of text is the ability...

03/12/2022
What Makes Reading Comprehension Questions Difficult?
For a natural language understanding benchmark to be useful in research,...

07/29/2021
Break, Perturb, Build: Automatic Perturbation of Reasoning Paths through Question Decomposition
Recent efforts to create challenge benchmarks that test the abilities of...

04/21/2018
DuoRC: Towards Complex Language Understanding with Paraphrased Reading Comprehension
We propose DuoRC, a novel dataset for Reading Comprehension (RC) that mo...

10/10/2019
RC-QED: Evaluating Natural Language Derivations in Multi-Hop Reading Comprehension
Recent studies revealed that reading comprehension (RC) systems learn to...
