Who did What: A Large-Scale Person-Centered Cloze Dataset

08/19/2016
by Takeshi Onishi, et al.

We have constructed a new "Who-did-What" (WDW) dataset of over 200,000 fill-in-the-gap (cloze) multiple-choice reading comprehension problems built from the LDC English Gigaword newswire corpus. The WDW dataset has a variety of novel features. First, in contrast with the CNN and Daily Mail datasets (Hermann et al., 2015), we avoid using article summaries for question formation. Instead, each problem is formed from two independent articles: an article given as the passage to be read and a separate article on the same events used to form the question. Second, we avoid anonymization: each choice is a person named entity. Third, the problems have been filtered to remove a fraction that are easily solved by simple baselines, while remaining 84% solvable by humans. We report performance benchmarks of standard systems and propose the WDW dataset as a challenge task for the community.
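The problem format described above can be illustrated with a minimal sketch. It is not the authors' construction pipeline. The `ClozeProblem` class, the `make_cloze` helper, the `XXX` blank token, and the rule that choices must appear in the passage are all assumptions made for illustration. Person names are taken as given (named entity recognition is out of scope), and the baseline-filtering step is not modeled.

```python
from dataclasses import dataclass
from typing import List

BLANK = "XXX"  # hypothetical placeholder token, not taken from the paper


@dataclass
class ClozeProblem:
    passage: str        # article given to the reader
    question: str       # sentence from a separate article, with one name blanked
    choices: List[str]  # candidate person names (not anonymized)
    answer: str         # the name that was removed from the question


def make_cloze(passage: str, question_sentence: str,
               persons: List[str], answer: str) -> ClozeProblem:
    """Blank the answer name in the question sentence and keep as choices
    only the person names that occur in the passage (an assumed rule)."""
    if answer not in question_sentence:
        raise ValueError("answer must occur in the question sentence")
    choices = [p for p in persons if p in passage]
    if answer not in choices:
        raise ValueError("answer must be mentioned in the passage")
    # Replace only the first occurrence of the answer name with the blank.
    question = question_sentence.replace(answer, BLANK, 1)
    return ClozeProblem(passage, question, choices, answer)


# Invented example text; the two "articles" describe the same event.
example = make_cloze(
    passage="Alice Smith met Bob Jones on Monday. Carol White declined to comment.",
    question_sentence="Bob Jones was greeted by Alice Smith at the summit.",
    persons=["Alice Smith", "Bob Jones", "Carol White", "Dan Brown"],
    answer="Bob Jones",
)
print(example.question)
print(example.choices)
```

Because the question comes from a different article than the passage, the blanked sentence rarely matches a passage sentence word for word, which is the point of avoiding summary-based question formation.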
