NewsQA: A Machine Comprehension Dataset

11/29/2016
by Adam Trischler, et al.

We present NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs. Crowdworkers supply questions and answers based on a set of over 10,000 news articles from CNN, with answers consisting of spans of text from the corresponding articles. We collect this dataset through a four-stage process designed to solicit exploratory questions that require reasoning. A thorough analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment. We measure human performance on the dataset and compare it to several strong neural models. The performance gap between humans and machines (0.198 in F1) indicates that significant progress can be made on NewsQA through future research. The dataset is freely available at https://datasets.maluuba.com/NewsQA.
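Because answers are spans of article text and the human-machine gap is reported in F1, a small illustration of a token-overlap F1 score may help. The sketch below is an assumption about how such a score is commonly computed for span-based question answering, not the authors' evaluation script: the function name `token_f1`, the lowercasing, and the whitespace tokenization are all hypothetical choices made for illustration.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer span and a reference span.

    Hypothetical helper: lowercasing and whitespace tokenization are
    simplifying assumptions, not the normalization used by the paper.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty spans count as a match; exactly one empty span is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection: each shared token counts at most as often as it
    # appears in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this sketch, a predicted span that contains the reference plus one extra token is penalized on precision but not on recall, so it scores below an exact match without dropping to zero.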

research
06/16/2016

SQuAD: 100,000+ Questions for Machine Comprehension of Text

We present the Stanford Question Answering Dataset (SQuAD), a new readin...
research
05/11/2016

Machine Comprehension Based on Learning to Rank

Machine comprehension plays an essential role in NLP and has been widely...
research
06/29/2017

Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension

We develop a technique for transfer learning in machine comprehension (M...
research
09/04/2018

RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes

Understanding and reasoning about cooking recipes is a fruitful research...
research
06/19/2020

New Vietnamese Corpus for Machine Reading Comprehension of Health News Articles

Although over 95 million people in the world speak the Vietnamese langua...
research
03/30/2018

The Training of Neuromodels for Machine Comprehension of Text. Brain2Text Algorithm

Nowadays, the Internet represents a vast informational space, growing ex...
research
05/16/2016

Joint Learning of Sentence Embeddings for Relevance and Entailment

We consider the problem of Recognizing Textual Entailment within an Info...
