SQuAD: 100,000+ Questions for Machine Comprehension of Text

06/16/2016
by Pranav Rajpurkar, et al.

We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. We analyze the dataset to understand the types of reasoning required to answer the questions, leaning heavily on dependency and constituency trees. We build a strong logistic regression model, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%). However, human performance is much higher, indicating that the dataset presents a good challenge problem for future research. The dataset is freely available at https://stanford-qa.com
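
Because each answer is a span of passage text, a predicted span can be scored by its token overlap with a reference answer. The abstract does not say how the F1 score is computed, so the following is only a minimal sketch under stated assumptions: tokens are lowercased and split on whitespace, and the function name `token_f1` is hypothetical. The dataset's own evaluation may normalize text differently.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted span and a reference span.

    Assumption: lowercase + whitespace tokenization; not the official metric.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, scoring the prediction "the Eiffel Tower" against the reference "Eiffel Tower" gives precision 2/3 and recall 1, so the F1 is 0.8. A partial-credit measure like this rewards spans that nearly match the reference, which an exact-match check would score as zero.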

research
02/14/2020

FQuAD: French Question Answering Dataset

Recent advances in the field of language modeling have improved state-of...
research
11/29/2016

NewsQA: A Machine Comprehension Dataset

We present NewsQA, a challenging machine comprehension dataset of over 1...
research
05/02/2020

ForecastQA: Machine Comprehension of Temporal Text for Answering Forecasting Questions

Textual data are often accompanied by time information (e.g., dates in n...
research
09/09/2019

Question Generation by Transformers

A machine learning model was developed to automatically generate questio...
research
09/27/2021

FQuAD2.0: French Question Answering and knowing that you know nothing

Question Answering, including Reading Comprehension, is one of the NLP r...
research
08/21/2018

QuAC : Question Answering in Context

We present QuAC, a dataset for Question Answering in Context that contai...
research
06/19/2020

New Vietnamese Corpus for Machine Reading Comprehension of Health News Articles

Although over 95 million people in the world speak the Vietnamese langua...
