PQuAD: A Persian Question Answering Dataset

02/13/2022
by Kasra Darvishi, et al.

We present the Persian Question Answering Dataset (PQuAD), a crowdsourced reading comprehension dataset built on Persian Wikipedia articles. It includes 80,000 questions along with their answers, with 25% of the questions being adversarially unanswerable. We examine various properties of the dataset to show its diversity and its level of difficulty as a machine reading comprehension (MRC) benchmark. By releasing this dataset, we aim to ease research on Persian reading comprehension and the development of Persian question answering systems. Our experiments on different state-of-the-art pre-trained contextualized language models show 74.8% Exact Match (EM), which can serve as a baseline result for further research on Persian QA.

