PerCQA: Persian Community Question Answering Dataset

12/25/2021
by Naghme Jamali, et al.

Community Question Answering (CQA) forums provide answers to many real-life questions. Because of their large size, these forums are very popular among machine learning researchers. Automatic answer selection, answer ranking, question retrieval, expert finding, and fact-checking are examples of learning tasks performed using CQA data. In this paper, we present PerCQA, the first Persian dataset for CQA. This dataset contains questions and answers crawled from the most well-known Persian forum. After data acquisition, we develop rigorous annotation guidelines through an iterative process and then annotate the question-answer pairs in SemEvalCQA format. PerCQA contains 989 questions and 21,915 annotated answers. We make PerCQA publicly available to encourage more research in Persian CQA. We also build strong benchmarks for the task of answer selection in PerCQA by using mono- and multi-lingual pre-trained language models.

research
01/10/2018

MilkQA: a Dataset of Consumer Questions for the Task of Answer Selection

We introduce MilkQA, a question answering dataset from the dairy domain ...
research
06/27/2016

SelQA: A New Benchmark for Selection-based Question Answering

This paper presents a new selection-based question answering dataset, Se...
research
08/23/2019

Toward Dialogue Modeling: A Semantic Annotation Scheme for Questions and Answers

The present study proposes an annotation scheme for classifying the cont...
research
05/24/2021

VANiLLa : Verbalized Answers in Natural Language at Large Scale

In the last years, there have been significant developments in the area ...
research
02/05/2021

Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge

We present the ARC-DA dataset, a direct-answer ("open response", "freefo...
research
07/22/2019

ELI5: Long Form Question Answering

We introduce the first large-scale corpus for long-form question answeri...
research
05/01/2022

ELQA: A Corpus of Questions and Answers about the English Language

We introduce a community-sourced dataset for English Language Question A...
