A Span-Extraction Dataset for Chinese Machine Reading Comprehension

10/17/2018
by Yiming Cui, et al.

Machine Reading Comprehension (MRC) has become enormously popular recently and has attracted a lot of attention. However, most existing reading comprehension datasets are in English. In this paper, we introduce a span-extraction dataset for Chinese machine reading comprehension to add language diversity to this area. The dataset consists of nearly 20,000 real questions annotated by humans on Wikipedia paragraphs. We also annotated a challenge set containing questions that require comprehensive understanding and multi-sentence inference across the context. With the release of the dataset, we hosted the Second Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC 2018). We hope the release of the dataset will further accelerate machine reading comprehension research in Chinese. The data is available at: https://github.com/ymcui/cmrc2018
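To make "span extraction" concrete: in this kind of task the answer is a contiguous substring of the passage, so it can be stored as a start offset plus the answer text. The sketch below is illustrative only. The class name, field names, and the toy passage are assumptions made for this example and are not taken from the released data, whose actual file format should be checked in the repository.

```python
from dataclasses import dataclass


@dataclass
class SpanExample:
    """Hypothetical record layout for one span-extraction question."""
    context: str       # passage the answer must be copied from
    question: str      # question about the passage
    answer_start: int  # character offset of the answer in the context
    answer_text: str   # answer string, expected to be a span of the context


def is_valid_span(example: SpanExample) -> bool:
    """Check that the answer text really occurs at the stated offset."""
    end = example.answer_start + len(example.answer_text)
    return example.context[example.answer_start:end] == example.answer_text


# Toy example invented for illustration (not a record from the dataset).
example = SpanExample(
    context="北京是中华人民共和国的首都。",
    question="中华人民共和国的首都是哪里？",
    answer_start=0,
    answer_text="北京",
)

print(is_valid_span(example))
```

Offsets here are counted in characters, which is a natural choice for Chinese text because it is not whitespace-tokenized. This is a design assumption of the sketch rather than a statement about how the dataset stores answers.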

