Sentence Extraction-Based Machine Reading Comprehension for Vietnamese

05/19/2021
by   Phong Nguyen-Thuan Do, et al.

Vietnamese language processing in general, and machine reading comprehension (MRC) in particular, has attracted considerable attention from the research community. In recent years, a few large datasets for Vietnamese machine reading comprehension have appeared, such as UIT-ViQuAD and UIT-ViNewsQA. However, these datasets offer little diversity in answer types, which limits the research they can support. In this paper, we introduce UIT-ViWikiQA, the first dataset for evaluating sentence extraction-based machine reading comprehension in Vietnamese. UIT-ViWikiQA is converted from UIT-ViQuAD and comprises 23,074 question-answer pairs based on 5,109 passages from 174 Vietnamese Wikipedia articles. We propose a conversion algorithm to create the dataset, along with three types of approaches to sentence extraction-based machine reading comprehension for Vietnamese. Our experiments show that the best model is XLM-R_Large, which achieves an exact match (EM) score of 85.97%. We also analyze the experimental results in terms of Vietnamese question types and the effect of context on the performance of the MRC models, thereby showing the challenges that UIT-ViWikiQA poses to the natural language processing community.
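
The abstract does not spell out the conversion algorithm, so the following is only a minimal sketch of the general idea: mapping a span-level answer (as in SQuAD-style data) to the sentence that contains it. The record layout, the function names `sentence_spans`, `span_to_sentence`, and `exact_match`, and the naive regex sentence splitter are all assumptions made for illustration, not the authors' method. A real pipeline would likely need a Vietnamese-aware sentence segmenter, since this splitter breaks on abbreviations such as "TP. Hồ Chí Minh".

```python
import re

# Assumption: a sentence ends at ., ! or ? followed by whitespace or end of text.
_SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")


def sentence_spans(context):
    """Return (start, end) character offsets for each sentence in context."""
    spans, start = [], 0
    for match in _SENTENCE_END.finditer(context):
        spans.append((start, match.end()))
        start = match.end()
    if start < len(context):
        # Trailing text without closing punctuation counts as a sentence.
        spans.append((start, len(context)))
    return spans


def span_to_sentence(context, answer_start, answer_text):
    """Replace a span answer with the sentence(s) that overlap it.

    Hypothetical helper: returns the new answer text and its start offset.
    """
    answer_end = answer_start + len(answer_text)
    hits = [
        (start, end)
        for start, end in sentence_spans(context)
        if start < answer_end and answer_start < end
    ]
    if not hits:
        raise ValueError("answer span lies outside the context")
    first, last = hits[0][0], hits[-1][1]
    return {"text": context[first:last].rstrip(), "answer_start": first}


def exact_match(prediction, gold):
    """Simplified EM: equality after lowercasing and collapsing whitespace."""
    def normalize(text):
        return " ".join(text.lower().split())
    return normalize(prediction) == normalize(gold)


# Usage on a made-up passage (not taken from the dataset).
context = "Hà Nội là thủ đô của Việt Nam. Thành phố nằm bên sông Hồng."
record = span_to_sentence(context, context.index("sông Hồng"), "sông Hồng")
```

An answer that crosses a sentence boundary is mapped to all overlapping sentences here; whether the dataset handles such cases the same way is not stated in the abstract. The `exact_match` helper is also simplified: the official SQuAD evaluation script additionally strips punctuation and English articles before comparing.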

