A Multilingual Modeling Method for Span-Extraction Reading Comprehension

05/31/2021
by Gaochen Wu, et al.

Span-extraction reading comprehension models have made tremendous advances, enabled by the availability of large-scale, high-quality training datasets. Despite this rapid progress and widespread application, extractive reading comprehension datasets in languages other than English remain scarce, and creating a sufficient amount of training data for each language is costly and often infeasible. An alternative to building large-scale, high-quality monolingual span-extraction training datasets is to develop multilingual modeling approaches and systems that can transfer to the target language without requiring training data in that language. In this paper, to address the scarcity of extractive reading comprehension training data in the target language, we propose a multilingual extractive reading comprehension approach called XLRC, which simultaneously models the existing extractive reading comprehension training data in a multilingual environment using self-adaptive attention and multilingual attention. Specifically, we first construct multilingual parallel corpora by translating existing extractive reading comprehension datasets (i.e., CMRC 2018) from the target language (i.e., Chinese) into languages from different families (i.e., English). Second, to enhance the final target representation, we adopt self-adaptive attention (SAA), which combines self-attention and inter-attention to extract the semantic relations from each pair of target and source languages. Furthermore, we propose multilingual attention (MLA) to learn rich knowledge from various language families. Experimental results show that our model outperforms the state-of-the-art baseline (i.e., RoBERTa_Large) on the CMRC 2018 task, which demonstrates the effectiveness of our multilingual modeling approach and shows its potential for multilingual NLP tasks.
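
To make the two attention components concrete, below is a minimal PyTorch sketch of how SAA and MLA could be wired together. The abstract does not give the paper's exact equations, so the gated fusion in SelfAdaptiveAttention, the language-pooling scorer in MultilingualAttention, and all module names, shapes, and hyperparameters here are illustrative assumptions, not the authors' implementation.

# Minimal sketch of SAA and MLA as described in the abstract.
# All design details below (gating, pooling, shapes) are assumptions.
from typing import List

import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAdaptiveAttention(nn.Module):
    """SAA sketch: combines self-attention over the target sequence with
    inter-attention from the target to one translated source sequence,
    mixing the two views with a learned gate (assumed fusion)."""

    def __init__(self, hidden: int, heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.inter_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.gate = nn.Linear(2 * hidden, hidden)

    def forward(self, target: torch.Tensor, source: torch.Tensor) -> torch.Tensor:
        # target: (batch, tgt_len, hidden); source: (batch, src_len, hidden)
        self_out, _ = self.self_attn(target, target, target)
        inter_out, _ = self.inter_attn(target, source, source)
        # The gate decides, per position and dimension, how much of the
        # monolingual vs. cross-lingual view to keep.
        g = torch.sigmoid(self.gate(torch.cat([self_out, inter_out], dim=-1)))
        return g * self_out + (1.0 - g) * inter_out


class MultilingualAttention(nn.Module):
    """MLA sketch: softly weights the SAA outputs obtained from several
    source languages and pools them into one enriched target representation."""

    def __init__(self, hidden: int):
        super().__init__()
        self.score = nn.Linear(hidden, 1)

    def forward(self, per_language: List[torch.Tensor]) -> torch.Tensor:
        # per_language: list of (batch, tgt_len, hidden), one per source language.
        stacked = torch.stack(per_language, dim=2)       # (B, T, L, H)
        weights = F.softmax(self.score(stacked), dim=2)  # (B, T, L, 1)
        return (weights * stacked).sum(dim=2)            # (B, T, H)


if __name__ == "__main__":
    hidden, batch, tgt_len, src_len = 64, 2, 10, 12
    saa = SelfAdaptiveAttention(hidden)
    mla = MultilingualAttention(hidden)
    target = torch.randn(batch, tgt_len, hidden)  # e.g., Chinese passage encoding
    sources = [torch.randn(batch, src_len, hidden) for _ in range(2)]  # translations
    fused = mla([saa(target, s) for s in sources])
    print(fused.shape)  # torch.Size([2, 10, 64])

In this sketch the gate lets each target position trade off the monolingual and cross-lingual views, while the MLA scorer softly weights the source languages; the paper may fuse these representations differently.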

Related research

07/11/2021
Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking
Extractive Reading Comprehension (ERC) has made tremendous advances enab...

04/29/2020
Bilingual Text Extraction as Reading Comprehension
In this paper, we propose a method to extract bilingual texts automatica...

08/30/2022
Large-scale Multi-granular Concept Extraction Based on Machine Reading Comprehension
The concepts in knowledge graphs (KGs) enable machines to understand nat...

09/10/2018
Multilingual Extractive Reading Comprehension by Runtime Machine Translation
Existing end-to-end neural network models for extractive Reading Compreh...

10/07/2020
Improving Context Modeling in Neural Topic Segmentation
Topic segmentation is critical in key NLP tasks and recent works favor h...

08/26/2021
Understanding Attention in Machine Reading Comprehension
Achieving human-level performance on some of Machine Reading Comprehensi...

05/15/2019
Learning Open Information Extraction of Implicit Relations from Reading Comprehension Datasets
The relationship between two entities in a sentence is often implied by ...
