Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension

by   injuan Wu, et al.

Multilingual pre-trained models are able to zero-shot transfer knowledge from rich-resource to low-resource languages in machine reading comprehension (MRC). However, inherent linguistic discrepancies in different languages could make answer spans predicted by zero-shot transfer violate syntactic constraints of the target language. In this paper, we propose a novel multilingual MRC framework equipped with a Siamese Semantic Disentanglement Model (SSDM) to disassociate semantics from syntax in representations learned by multilingual pre-trained models. To explicitly transfer only semantic knowledge to the target language, we propose two groups of losses tailored for semantic and syntactic encoding and disentanglement. Experimental results on three multilingual MRC datasets (i.e., XQuAD, MLQA, and TyDi QA) demonstrate the effectiveness of our proposed approach over models based on mBERT and XLM-100. Code is available at:https://github.com/wulinjuan/SSDM_MRC.


page 1

page 2

page 3

page 4


Bilingual Alignment Pre-training for Zero-shot Cross-lingual Transfer

Multilingual pre-trained models have achieved remarkable transfer perfor...

Semantic Drift in Multilingual Representations

Multilingual representations have mostly been evaluated based on their p...

Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph

We target the task of cross-lingual Machine Reading Comprehension (MRC) ...

Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

Multilingual pre-trained models could leverage the training data from a ...

Multilingual Synthetic Question and Answer Generation for Cross-Lingual Reading Comprehension

We propose a simple method to generate large amounts of multilingual que...

ByT5 model for massively multilingual grapheme-to-phoneme conversion

In this study, we tackle massively multilingual grapheme-to-phoneme conv...

An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers

Most studies on word-level Quality Estimation (QE) of machine translatio...