BERT Based Multilingual Machine Comprehension in English and Hindi

06/02/2020
by Somil Gupta, et al.

Multilingual Machine Comprehension (MMC) is a Question-Answering (QA) sub-task that involves quoting the answer to a question from a given snippet, where the question and the snippet can be in different languages. The recently released multilingual variant of BERT (m-BERT), pre-trained on 104 languages, has performed well in both zero-shot and fine-tuned settings for multilingual tasks; however, it has not yet been used for English-Hindi MMC. We therefore present our experiments with m-BERT for MMC in zero-shot, mono-lingual (e.g. Hindi Question-Hindi Snippet) and cross-lingual (e.g. English Question-Hindi Snippet) fine-tuning setups. These model variants are evaluated on all possible multilingual settings, and the results are compared against the current state-of-the-art sequential QA system for these languages. Experiments show that m-BERT, with fine-tuning, improves performance on all evaluation settings across both datasets used by the prior model, thereby establishing m-BERT based MMC as the new state-of-the-art for English and Hindi. We also publish our results on an extended version of the recently released XQuAD dataset, which we propose as the evaluation benchmark for future research.
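
To make the task setup concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common BERT-style formulation of extractive QA, in which a model assigns each snippet token a start score and an end score and the answer is the highest-scoring span quoted from the snippet. The function name `best_span`, the `max_len` cap, the language codes and the toy scores are all hypothetical and chosen only for illustration. The sketch also enumerates the question-language and snippet-language pairs that "all possible multilingual settings" implies for English and Hindi.

```python
from itertools import product


def best_span(start_scores, end_scores, max_len=30):
    """Return (start, end) token indices maximizing start + end score.

    Only spans with start <= end and at most max_len tokens are considered.
    Returns None if the score lists are empty.
    """
    best, best_score = None, float("-inf")
    for s, s_score in enumerate(start_scores):
        # The end index may not precede the start index.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


# Evaluation settings as (question_language, snippet_language) pairs.
# "en" and "hi" are illustrative codes for English and Hindi.
LANGS = ("en", "hi")
SETTINGS = list(product(LANGS, repeat=2))

# Toy cross-lingual example: an English question over a Hindi snippet.
# The scores are made up; a real system would obtain them from a model.
question = "What is the capital of India?"
tokens = ["दिल्ली", "भारत", "की", "राजधानी", "है"]
start = [2.0, 0.1, -1.0, 0.3, -2.0]
end = [1.5, 0.2, -0.5, 0.1, -1.0]

s, e = best_span(start, end)
answer = " ".join(tokens[s:e + 1])  # the answer is quoted from the snippet
```

Under these assumptions, `SETTINGS` holds two mono-lingual pairs and two cross-lingual pairs. The `start <= end` constraint is what keeps the selected answer a contiguous quote from the snippet rather than an arbitrary pairing of high-scoring positions.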

research
10/10/2019

Multilingual Question Answering from Formatted Text applied to Conversational Agents

Recent advances in NLP with language models such as BERT, GPT-2, XLNet o...
research
08/13/2020

ANDES at SemEval-2020 Task 12: A jointly-trained BERT multilingual model for offensive language detection

This paper describes our participation in SemEval-2020 Task 12: Multilin...
research
12/11/2019

Automatic Spanish Translation of the SQuAD Dataset for Multilingual Question Answering

Recently, multilingual question answering became a crucial research topi...
research
04/15/2021

Are Multilingual BERT models robust? A Case Study on Adversarial Attacks for Multilingual Question Answering

Recent approaches have exploited weaknesses in monolingual question answ...
research
10/23/2020

Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering

Coupled with the availability of large scale datasets, deep learning arc...
research
08/07/2021

Multilingual Compositional Wikidata Questions

Semantic parsing allows humans to leverage vast knowledge resources thro...
research
02/15/2022

Delving Deeper into Cross-lingual Visual Question Answering

Visual question answering (VQA) is one of the crucial vision-and-languag...
