Multilingual Question Answering from Formatted Text applied to Conversational Agents

10/10/2019
by Wissam Siblini, et al.

Recent advances in NLP with language models such as BERT, GPT-2, XLNet or XLM have made it possible to surpass human performance on Reading Comprehension tasks on large-scale datasets (e.g. SQuAD), which opens up many perspectives for Conversational AI. However, task-specific datasets are mostly in English, which makes it difficult to assess progress in other languages. Fortunately, state-of-the-art models are now being pre-trained on multiple languages (e.g. BERT was released in a multilingual version covering about a hundred languages) and exhibit zero-shot transfer ability from English to other languages on XNLI. In this paper, we run experiments showing that multilingual BERT, trained to solve the complex Question Answering task defined by the English SQuAD dataset, is able to perform the same task in Japanese and French. It even outperforms the best published results of a baseline that explicitly combines an English Reading Comprehension model with a Machine Translation model for transfer. We run further tests on crafted cross-lingual QA datasets (context in one language and question in another) to provide intuition on the mechanisms that allow BERT to transfer the task from one language to another. Finally, we introduce our application Kate, a conversational agent dedicated to HR support for employees, which exploits multilingual models to accurately answer questions, in several languages, directly from information web pages.

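To make the zero-shot cross-lingual setting concrete, below is a minimal sketch (not code from the paper) using the Hugging Face transformers question-answering pipeline. It assumes a multilingual BERT checkpoint fine-tuned on English SQuAD only; the checkpoint path and the expected answer span are placeholders for illustration.

# Minimal sketch of zero-shot cross-lingual extractive QA with multilingual BERT.
# Assumption: an mBERT checkpoint fine-tuned on English SQuAD v1.1 only
# (the model path below is a placeholder, not a published checkpoint name).
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="path/to/mbert-finetuned-english-squad",  # placeholder checkpoint
)

# Cross-lingual probe as in the paper: question in one language, context in another.
context = (
    "BERT multilingue est pré-entraîné sur une centaine de langues, "
    "puis affiné uniquement sur le jeu de données SQuAD en anglais."
)
question = "On which dataset is multilingual BERT fine-tuned?"

result = qa(question=question, context=context)
# The pipeline returns the most likely answer span extracted from the context,
# e.g. {'answer': 'SQuAD en anglais', 'score': ..., 'start': ..., 'end': ...}
print(result["answer"], round(result["score"], 3))

The same call with question and context both in French or Japanese corresponds to the monolingual transfer experiments reported above.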
