Investigating Post-pretraining Representation Alignment for Cross-Lingual Question Answering

09/24/2021
by Fahim Faisal, et al.

Human knowledge is collectively encoded in the roughly 6500 languages spoken around the world, but it is not distributed equally across languages. Hence, for information-seeking question answering (QA) systems to adequately serve speakers of all languages, they need to operate cross-lingually. In this work, we investigate the capabilities of multilingually pre-trained language models on cross-lingual QA. We find that explicitly aligning the representations across languages with a post-hoc fine-tuning step generally leads to improved performance. We additionally investigate the effect of data size and of language choice in this fine-tuning step, and we release a dataset for evaluating cross-lingual QA systems. Code and dataset are publicly available at https://github.com/ffaisal93/aligned_qa
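
The abstract does not state which alignment objective is used, so the following is only a minimal sketch of what "explicitly aligning the representations across languages" could look like, assuming paired embeddings of translation-equivalent units and a squared L2 distance objective. The names `alignment_loss` and `alignment_step`, the toy vectors, and the learning rate are all hypothetical and are not taken from the paper or its repository.

```python
# Illustrative sketch only: the abstract does not specify the alignment
# objective. Here we ASSUME a mean squared L2 distance between paired
# embeddings (e.g. a sentence and its translation).
from typing import List, Sequence

Vector = Sequence[float]


def alignment_loss(src: List[Vector], tgt: List[Vector]) -> float:
    """Mean squared L2 distance between paired source/target embeddings."""
    if not src or len(src) != len(tgt):
        raise ValueError("src and tgt must be non-empty and the same length")
    total = 0.0
    for s, t in zip(src, tgt):
        if len(s) != len(t):
            raise ValueError("paired embeddings must share a dimension")
        total += sum((a - b) ** 2 for a, b in zip(s, t))
    return total / len(src)


def alignment_step(src: List[Vector], tgt: List[Vector],
                   lr: float = 0.1) -> List[List[float]]:
    """One gradient-descent step on alignment_loss w.r.t. src (tgt fixed).

    The gradient for each source coordinate a paired with b is
    2 * (a - b) / n, where n is the number of pairs.
    """
    n = len(src)
    return [[a - lr * 2.0 * (a - b) / n for a, b in zip(s, t)]
            for s, t in zip(src, tgt)]


# Toy example: hypothetical 2-d embeddings for two translation pairs.
english = [[1.0, 0.0], [0.0, 1.0]]
other = [[0.0, 0.0], [0.0, 0.0]]

before = alignment_loss(other, english)
other_updated = alignment_step(other, english, lr=0.1)
after = alignment_loss(other_updated, english)
print(before, after)
```

In a real system the vectors would come from a multilingual encoder and the update would be applied to the encoder's parameters rather than to the embeddings directly; the sketch only shows that such a step reduces the distance between paired representations.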

research
04/12/2022

MuCoT: Multilingual Contrastive Training for Question-Answering in Low-resource Languages

Accuracy of English-language Question Answering (QA) systems has improve...
research
11/05/2020

EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering

We propose EXAMS – a new benchmark dataset for cross-lingual and multili...
research
05/28/2021

Towards More Equitable Question Answering Systems: How Much More Data Do You Need?

Question answering (QA) in English has been widely explored, but multili...
research
09/24/2021

SD-QA: Spoken Dialectal Question Answering for the Real World

Question answering (QA) systems are now available through numerous comme...
research
05/23/2023

Evaluating and Modeling Attribution for Cross-Lingual Question Answering

Trustworthy answer content is abundant in many high-resource languages a...
research
07/31/2023

Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering

Retriever-augmented instruction-following models are attractive alternat...
research
06/11/2019

HEAD-QA: A Healthcare Dataset for Complex Reasoning

We present HEAD-QA, a multi-choice question answering testbed to encoura...
