MIA 2022 Shared Task Submission: Leveraging Entity Representations, Dense-Sparse Hybrids, and Fusion-in-Decoder for Cross-Lingual Question Answering

07/05/2022
by Zhucheng Tu, et al.

We describe our two-stage system for the Multilingual Information Access (MIA) 2022 Shared Task on Cross-Lingual Open-Retrieval Question Answering. The first stage consists of multilingual passage retrieval with a hybrid dense and sparse retrieval strategy. The second stage consists of a reader which outputs the answer from the top passages returned by the first stage. We show the efficacy of using a multilingual language model with entity representations in pretraining, sparse retrieval signals to help dense retrieval, and Fusion-in-Decoder. On the development set, we obtain 43.46 F1 on XOR-TyDi QA and 21.99 F1 on MKQA, for an average F1 score of 32.73. On the test set, we obtain 40.93 F1 on XOR-TyDi QA and 22.29 F1 on MKQA, for an average F1 score of 31.61. We improve over the official baseline by over 4 F1 points on both the development and test sets.
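One way to picture the first stage is as a fusion of two ranked lists, one from a dense retriever and one from a sparse retriever. The abstract does not say how the two signals are combined, so the sketch below is only an illustration under stated assumptions: it uses min-max normalization followed by linear interpolation, and the names `min_max_normalize`, `hybrid_rank`, `alpha`, and the passage ids are all hypothetical rather than taken from the authors' system.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}


def hybrid_rank(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float = 0.5,
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Fuse dense and sparse scores by linear interpolation (an assumption).

    A passage missing from one list contributes 0.0 for that signal.
    Ties are broken by passage id so the output is deterministic.
    """
    dense_n = min_max_normalize(dense)
    sparse_n = min_max_normalize(sparse)
    fused = {
        pid: alpha * dense_n.get(pid, 0.0) + (1 - alpha) * sparse_n.get(pid, 0.0)
        for pid in set(dense_n) | set(sparse_n)
    }
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy scores: dense similarities and sparse (BM25-like) scores on different scales.
dense_scores = {"p1": 0.9, "p2": 0.5, "p3": 0.1}
sparse_scores = {"p2": 12.0, "p4": 4.0}
top_passages = hybrid_rank(dense_scores, sparse_scores, alpha=0.5, k=3)
```

In a two-stage setup like the one described, the ids in `top_passages` would be the passages handed to the reader. Normalizing before interpolating matters here because dense similarities and sparse scores usually live on different scales.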


