Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning

12/30/2019
by Sean MacAvaney, et al.

While billions of non-English speaking users rely on search engines every day, the problem of ad-hoc information retrieval is rarely studied for non-English languages. This is primarily due to a lack of data sets that are suitable for training ranking algorithms. In this paper, we tackle the lack of data by leveraging pre-trained multilingual language models to transfer a retrieval system trained on English collections to non-English queries and documents. Our model is evaluated in a zero-shot setting, meaning that we use it to predict relevance scores for query-document pairs in languages never seen during training. Our results show that the proposed approach can significantly outperform unsupervised retrieval techniques for Arabic, Mandarin Chinese, and Spanish. We also show that augmenting the English training collection with some examples from the target language can sometimes improve performance.
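
As a toy illustration of the zero-shot setting described in the abstract (not the authors' model), the sketch below trains a tiny relevance scorer on English query-document pairs only and then scores Spanish pairs it has never seen. The hand-made `EMB` table is a hypothetical stand-in for a pre-trained multilingual language model, in which translations land near each other in a shared vector space; the vocabulary, vectors, function names, and training pairs are all assumptions made up for this example.

```python
import math

# Hypothetical shared embedding space: translations get nearby vectors.
# This stands in for a pre-trained multilingual language model.
EMB = {
    "cat": (1.0, 0.0, 0.0), "gato": (0.9, 0.1, 0.0),
    "food": (0.0, 1.0, 0.0), "comida": (0.1, 0.9, 0.0),
    "car": (0.0, 0.0, 1.0), "coche": (0.0, 0.1, 0.9),
    "engine": (0.1, 0.0, 0.9), "motor": (0.0, 0.0, 1.0),
}

def embed(text):
    """Average the vectors of the known words in a text."""
    vecs = [EMB[w] for w in text.split() if w in EMB]
    return [sum(component) / len(vecs) for component in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def relevance(w, b, query, doc):
    """Logistic relevance score computed from query-document similarity."""
    x = cosine(embed(query), embed(doc))
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def train(pairs, epochs=500, lr=0.5):
    """Fit the two scorer parameters with plain stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for query, doc, label in pairs:
            x = cosine(embed(query), embed(doc))
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - label) * x
            b -= lr * (p - label)
    return w, b

# Training data is English only (label 1 = relevant, 0 = not relevant).
english_pairs = [
    ("cat", "cat food", 1),
    ("cat", "car engine", 0),
    ("car", "car engine", 1),
    ("car", "cat food", 0),
]
w, b = train(english_pairs)

# Zero-shot: score Spanish pairs, a language absent from training.
score_relevant = relevance(w, b, "gato", "gato comida")
score_irrelevant = relevance(w, b, "gato", "coche motor")
print(score_relevant, score_irrelevant)
```

The scorer's parameters never depend on the language of the input, only on similarity in the shared space, which is why a ranker fitted on English pairs can still order Spanish documents sensibly in this toy setup.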
