Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval

04/03/2023
by Jimmy Lin, et al.

The advent of multilingual language models has generated a resurgence of interest in cross-lingual information retrieval (CLIR), which is the task of searching documents in one language with queries from another. However, the rapid pace of progress has led to a confusing panoply of methods, and reproducibility has lagged behind the state of the art. In this context, our work makes two important contributions. First, we provide a conceptual framework for organizing different approaches to cross-lingual retrieval, using multi-stage architectures for monolingual retrieval as a scaffold. Second, we implement simple yet effective reproducible baselines in the Anserini and Pyserini IR toolkits for test collections from the TREC 2022 NeuCLIR Track, in Persian, Russian, and Chinese. Our efforts are built on a collaboration of the two teams that submitted the most effective runs to the TREC evaluation. These contributions provide a firm foundation for future advances.

research
05/06/2023

Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval

Effective cross-lingual dense retrieval methods that rely on multilingua...
research
05/26/2020

A Study of Neural Matching Models for Cross-lingual IR

In this study, we investigate interaction-based neural matching models f...
research
12/15/2021

Learning Cross-Lingual IR from an English Retriever

We present a new cross-lingual information retrieval (CLIR) model traine...
research
11/02/2020

Cross-Lingual Document Retrieval with Smooth Learning

Cross-lingual document search is an information retrieval task in which ...
research
05/11/2018

Cross-lingual Document Retrieval using Regularized Wasserstein Distance

Many information retrieval algorithms rely on the notion of a good dista...
research
10/30/2020

Embedding Meta-Textual Information for Improved Learning to Rank

Neural approaches to learning term embeddings have led to improved compu...
research
09/30/2019

Simple and Effective Paraphrastic Similarity from Parallel Translations

We present a model and methodology for learning paraphrastic sentence em...
