Cross-lingual Information Retrieval with BERT

04/24/2020
by Zhuolin Jiang, et al.

Several neural language models, such as BERT and XLNet, have been developed recently and have achieved impressive results on various NLP tasks, including sentence classification, question answering, and document ranking. In this paper, we explore the use of the popular bidirectional language model BERT to model and learn the relevance between English queries and foreign-language documents in cross-lingual information retrieval (CLIR). We introduce a deep relevance matching model based on BERT and train it by fine-tuning a pretrained multilingual BERT model with weak supervision, using home-made CLIR training data derived from parallel corpora. Experiments on retrieving Lithuanian documents against short English queries show that our model is effective and outperforms the competitive baseline approaches.
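The abstract does not spell out how the weakly supervised training data is derived from parallel corpora, so the following is only a minimal sketch of one plausible construction, not the authors' procedure. It assumes that each English sentence acts as a query, that its aligned foreign sentence is the relevant document (label 1), and that a foreign sentence aligned with a different English sentence serves as a non-relevant document (label 0). The function name `build_weak_clir_pairs`, its parameters, and the toy sentence pairs are all hypothetical.

```python
import random
from typing import List, Tuple

# Hypothetical toy parallel corpus: (English sentence, Lithuanian sentence).
toy_parallel: List[Tuple[str, str]] = [
    ("good morning", "labas rytas"),
    ("thank you", "ačiū"),
    ("good night", "labanakt"),
]


def build_weak_clir_pairs(
    parallel: List[Tuple[str, str]],
    negatives_per_positive: int = 1,
    seed: int = 0,
) -> List[Tuple[str, str, int]]:
    """Turn aligned sentence pairs into (query, document, label) examples.

    Assumption: alignment in the parallel corpus is treated as a weak
    relevance signal; randomly sampled non-aligned sentences are negatives.
    """
    rng = random.Random(seed)  # fixed seed keeps the sampling reproducible
    examples: List[Tuple[str, str, int]] = []
    for i, (english, foreign) in enumerate(parallel):
        # Positive example: the English side paired with its own translation.
        examples.append((english, foreign, 1))
        # Negative examples: foreign sentences aligned with other queries.
        other_indices = [j for j in range(len(parallel)) if j != i]
        k = min(negatives_per_positive, len(other_indices))
        for j in rng.sample(other_indices, k):
            examples.append((english, parallel[j][1], 0))
    return examples


examples = build_weak_clir_pairs(toy_parallel)
for query, document, label in examples:
    print(label, query, "|", document)
```

Each resulting (query, document) pair could then be encoded as a standard BERT sentence-pair input and scored by a classification head during fine-tuning. That last step is likewise an assumption about the setup rather than something the abstract states.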


