Cross-Lingual Document Retrieval with Smooth Learning

11/02/2020
by   Jiapeng Liu, et al.
0

Cross-lingual document search is an information retrieval task in which the queries' language differs from the documents' language. In this paper, we study the instability of neural document search models and propose a novel end-to-end robust framework that achieves improved performance in cross-lingual search with different documents' languages. This framework includes a novel measure of the relevance, smooth cosine similarity, between queries and documents, and a novel loss function, Smooth Ordinal Search Loss, as the objective. We further provide theoretical guarantee on the generalization error bound for the proposed framework. We conduct experiments to compare our approach with other document search models, and observe significant gains under commonly used ranking metrics on the cross-lingual document retrieval task in a variety of languages.

READ FULL TEXT
research
06/08/2019

Improving Low-Resource Cross-lingual Document Retrieval by Reranking with Deep Bilingual Representations

In this paper, we propose to boost low-resource cross-lingual document r...
research
05/11/2018

Cross-lingual Document Retrieval using Regularized Wasserstein Distance

Many information retrieval algorithms rely on the notion of a good dista...
research
11/08/2019

Cross-Lingual Relevance Transfer for Document Retrieval

Recent work has shown the surprising ability of multi-lingual BERT to se...
research
03/28/2023

NeuralMind-UNICAMP at 2022 TREC NeuCLIR: Large Boring Rerankers for Cross-lingual Retrieval

This paper reports on a study of cross-lingual information retrieval (CL...
research
11/03/2021

Leveraging Advantages of Interactive and Non-Interactive Models for Vector-Based Cross-Lingual Information Retrieval

Interactive and non-interactive model are the two de-facto standard fram...
research
07/29/2017

Bilingual Document Alignment with Latent Semantic Indexing

We apply cross-lingual Latent Semantic Indexing to the Bilingual Documen...
research
04/03/2023

Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval

The advent of multilingual language models has generated a resurgence of...

Please sign up or login with your details

Forgot password? Click here to reset