Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval

03/27/2023
by Houxing Ren, et al.

In monolingual dense retrieval, many works focus on distilling knowledge from a cross-encoder re-ranker into a dual-encoder retriever, and these methods achieve better performance thanks to the effectiveness of the cross-encoder re-ranker. However, we find that the performance of the cross-encoder re-ranker is heavily influenced by the number of training samples and the quality of negative samples, both of which are hard to obtain in the cross-lingual setting. In this paper, we propose to use a query generator as the teacher in the cross-lingual setting, since it depends less on abundant training samples and high-quality negative samples. In addition to traditional knowledge distillation, we propose a novel enhancement method that uses the query generator to help the dual-encoder align queries from different languages without requiring any additional parallel sentences. Experimental results show that our method outperforms state-of-the-art methods on two benchmark datasets.
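
The distillation idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the loss, so this assumes a common choice: a KL divergence between a teacher distribution over candidate passages and the student (dual-encoder) distribution. The function names, the `temperature` parameter, and the use of the generator's log-likelihood log P(query | passage) as the teacher score are all assumptions made for illustration, not details taken from the paper.

```python
import math


def log_softmax(scores, temperature=1.0):
    """Log-probabilities over candidates, computed in a numerically stable way."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]


def dot(u, v):
    """Dot product, used here as the dual-encoder relevance score."""
    return sum(a * b for a, b in zip(u, v))


def distillation_loss(query_vec, passage_vecs, teacher_scores, temperature=1.0):
    """KL(teacher || student) over one query's candidate passages.

    teacher_scores: hypothetical per-passage scores from a query generator,
        for example log P(query | passage) (an assumption, not from the paper).
    query_vec, passage_vecs: embeddings produced by the dual encoder.
    """
    student_scores = [dot(query_vec, p) for p in passage_vecs]
    log_teacher = log_softmax(teacher_scores, temperature)
    log_student = log_softmax(student_scores, temperature)
    return sum(
        math.exp(lt) * (lt - ls) for lt, ls in zip(log_teacher, log_student)
    )


# Toy example: one query and three candidate passages.
query = [1.0, 0.0]
passages = [[2.0, 0.0], [1.0, 0.0], [0.0, 0.0]]  # student scores: 2, 1, 0

agree = distillation_loss(query, passages, teacher_scores=[2.0, 1.0, 0.0])
disagree = distillation_loss(query, passages, teacher_scores=[0.0, 1.0, 2.0])
```

In this toy setup the loss is zero when the teacher ranks the passages exactly as the student does, and positive when the rankings disagree, which is the signal that would push the dual encoder toward the teacher's preferences during training.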

research
05/06/2023

Augmenting Passage Representations with Query Generation for Enhanced Cross-Lingual Dense Retrieval

Effective cross-lingual dense retrieval methods that rely on multilingua...
research
12/07/2021

Improving Neural Cross-Lingual Summarization via Employing Optimal Transport Distance for Knowledge Distillation

Current state-of-the-art cross-lingual summarization models employ multi...
research
08/29/2019

Translate and Label! An Encoder-Decoder Approach for Cross-lingual Semantic Role Labeling

We propose a Cross-lingual Encoder-Decoder model that simultaneously tra...
research
09/13/2022

Multi-stage Distillation Framework for Cross-Lingual Semantic Similarity Matching

Previous studies have proved that cross-lingual knowledge distillation c...
research
05/08/2023

Retriever and Ranker Framework with Probabilistic Hard Negative Sampling for Code Search

Pretrained Language Models (PLMs) have emerged as the state-of-the-art p...
research
08/13/2021

PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval

Recently, dense passage retrieval has become a mainstream approach to fi...
research
10/07/2021

Adversarial Retriever-Ranker for dense text retrieval

Current dense text retrieval models face two typical challenges. First, ...