Towards Robust Ranker for Text Retrieval

06/16/2022
by Yucheng Zhou, et al.

A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, but its training still lags behind: it learns from only moderate negatives and/or serves merely as an auxiliary module for a retriever. In this work, we first identify two major barriers to a robust ranker, i.e., inherent label noise caused by a well-trained retriever and non-ideal negatives sampled for a highly capable ranker. We therefore propose using multiple retrievers as negative generators to improve the ranker's robustness, where i) exposing the ranker to extensive out-of-distribution label noise makes it robust to each noise distribution, and ii) diverse hard negatives drawn from a joint distribution lie relatively close to the ranker's own negative distribution, leading to more challenging and thus more effective training. To evaluate our robust ranker (dubbed R^2anker), we conduct experiments in various settings on the popular passage retrieval benchmark, including BM25 reranking, full ranking, retriever distillation, etc. The empirical results verify the new state-of-the-art effectiveness of our model.
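As a rough sketch of the central idea rather than the authors' actual implementation, the Python snippet below illustrates how hard negatives might be pooled from several retrievers (e.g., BM25 plus a dense retriever) into training triples for a ranker. All names here (mine_negatives, retrieve_bm25, retrieve_dense, qrels) are hypothetical stand-ins, not components released with the paper.

```python
import random
from typing import Callable, Dict, List, Tuple

# A retriever is modeled as any function mapping a query to a ranked list of
# passage IDs; in practice this could be BM25, a dense dual-encoder, etc.
Retriever = Callable[[str], List[str]]


def mine_negatives(
    query: str,
    positives: List[str],
    retrievers: List[Retriever],
    top_k: int = 100,
    per_retriever: int = 8,
) -> List[str]:
    """Sample hard negatives from the union of several retrievers' top-k lists.

    Pooling candidates from multiple retrievers approximates a joint negative
    distribution, which (per the paper's motivation) sits closer to the
    ranker's own negative distribution than any single retriever's output.
    """
    pos_set = set(positives)
    negatives: List[str] = []
    for retrieve in retrievers:
        ranked = retrieve(query)[:top_k]
        # Exclude labeled positives. Unlabeled true positives may remain,
        # which is precisely the "inherent label noise" a robust ranker
        # is expected to tolerate.
        candidates = [pid for pid in ranked if pid not in pos_set]
        negatives.extend(
            random.sample(candidates, min(per_retriever, len(candidates)))
        )
    return negatives


def build_training_triples(
    qrels: Dict[str, List[str]],
    retrievers: List[Retriever],
) -> List[Tuple[str, str, str]]:
    """Build (query, positive, negative) triples for ranker training."""
    triples: List[Tuple[str, str, str]] = []
    for query, positives in qrels.items():
        negs = mine_negatives(query, positives, retrievers)
        for pos in positives:
            for neg in negs:
                triples.append((query, pos, neg))
    return triples


if __name__ == "__main__":
    # Toy stand-in retrievers over a tiny corpus, purely for illustration.
    corpus = [f"p{i}" for i in range(20)]

    def retrieve_bm25(query: str) -> List[str]:
        return corpus  # pretend this is a BM25 ranking

    def retrieve_dense(query: str) -> List[str]:
        return list(reversed(corpus))  # pretend this is a dense ranking

    qrels = {"what is text retrieval?": ["p3"]}
    triples = build_training_triples(qrels, [retrieve_bm25, retrieve_dense])
    print(len(triples), triples[:3])
```

The resulting triples would then feed a standard pairwise or listwise ranker objective; the sketch only covers the negative-pooling step that the abstract highlights.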


Related research

06/21/2021 · Open-set Label Noise Can Improve Robustness Against Inherent Label Noise
Learning with noisy labels is a practically challenging problem in weakl...

11/05/2021 · Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval
Matching model is essential for Image-Text Retrieval framework. Existing...

12/07/2021 · RID-Noise: Towards Robust Inverse Design under Noisy Environments
From an engineering perspective, a design should not only perform well i...

03/12/2022 · Information retrieval for label noise document ranking by bag sampling and group-wise loss
Long Document retrieval (DR) has always been a tremendous challenge for ...

06/28/2022 · Cooperative Retriever and Ranker in Deep Recommenders
Deep recommender systems jointly leverage the retrieval and ranking oper...

07/25/2019 · Adaptive Noise Injection: A Structure-Expanding Regularization for RNN
The vanilla LSTM has become one of the most potential architectures in w...

10/21/2019 · Improving Word Representations: A Sub-sampled Unigram Distribution for Negative Sampling
Word2Vec is the most popular model for word representation and has been ...
