Semantic Models for the First-stage Retrieval: A Comprehensive Review

03/08/2021
by   Yinqiong Cai, et al.

Multi-stage ranking pipelines are a practical solution in modern search systems: the first-stage retrieval returns a subset of candidate documents, and the later stages re-rank those candidates. While the re-ranking stages have gone through rapid technique shifts over the past decades, the first-stage retrieval has long been dominated by classical term-based models. Unfortunately, these models suffer from the vocabulary mismatch problem, which may keep relevant documents from ever reaching the re-ranking stages. Building semantic models for the first-stage retrieval that achieve high recall efficiently has therefore been a long-standing goal. Recently, research interest in first-stage semantic retrieval models has grown explosively. We believe it is the right time to survey the current status, learn from existing methods, and gain insights for future development. In this paper, we describe the current landscape of semantic retrieval models from three major paradigms, paying special attention to recent neural-based methods. We review the benchmark datasets, optimization methods, and evaluation metrics, and summarize the state-of-the-art models. We also discuss the unresolved challenges and suggest potentially promising directions for future work.
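
The vocabulary mismatch problem described above can be illustrated with a toy two-stage pipeline. The following is only a minimal sketch under stated assumptions: the corpus, the `SYNONYMS` table, and every function name are hypothetical and do not come from the paper, and the term-overlap scorer is a crude stand-in for a real term-based model. It shows that a relevant document sharing no terms with the query never reaches the re-ranking stage, however good the re-ranker is.

```python
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()

def term_overlap_score(query, doc):
    # Sum of query-term frequencies in the document: a crude
    # stand-in for a classical term-based model.
    doc_counts = Counter(tokenize(doc))
    return sum(doc_counts[t] for t in tokenize(query))

def first_stage(query, corpus, k):
    # Return ids of up to k documents sharing at least one query term.
    scored = [(term_overlap_score(query, d), i) for i, d in enumerate(corpus)]
    matched = [(s, i) for s, i in scored if s > 0]
    matched.sort(key=lambda p: (-p[0], p[1]))
    return [i for _, i in matched[:k]]

# Hypothetical synonym table used only by the toy re-ranker.
SYNONYMS = {"car": {"automobile"}, "repair": {"maintenance"}}

def synonym_aware_score(query, doc):
    # Toy "semantic" scorer: a query term counts if it or a synonym appears.
    doc_terms = set(tokenize(doc))
    score = 0
    for t in tokenize(query):
        if t in doc_terms or SYNONYMS.get(t, set()) & doc_terms:
            score += 1
    return score

def rerank(query, corpus, candidates, scorer):
    # Re-rank only the candidates handed over by the first stage.
    return sorted(candidates, key=lambda i: -scorer(query, corpus[i]))

corpus = [
    "car repair manual for beginners",
    "automobile maintenance guide",   # relevant, but shares no query term
    "cooking pasta at home",
]
query = "car repair"
candidates = first_stage(query, corpus, k=2)
final = rerank(query, corpus, candidates, synonym_aware_score)
print(candidates, final)
```

In this sketch the second document would receive a full score from the synonym-aware scorer, yet it is absent from `final` because the term-based first stage never returned it. This is the recall gap that first-stage semantic retrieval models aim to close.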

