Context Aware Query Rewriting for Text Rankers using LLM

08/31/2023
by Abhijit Anand, et al.

Query rewriting refers to an established family of approaches that are applied to underspecified and ambiguous queries to overcome the vocabulary mismatch problem in document ranking. Queries are typically rewritten at query-processing time to better model the information need for the downstream ranker. With the advent of large language models (LLMs), there have been initial investigations into generative approaches that produce pseudo documents to tackle this inherent vocabulary gap. In this work, we analyze the utility of LLMs for improved query rewriting for text ranking tasks. We find two inherent limitations of using LLMs as query rewriters: concept drift when using only queries as prompts, and large inference costs during query processing. We adopt a simple, yet surprisingly effective, approach called context aware query rewriting (CAR) to leverage the benefits of LLMs for query understanding. First, we rewrite ambiguous training queries by context-aware prompting of LLMs, using only relevant documents as context. Unlike existing approaches, we use LLM-based query rewriting only during the training phase. A ranker is then fine-tuned on the rewritten queries instead of the original queries. In our extensive experiments, we find that fine-tuning a ranker on rewritten queries offers a significant improvement of up to 33% on the document ranking task when compared to the baseline performance of using original queries.
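The abstract describes the workflow only at a high level; the sketch below illustrates what context-aware prompting of an LLM for training-time query rewriting might look like. The model choice (google/flan-t5-base), the prompt wording, and the rewrite_query helper are illustrative assumptions, not the authors' actual setup.

```python
# Minimal sketch of context aware query rewriting (CAR) at training time.
# Assumptions: model, prompt template, and helper names are illustrative only.
from transformers import pipeline

# Any instruction-tuned seq2seq LLM can stand in for the rewriter.
rewriter = pipeline("text2text-generation", model="google/flan-t5-base")

def rewrite_query(query: str, relevant_doc: str, max_new_tokens: int = 32) -> str:
    """Rewrite an ambiguous training query, using a relevant document as context."""
    prompt = (
        "Rewrite the search query so that it is unambiguous, "
        "using the document below as context.\n"
        f"Document: {relevant_doc}\n"
        f"Query: {query}\n"
        "Rewritten query:"
    )
    return rewriter(prompt, max_new_tokens=max_new_tokens)[0]["generated_text"].strip()

# Toy training triples (query, relevant document, relevance label).
original_training_triples = [
    ("jaguar speed", "The jaguar is a large cat native to the Americas, "
                     "capable of running at up to 80 km/h.", 1),
]

# The rewritten queries replace the originals only in the training set;
# a ranker is then fine-tuned on these (rewritten query, document, label) triples.
training_triples = [
    (rewrite_query(q, d), d, label) for q, d, label in original_training_triples
]
```

Because rewriting happens only while preparing training data, the LLM inference cost is paid once offline; at query-processing time the fine-tuned ranker receives the original user query and incurs no additional latency.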


Related research

04/17/2021
Co-BERT: A Context-Aware BERT Retrieval Model Incorporating Local and Query-specific Context
BERT-based text ranking models have dramatically advanced the state-of-t...

08/09/2021
IntenT5: Search Result Diversification using Causal Language Models
Search result diversification is a beneficial approach to overcome under...

10/23/2019
Context-Aware Sentence/Passage Term Importance Estimation For First Stage Retrieval
Term frequency is a common method for identifying the importance of a te...

05/05/2023
Query Expansion by Prompting Large Language Models
Query expansion is a widely used technique to improve the recall of sear...

07/08/2015
A Hierarchical Recurrent Encoder-Decoder For Generative Context-Aware Query Suggestion
Users may strive to formulate an adequate textual query for their inform...

10/27/2022
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation
Large Language Models (LLMs) have shown impressive results on a variety ...

02/12/2019
A Domain Generalization Perspective on Listwise Context Modeling
As one of the most popular techniques for solving the ranking problem in...
