An Axiomatic Study of Query Terms Order in Ad-hoc Retrieval

by Ayyoob Imani, et al.

Classic retrieval methods use simple bag-of-words representations for queries and documents. This representation fails to capture the full semantic richness of queries and documents. More recent retrieval models have tried to overcome this deficiency by incorporating dependencies between query terms, using bi-gram representations of documents, applying proximity heuristics, and performing passage retrieval. While some of these previous works have implicitly accounted for term order, to the best of our knowledge, term order has not been the primary focus of any research. In this paper, we focus solely on the effect of term order in information retrieval. We show that documents containing two query terms in the same order as in the query have a higher probability of being relevant than documents containing the two terms in the reverse order. Using the axiomatic framework for information retrieval, we introduce a constraint that retrieval models must satisfy in order to effectively utilize term order dependency among query terms. We modify existing retrieval models based on this constraint so that, if the order of a pair of query terms is semantically important, a document that includes these query terms in the same order as the query receives a higher score than a document that includes them in the reverse order. Our empirical evaluation using both TREC newswire and web corpora demonstrates that the modified retrieval models significantly outperform their original counterparts.
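To make the term-order constraint concrete, the following is a minimal Python sketch, not the formulation used in the paper. It assumes tokenized queries and documents, treats every pair of distinct query terms as order-sensitive (the abstract applies the constraint only when order is semantically important), and adds a fixed bonus to an externally supplied base score. The names `order_counts`, `order_aware_score`, `window`, and `bonus` are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def order_counts(query, doc, window=5):
    """Count nearby co-occurrences of query-term pairs in a document.

    Returns (same, reverse): how often a pair appears in query order
    versus reverse order, within `window` token positions.
    Illustrative assumption: every pair of distinct query terms is
    treated as order-sensitive.
    """
    # Map each document token to the list of positions where it occurs.
    positions = {}
    for i, tok in enumerate(doc):
        positions.setdefault(tok, []).append(i)

    same = reverse = 0
    # combinations() preserves query order: `a` precedes `b` in the query.
    for a, b in combinations(query, 2):
        if a == b:
            continue
        for pa in positions.get(a, []):
            for pb in positions.get(b, []):
                if 0 < pb - pa <= window:
                    same += 1      # a appears before b, as in the query
                elif 0 < pa - pb <= window:
                    reverse += 1   # b appears before a, reversed
    return same, reverse


def order_aware_score(base_score, query, doc, bonus=0.1, window=5):
    """Add a hypothetical bonus for each same-order pair occurrence.

    `base_score` stands in for any bag-of-words score (for example BM25);
    reverse-order occurrences earn no bonus in this sketch.
    """
    same, _ = order_counts(query, doc, window)
    return base_score + bonus * same
```

Under these assumptions, two documents with the same base score are separated whenever one contains a query-term pair in query order and the other contains it only in reverse order, which is the behavior the constraint asks for. How the bonus is weighted, and how to decide whether a pair's order matters, are left open here.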



