On the Estimation and Use of Statistical Modelling in Information Retrieval

03/30/2019
by Casper Petersen, et al.

Several tasks in information retrieval (IR) rely on assumptions regarding the distribution of some property (such as term frequency) in the data being processed. This thesis argues that such distributional assumptions can lead to incorrect conclusions and proposes a statistically principled method for determining the "true" distribution. It then applies this method to derive a new family of ranking models that adapt their computations to the statistics of the data being processed. Experimental evaluation shows results on par with or better than multiple strong baselines on several TREC collections. Overall, the thesis concludes that distributional assumptions can be replaced with an effective, efficient and principled method for determining the "true" distribution, and that using the "true" distribution can lead to improved retrieval performance.

research
06/04/2017

Improving Legal Information Retrieval by Distributional Composition with Term Order Probabilities

Legal professionals worldwide are currently trying to get up-to-pace wit...
research
06/21/2023

Resources and Evaluations for Multi-Distribution Dense Information Retrieval

We introduce and define the novel problem of multi-distribution informat...
research
09/29/2022

Multi-stage Information Retrieval for Vietnamese Legal Texts

This study deals with the problem of information retrieval (IR) for Viet...
research
04/02/2023

An Intrinsic Framework of Information Retrieval Evaluation Measures

Information retrieval (IR) evaluation measures are cornerstones for dete...
research
01/10/2022

Continual Learning of Long Topic Sequences in Neural Information Retrieval

In information retrieval (IR) systems, trends and users' interests may c...
research
10/09/2018

Caracterización Formal y Análisis Empírico de Mecanismos Incrementales de Búsqueda basados en Contexto [Formal Characterization and Empirical Analysis of Context-Based Incremental Search Mechanisms]

The Web has become a potentially infinite information resource, turning ...
research
10/01/2020

Evaluating a Generative Adversarial Framework for Information Retrieval

Recent advances in Generative Adversarial Networks (GANs) have resulted ...
