Sensitive and Scalable Online Evaluation with Theoretical Guarantees

11/26/2017
by Harrie Oosterhuis, et al.

Multileaved comparison methods generalize interleaved comparison methods to provide a scalable approach for comparing ranking systems based on regular user interactions. Such methods enable the increasingly rapid research and development of search engines. However, existing multileaved comparison methods that provide reliable outcomes do so by degrading the user experience during evaluation. Conversely, current multileaved comparison methods that maintain the user experience cannot guarantee correctness. Our contribution is two-fold. First, we propose a theoretical framework for systematically comparing multileaved comparison methods using the notions of considerateness, which concerns maintaining the user experience, and fidelity, which concerns reliably correct outcomes. Second, we introduce a novel multileaved comparison method, Pairwise Preference Multileaving (PPM), that performs comparisons based on document-pair preferences, and prove that it is considerate and has fidelity. We show empirically that, compared to previous multileaved comparison methods, PPM is more sensitive to user preferences and scales better with the number of rankers being compared.

research
03/16/2023

Comparing Conventional and Conversational Search Interaction using Implicit Evaluation Methods

Conversational search applications offer the prospect of improved user e...
research
02/19/2015

Just Sort It! A Simple and Effective Approach to Active Preference Learning

We address the problem of learning a ranking by using adaptively chosen ...
research
09/28/2021

Dynamic Ranking with the BTL Model: A Nearest Neighbor based Rank Centrality Method

Many applications such as recommendation systems or sports tournaments i...
research
12/11/2018

Merge Double Thompson Sampling for Large Scale Online Ranker Evaluation

Online ranker evaluation is one of the key challenges in information ret...
research
03/19/2011

Refining Recency Search Results with User Click Feedback

Traditional machine-learned ranking systems for web search are often tra...
research
09/22/2018

Differentiable Unbiased Online Learning to Rank

Online Learning to Rank (OLTR) methods optimize rankers based on user in...
research
07/06/2023

Finding Favourite Tuples on Data Streams with Provably Few Comparisons

One of the most fundamental tasks in data science is to assist a user wi...
