Online Learning to Rank in Stochastic Click Models

03/07/2017
by Masrour Zoghi, et al.

Online learning to rank is a core problem in information retrieval and machine learning. Many provably efficient algorithms have recently been proposed for this problem in specific click models, where a click model is a model of how the user interacts with a list of documents. Though these results are significant, their impact on practice is limited, because each proposed algorithm is designed for a specific click model and lacks convergence guarantees in other models. In this work, we propose BatchRank, the first online learning to rank algorithm for a broad class of click models. The class encompasses the two most fundamental click models, the cascade and position-based models. We derive a gap-dependent upper bound on the T-step regret of BatchRank and evaluate it on a range of web search queries. We observe that BatchRank outperforms ranked bandits and is more robust than CascadeKL-UCB, an existing algorithm for the cascade model.
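To make the two click models named above concrete, the following is a minimal simulation sketch, not BatchRank itself and not code from the paper. It assumes the standard textbook definitions: in the cascade model the user scans the list top-down and clicks only the first attractive document, while in the position-based model each position is examined with its own probability and an examined document is clicked if it is attractive. The function names and the `attraction` and `examination` parameters are hypothetical, chosen here for illustration.

```python
import random


def cascade_clicks(ranked, attraction, rng):
    """Cascade model (assumed standard form): scan from the top,
    click the first attractive document, then stop examining."""
    clicks = [0] * len(ranked)
    for pos, doc in enumerate(ranked):
        if rng.random() < attraction[doc]:
            clicks[pos] = 1
            break  # the user leaves after the first click
    return clicks


def position_based_clicks(ranked, attraction, examination, rng):
    """Position-based model (assumed standard form): position `pos` is
    examined with probability examination[pos]; an examined document is
    clicked with probability attraction[doc]. Positions are independent,
    so several clicks per list are possible."""
    return [
        int(rng.random() < examination[pos] * attraction[doc])
        for pos, doc in enumerate(ranked)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    ranked = ["d1", "d2", "d3"]                       # hypothetical document ids
    attraction = {"d1": 0.2, "d2": 0.7, "d3": 0.4}    # hypothetical attraction probabilities
    examination = [1.0, 0.6, 0.3]                     # hypothetical examination probabilities
    print(cascade_clicks(ranked, attraction, rng))
    print(position_based_clicks(ranked, attraction, examination, rng))
```

The practical difference is visible in the feedback: the cascade sketch returns at most one click per list, whereas the position-based sketch can return several. An algorithm tuned to one feedback pattern need not behave well under the other, which is the limitation the abstract describes.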


