DCM Bandits: Learning to Rank with Multiple Clicks

02/09/2016
by Sumeet Katariya, et al.

A search engine recommends to the user a list of web pages. The user examines this list, from the first page to the last, and clicks on all attractive pages until the user is satisfied. This behavior of the user can be described by the dependent click model (DCM). We propose DCM bandits, an online learning variant of the DCM where the goal is to maximize the probability of recommending satisfactory items, such as web pages. The main challenge of our learning problem is that we do not observe which attractive item is satisfactory. We propose a computationally-efficient learning algorithm for solving our problem, dcmKL-UCB; derive gap-dependent upper bounds on its regret under reasonable assumptions; and also prove a matching lower bound up to logarithmic factors. We evaluate our algorithm on synthetic and real-world problems, and show that it performs well even when our model is misspecified. This work presents the first practical and regret-optimal online algorithm for learning to rank with multiple clicks in a cascade-like click model.
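The user behavior described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation. The function name `simulate_dcm`, the per-item `attraction` probabilities, and the per-position `termination` probabilities (the chance that a click at a given position satisfies the user) are assumptions introduced here for illustration.

```python
import random


def simulate_dcm(ranked_items, attraction, termination, rng=random):
    """Simulate one user session under a DCM-style click model (illustrative).

    ranked_items: item ids in the order they are shown.
    attraction:   dict mapping item id -> probability the item is attractive (assumed).
    termination:  list, per position, of the probability that a click at that
                  position satisfies the user (assumed parameterization).
    Returns (clicks, satisfied), where clicks is a 0/1 list per position.
    """
    clicks = [0] * len(ranked_items)
    for pos, item in enumerate(ranked_items):
        # The user examines the item and clicks if it is attractive.
        if rng.random() < attraction[item]:
            clicks[pos] = 1
            # After a click, the user may be satisfied and stop examining.
            if rng.random() < termination[pos]:
                return clicks, True
    # The user reached the end of the list without being satisfied.
    return clicks, False


if __name__ == "__main__":
    attraction = {"a": 0.6, "b": 0.3, "c": 0.1}  # hypothetical values
    termination = [0.7, 0.5, 0.3]                # hypothetical values
    clicks, satisfied = simulate_dcm(["a", "b", "c"], attraction, termination)
    # A learner would see only `clicks`; `satisfied` is hidden from it.
    print(clicks)
```

In this sketch the learner would observe only `clicks`. The `satisfied` flag is returned solely to show what stays hidden, which is the main challenge the abstract names: a last click may or may not have been the satisfactory one.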

research
02/10/2015

Cascading Bandits: Learning to Rank in the Cascade Model

A search engine usually outputs a list of K web pages. The user examines...
research
03/17/2016

Cascading Bandits for Large-Scale Recommendation Problems

Most recommender systems recommend a list of items. The user examines th...
research
11/01/2018

Online Diverse Learning to Rank from Partial-Click Feedback

Learning to rank is an important problem in machine learning and recomme...
research
03/07/2017

Online Learning to Rank in Stochastic Click Models

Online learning to rank is a core problem in information retrieval and m...
research
08/10/2016

Stochastic Rank-1 Bandits

We propose stochastic rank-1 bandits, a class of online learning problem...
research
09/13/2021

Online Learning of Optimally Diverse Rankings

Search engines answer users' queries by listing relevant items (e.g. doc...
research
09/07/2020

Learning to Rank under Multinomial Logit Choice

Learning the optimal ordering of content is an important challenge in we...
