Position-Based Multiple-Play Bandits with Thompson Sampling

Multiple-play bandits aim to display relevant items at relevant positions on a web page. We introduce PB-MHB, a new bandit-based algorithm for online recommender systems that uses the Thompson sampling framework. The algorithm handles a display setting governed by the position-based model. Our sampling method does not require as input the probability that a user looks at a given position on the web page, which is, in practice, very difficult to obtain. Experiments on simulated and real datasets show that our method, with less prior information, delivers better recommendations than state-of-the-art algorithms.
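To make the display setting concrete, the following minimal sketch simulates clicks under the standard position-based model, in which the click probability of an item shown at a position is the product of the item's attractiveness and the probability that the user looks at that position. It is an illustration only, not the PB-MHB algorithm itself. The names `theta`, `kappa`, `pbm_click_probabilities`, and `simulate_clicks`, as well as all numeric values, are assumptions chosen for the example.

```python
import random


def pbm_click_probabilities(theta, kappa, recommendation):
    """Return the click probability at each position under the position-based model.

    theta[i] is the (hypothetical) attractiveness of item i.
    kappa[l] is the (hypothetical) probability that a user looks at position l.
    recommendation[l] is the index of the item displayed at position l.
    """
    return [theta[item] * kappa[pos] for pos, item in enumerate(recommendation)]


def simulate_clicks(theta, kappa, recommendation, rng):
    """Draw one independent Bernoulli click per displayed position."""
    probs = pbm_click_probabilities(theta, kappa, recommendation)
    return [1 if rng.random() < p else 0 for p in probs]


# Illustrative values only: four candidate items, three display positions.
theta = [0.9, 0.5, 0.2, 0.1]
kappa = [1.0, 0.6, 0.3]
recommendation = [0, 1, 2]  # item 0 at the top position, then items 1 and 2

probs = pbm_click_probabilities(theta, kappa, recommendation)
clicks = simulate_clicks(theta, kappa, recommendation, random.Random(0))
```

In this sketch the simulator knows `kappa`, whereas a learner in the setting described above would observe only the click vector and would not be given the position probabilities.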


