Unreliable Multi-Armed Bandits: A Novel Approach to Recommendation Systems

11/14/2019
by Aditya Narayan Ravi, et al.

We use a novel modification of Multi-Armed Bandits to create a new model for recommendation systems. We model the recommendation system as a bandit seeking to maximize reward by pulling arms with unknown rewards. The catch, however, is that this bandit can only access these arms through an unreliable intermediate that has some level of autonomy in choosing its arms. For example, on a streaming website the user has a lot of autonomy in choosing the content they want to watch. The streaming site can use targeted advertising as a means to bias the opinions of these users. Here the streaming site is the bandit aiming to maximize reward, and the user is the unreliable intermediate. We model the intermediate as accessing states via a Markov chain, which the bandit is allowed to perturb. We prove fundamental theorems for this setting, after which we show a close-to-optimal Explore-Commit algorithm.
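The setting described above can be illustrated with a minimal Python sketch. It is only a sketch under stated assumptions, not the paper's algorithm: the abstract does not specify how the bandit perturbs the chain, so the code assumes the bandit nudges the user's transition row toward a target arm with a hypothetical strength `eps`, that rewards are Bernoulli, and that exploration is a simple round-robin over targets. The names `perturbed_row`, `explore_then_commit`, `eps`, and `explore_steps` are all illustrative.

```python
import random


def perturbed_row(row, target, eps):
    """Mix the user's transition row with a point mass on the target arm.

    Assumption: perturbation is a convex mixture with weight eps.
    """
    return [(1 - eps) * p + (eps if j == target else 0.0)
            for j, p in enumerate(row)]


def explore_then_commit(P, means, eps, explore_steps, horizon, seed=0):
    """Explore by nudging toward each arm in turn, then commit to the best.

    P      : user's transition matrix over arms (the Markov chain)
    means  : true Bernoulli reward means, hidden from the bandit
    eps    : assumed perturbation strength in [0, 1]
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k        # how often the user landed on each arm
    totals = [0.0] * k     # summed reward observed on each arm
    state = 0
    total_reward = 0.0
    committed = None

    for t in range(horizon):
        if t < explore_steps:
            target = t % k  # round-robin exploration target
        else:
            if committed is None:
                # Commit to the arm with the highest empirical mean.
                committed = max(
                    range(k),
                    key=lambda a: totals[a] / pulls[a] if pulls[a] else 0.0,
                )
            target = committed

        # The user, not the bandit, picks the arm via the perturbed chain.
        row = perturbed_row(P[state], target, eps)
        state = rng.choices(range(k), weights=row)[0]

        reward = 1.0 if rng.random() < means[state] else 0.0
        pulls[state] += 1
        totals[state] += reward
        total_reward += reward

    return committed, total_reward, pulls
```

With `eps = 1.0` the user always follows the recommendation and the sketch reduces to an ordinary explore-then-commit bandit; smaller values of `eps` leave the user more autonomy, so the bandit only observes the arms the perturbed chain happens to visit.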


research
01/20/2023

Multi armed bandits and quantum channel oracles

Multi armed bandits are one of the theoretical pillars of reinforcement ...
research
11/29/2018

Regret Bounds for Stochastic Combinatorial Multi-Armed Bandits with Linear Space Complexity

Many real-world problems face the dilemma of choosing best K out of N op...
research
07/21/2023

Bandits with Deterministically Evolving States

We propose a model for learning with bandit feedback while accounting fo...
research
03/03/2021

Fairness of Exposure in Stochastic Bandits

Contextual bandit algorithms have become widely used for recommendation ...
research
05/14/2020

Thompson Sampling for Combinatorial Semi-bandits with Sleeping Arms and Long-Term Fairness Constraints

We study the combinatorial sleeping multi-armed semi-bandit problem with...
research
12/09/2022

Networked Restless Bandits with Positive Externalities

Restless multi-armed bandits are often used to model budget-constrained ...
research
02/02/2023

Learning with Exposure Constraints in Recommendation Systems

Recommendation systems are dynamic economic systems that balance the nee...
