Sequential Monte Carlo Bandits

10/04/2013
by Michael Cherkassky, et al.

In this paper we propose a flexible and efficient framework for multi-armed bandits that combines sequential Monte Carlo algorithms with hierarchical Bayesian modeling techniques. The framework naturally encompasses restless bandits, contextual bandits, and other bandit variants under a single inferential model. Despite the model's generality, inference remains scalable: we propose efficient Monte Carlo algorithms based on recent developments in sequential Monte Carlo methods. In two simulation studies, the framework outperforms other empirical methods while also scaling naturally to more complex problems with which existing approaches cannot cope. Additionally, we successfully apply our framework to online video-based advertising recommendation and show that it is more effective than current state-of-the-art bandit algorithms.

