A Change-Detection based Framework for Piecewise-stationary Multi-Armed Bandit Problem

11/08/2017
by Fang Liu et al.

The multi-armed bandit problem has been extensively studied under the stationary assumption. In practice, however, this assumption often fails because the reward distributions themselves may change over time. In this paper, we propose a change-detection (CD) based framework for multi-armed bandit problems in the piecewise-stationary setting, and study a class of change-detection based UCB (Upper Confidence Bound) policies, CD-UCB, that actively detect change points and restart the UCB indices. We then develop CUSUM-UCB and PHT-UCB, which belong to the CD-UCB class and use the cumulative sum (CUSUM) test and the Page-Hinkley test (PHT), respectively, to detect changes. We show that CUSUM-UCB achieves the best known regret upper bound under mild assumptions. We also demonstrate the regret reduction of the CD-UCB policies on arbitrary Bernoulli reward sequences and on Yahoo! datasets of webpage click-through rates.
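To make the CD-UCB idea concrete, here is a minimal sketch of a CUSUM-UCB style policy in Python. This is an illustration under assumptions, not the paper's exact algorithm: the detector is a standard two-sided CUSUM with a warm-up of M samples to estimate the pre-change mean, and the parameter names (M, eps for the drift tolerance, h for the detection threshold, alpha for the forced-exploration rate, xi for the UCB constant) are ours. The paper's tuning of these quantities, and the precise restart rule after an alarm, follow its analysis rather than this sketch; here an alarm restarts only the alarmed arm.

import math
import random

class CusumDetector:
    """Two-sided CUSUM test: raises an alarm when a running statistic
    exceeds the threshold h, signalling a likely shift in the reward mean."""
    def __init__(self, M=50, eps=0.05, h=20.0):
        self.M = M          # warm-up samples used to estimate the pre-change mean
        self.eps = eps      # minimum drift magnitude the test should detect
        self.h = h          # detection threshold
        self.reset()

    def reset(self):
        self.n = 0
        self.mean0 = 0.0
        self.g_pos = 0.0
        self.g_neg = 0.0

    def update(self, y):
        """Feed one reward; return True if a change is detected."""
        self.n += 1
        if self.n <= self.M:                       # warm-up: estimate pre-change mean
            self.mean0 += (y - self.mean0) / self.n
            return False
        s_pos = y - self.mean0 - self.eps          # evidence of an upward shift
        s_neg = self.mean0 - y - self.eps          # evidence of a downward shift
        self.g_pos = max(0.0, self.g_pos + s_pos)
        self.g_neg = max(0.0, self.g_neg + s_neg)
        return max(self.g_pos, self.g_neg) > self.h


class CusumUCB:
    """CD-UCB sketch with per-arm CUSUM detectors: standard UCB indices,
    plus a detector per arm whose alarm resets that arm's statistics; with
    probability alpha a uniformly random arm is forced, so arms that look
    suboptimal keep receiving samples for change detection."""
    def __init__(self, n_arms, alpha=0.01, xi=1.0, **cd_kwargs):
        self.n_arms = n_arms
        self.alpha = alpha                      # forced-exploration rate
        self.xi = xi                            # UCB exploration constant
        self.detectors = [CusumDetector(**cd_kwargs) for _ in range(n_arms)]
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms
        self.t = 0                              # total number of plays so far

    def select_arm(self):
        if random.random() < self.alpha:        # forced uniform exploration
            return random.randrange(self.n_arms)
        for a in range(self.n_arms):            # play any unsampled arm first
            if self.counts[a] == 0:
                return a
        return max(range(self.n_arms), key=lambda a: self.means[a]
                   + math.sqrt(self.xi * math.log(self.t) / self.counts[a]))

    def update(self, arm, reward):
        self.t += 1
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
        if self.detectors[arm].update(reward):  # change detected: restart this arm
            self.counts[arm] = 0
            self.means[arm] = 0.0
            self.detectors[arm].reset()

Swapping the CUSUM statistic for the Page-Hinkley test (track the running mean and the cumulative deviations from it, and alarm when the gap exceeds a threshold) would give a PHT-UCB variant in the same framework. The forced exploration with probability alpha matters because, without it, an arm whose index falls behind stops being sampled and a change in its reward distribution could never be detected.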

Related research

02/10/2023 · Piecewise-Stationary Multi-Objective Multi-Armed Bandit with Application to Joint Communications and Sensing
We study a multi-objective multi-armed bandit problem in a dynamic envir...

02/11/2018 · Nearly Optimal Adaptive Procedure for Piecewise-Stationary Bandit: a Change-Point Detection Approach
Multi-armed bandit (MAB) is a class of online learning problems where a ...

09/11/2019 · Practical Calculation of Gittins Indices for Multi-armed Bandits
Gittins indices provide an optimal solution to the classical multi-armed...

09/06/2020 · A Change-Detection Based Thompson Sampling Framework for Non-Stationary Bandits
We consider a non-stationary two-armed bandit framework and propose a ch...

10/04/2018 · Adaptive Policies for Perimeter Surveillance Problems
Maximising the detection of intrusions is a fundamental and often critic...

07/30/2021 · Adaptively Optimize Content Recommendation Using Multi Armed Bandit Algorithms in E-commerce
E-commerce sites strive to provide users the most timely relevant inform...

06/08/2023 · Decentralized Randomly Distributed Multi-agent Multi-armed Bandit with Heterogeneous Rewards
We study a decentralized multi-agent multi-armed bandit problem in which...
