Kolmogorov-Smirnov Test-Based Actively-Adaptive Thompson Sampling for Non-Stationary Bandits

05/30/2021
by Gourab Ghatak, et al.

We consider the non-stationary multi-armed bandit (MAB) framework and propose TS-KS, a Thompson Sampling (TS) algorithm based on the Kolmogorov-Smirnov (KS) test that actively detects change points and resets the TS parameters once a change is detected. In particular, for the two-armed bandit case, we derive bounds on the number of samples of the reward distribution needed to detect a change once it occurs. Consequently, we show that the proposed algorithm has sub-linear regret. In contrast to existing works, our algorithm detects a change in the underlying reward distribution even when the mean reward remains the same. Finally, to test the efficacy of the proposed algorithm, we employ it in two case studies: i) a task-offloading scenario in wireless edge computing, and ii) portfolio optimization. Our results show that TS-KS outperforms not only the static TS algorithm but also other bandit algorithms designed for non-stationary environments. Moreover, the performance of TS-KS is on par with state-of-the-art forecasting algorithms such as Facebook-PROPHET and ARIMA.
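
The detect-and-reset idea described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation. The class name `TSKSSketch`, the window size, the significance level `alpha`, the Gaussian-style posterior draw, and the choice to reset every arm on detection are all assumptions made for illustration. The detection threshold uses the standard large-sample critical value of the two-sample KS test, which is only reasonable for windows of a few dozen samples or more.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every sample <= x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_change_detected(old, new, alpha=0.01):
    """Reject 'same distribution' using the asymptotic critical value."""
    n, m = len(old), len(new)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return ks_statistic(old, new) > c_alpha * math.sqrt((n + m) / (n * m))


class TSKSSketch:
    """Hypothetical Thompson Sampling loop with KS-based resets (illustrative)."""

    def __init__(self, n_arms, window=50, alpha=0.01):
        self.n_arms, self.window, self.alpha = n_arms, window, alpha
        self.resets = 0
        self._reset()

    def _reset(self):
        # Assumption: forget the history of every arm when a change is flagged.
        self.history = [[] for _ in range(self.n_arms)]

    def select_arm(self):
        # Assumption: Gaussian-style posterior draw around the sample mean,
        # with spread shrinking as 1 / sqrt(n + 1).
        def draw(h):
            n = len(h)
            mean = sum(h) / n if n else 0.0
            return random.gauss(mean, 1.0 / math.sqrt(n + 1))

        samples = [draw(h) for h in self.history]
        return max(range(self.n_arms), key=samples.__getitem__)

    def update(self, arm, reward):
        h = self.history[arm]
        h.append(reward)
        w = self.window
        # Compare the first w rewards since the last reset with the latest w.
        if len(h) >= 2 * w and ks_change_detected(h[:w], h[-w:], self.alpha):
            self.resets += 1
            self._reset()
```

Because the KS test compares whole empirical distributions rather than means, `ks_change_detected` can flag two reward samples that share the same mean but differ in spread, which is the property the abstract highlights. Note that re-running the test after every reward, as this sketch does, inflates the false-alarm rate relative to `alpha`; a practical version would need to account for that.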

research
09/06/2020

A Change-Detection Based Thompson Sampling Framework for Non-Stationary Bandits

We consider a non-stationary two-armed bandit framework and propose a ch...
research
05/20/2022

Actively Tracking the Optimal Arm in Non-Stationary Environments with Mandatory Probing

We study a novel multi-armed bandit (MAB) setting which mandates the age...
research
07/31/2017

Taming Non-stationary Bandits: A Bayesian Approach

We consider the multi armed bandit problem in non-stationary environment...
research
04/22/2020

Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D

In evolutionary computation, different reproduction operators have vario...
research
02/05/2019

The Generalized Likelihood Ratio Test meets klUCB: an Improved Algorithm for Piece-Wise Non-Stationary Bandits

We propose a new algorithm for the piece-wise non-stationary bandit pro...
research
08/20/2019

How to gamble with non-stationary X-armed bandits and have no regrets

In X-armed bandit problem an agent sequentially interacts with environme...
research
06/22/2020

An Online Algorithm for Computation Offloading in Non-Stationary Environments

We consider the latency minimization problem in a task-offloading scenar...
