A Near-Optimal Change-Detection Based Algorithm for Piecewise-Stationary Combinatorial Semi-Bandits

08/27/2019
by   Huozhi Zhou, et al.
0

We investigate the piecewise-stationary combinatorial semi-bandit problem. Compared to the original combinatorial semi-bandit problem, our setting assumes the reward distributions of base arms may change in a piecewise-stationary manner at unknown time steps. We propose an algorithm, GLR-CUCB, which incorporates an efficient combinatorial semi-bandit algorithm, CUCB, with an almost parameter-free change-point detector, the Generalized Likelihood Ratio Test (GLRT). Our analysis shows that the regret of GLR-CUCB is upper bounded by O(√(NKTT)), where N is the number of piecewise-stationary segments, K is the number of base arms, and T is the number of time steps. As a complement, we also derive a nearly matching regret lower bound on the order of Ω(√(NKT)), for both piecewise-stationary multi-armed bandits and combinatorial semi-bandits, using information-theoretic techniques and judiciously constructed piecewise-stationary bandit instances. Our lower bound is tighter than the best available regret lower bound, which is Ω(√(T)). Numerical experiments on both synthetic and real-world datasets demonstrate the superiority of GLR-CUCB compared to other state-of-the-art algorithms.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/12/2019

Be Aware of Non-Stationarity: Nearly Optimal Algorithms for Piecewise-Stationary Cascading Bandits

Cascading bandit (CB) is a variant of both the multi-armed bandit (MAB) ...
research
02/11/2018

Nearly Optimal Adaptive Procedure for Piecewise-Stationary Bandit: a Change-Point Detection Approach

Multi-armed bandit (MAB) is a class of online learning problems where a ...
research
06/09/2023

Distributed Consensus Algorithm for Decision-Making in Multi-agent Multi-armed Bandit

We study a structured multi-agent multi-armed bandit (MAMAB) problem in ...
research
07/26/2023

Piecewise-Stationary Combinatorial Semi-Bandit with Causally Related Rewards

We study the piecewise stationary combinatorial semi-bandit problem with...
research
02/05/2019

The Generalized Likelihood Ratio Test meets klUCB: an Improved Algorithm for Piece-Wise Non-Stationary Bandits

We propose a new algorithm for the piece-wise non-stationary bandit pro...
research
05/30/2019

Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits

We consider the setup of stochastic multi-armed bandits in the case when...
research
05/27/2022

Safety Aware Changepoint Detection for Piecewise i.i.d. Bandits

In this paper, we consider the setting of piecewise i.i.d. bandits under...

Please sign up or login with your details

Forgot password? Click here to reset