An Optimization-based Algorithm for Non-stationary Kernel Bandits without Prior Knowledge

05/29/2022
by   Kihyuk Hong, et al.

We propose an algorithm for non-stationary kernel bandits that does not require prior knowledge of the degree of non-stationarity. The algorithm follows randomized strategies obtained by solving optimization problems that balance exploration and exploitation. It adapts to non-stationarity by restarting when a change in the reward function is detected. Our algorithm enjoys a tighter dynamic regret bound than previous work on the non-stationary kernel bandit setting. Moreover, when applied to the non-stationary linear bandit setting by using a linear kernel, our algorithm is nearly minimax optimal, solving an open problem in the non-stationary linear bandit literature. We extend our algorithm to use a neural network that dynamically adapts the feature mapping to observed data. We prove a dynamic regret bound for the extension using neural tangent kernel theory. We demonstrate empirically that both our algorithm and the extension can adapt to varying degrees of non-stationarity.
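The restart-on-detection idea can be illustrated with a toy sketch. This is not the algorithm from the paper: it assumes a finite set of arms with no kernel, uses a softmax over empirical means as a stand-in for the optimization-derived randomized strategy, and uses a simple windowed mean comparison as a stand-in for the change detection test. The names `softmax_strategy` and `RestartingBandit` and the parameters `window`, `threshold`, and `temperature` are all hypothetical.

```python
import math
import random
from collections import deque


def softmax_strategy(means, temperature):
    """Return a probability vector that favors arms with higher estimated means."""
    top = max(means)  # subtract the max for numerical stability
    weights = [math.exp((m - top) / temperature) for m in means]
    total = sum(weights)
    return [w / total for w in weights]


class RestartingBandit:
    """Hypothetical finite-armed learner that restarts when a reward shift is detected."""

    def __init__(self, n_arms, window=30, threshold=0.5, temperature=0.1, seed=0):
        self.n_arms = n_arms
        self.window = window            # number of recent rewards compared per arm
        self.threshold = threshold      # assumed detection threshold on the mean gap
        self.temperature = temperature  # lower values make the strategy greedier
        self.rng = random.Random(seed)
        self.restarts = 0
        self._reset()

    def _reset(self):
        # Discard all statistics gathered since the last restart.
        self.counts = [0] * self.n_arms
        self.sums = [0.0] * self.n_arms
        self.recent = [deque(maxlen=self.window) for _ in range(self.n_arms)]

    def select(self):
        # Play every arm once after a restart, then sample from the softmax.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        means = [s / c for s, c in zip(self.sums, self.counts)]
        probs = softmax_strategy(means, self.temperature)
        return self.rng.choices(range(self.n_arms), weights=probs)[0]

    def _change_detected(self, arm):
        # Compare the last `window` rewards of this arm with its older rewards.
        count = self.counts[arm]
        if count < 2 * self.window:
            return False
        recent_sum = sum(self.recent[arm])
        recent_mean = recent_sum / self.window
        older_mean = (self.sums[arm] - recent_sum) / (count - self.window)
        return abs(recent_mean - older_mean) > self.threshold

    def update(self, arm, reward):
        """Record a reward; return True if this observation triggered a restart."""
        self.counts[arm] += 1
        self.sums[arm] += reward
        self.recent[arm].append(reward)
        if self._change_detected(arm):
            self._reset()
            self.restarts += 1
            return True
        return False
```

A typical loop would call `arm = agent.select()`, observe a reward from the environment, and then call `agent.update(arm, reward)`. One limitation of this simplification is that a shift in a rarely played arm may go unnoticed, since detection only uses rewards from arms the strategy actually samples.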

