Non-stationary Bandits with Knapsacks

05/25/2022
by Shang Liu, et al.

In this paper, we study the bandits with knapsacks (BwK) problem in a non-stationary environment. The BwK problem generalizes the multi-armed bandit (MAB) problem by modeling the resource consumption associated with playing each arm. At each time step, the decision maker chooses an arm to play, receives a reward, and consumes a certain amount of each of multiple resource types. The objective is to maximize the cumulative reward over a finite horizon subject to knapsack constraints on the resources. Existing works study the BwK problem in either a stochastic or an adversarial environment. Our paper considers a non-stationary environment that continuously interpolates between these two extremes. We first show that, because of the constraints, the traditional notion of variation budget is insufficient to characterize the non-stationarity of the BwK problem when the goal is sublinear regret, and we then propose a new global non-stationarity measure. We employ both non-stationarity measures to derive upper and lower bounds for the problem. Our results are based on a primal-dual analysis of the underlying linear programs and highlight the interplay between the constraints and the non-stationarity. Finally, we extend the non-stationarity measure to the problem of online convex optimization with constraints and obtain new regret bounds accordingly.
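The interaction protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the arm-selection rule is a uniform-random placeholder, rewards and consumptions are assumed to be Bernoulli, and the names `simulate_bwk` and `variation_budget` are hypothetical. The variation budget below follows one common definition from the non-stationary bandit literature (the summed largest per-step change in mean rewards), which may differ from the exact form used in the paper.

```python
import random


def simulate_bwk(mean_rewards, mean_costs, budgets, horizon, seed=0):
    """Play arms uniformly at random until the horizon or a budget runs out.

    mean_rewards[t][i]  : mean reward of arm i at time t (may drift with t).
    mean_costs[t][i][j] : mean consumption of resource j by arm i at time t.
    budgets[j]          : initial budget of resource j.
    Returns (total reward, number of rounds actually played).
    """
    rng = random.Random(seed)
    remaining = list(budgets)
    total_reward = 0.0
    for t in range(horizon):
        # Placeholder policy (an assumption): pick an arm uniformly at random.
        arm = rng.randrange(len(mean_rewards[t]))
        # Bernoulli consumption of each resource type (an assumption).
        costs = [1.0 if rng.random() < p else 0.0 for p in mean_costs[t][arm]]
        # Knapsack constraint: stop once any resource would be overdrawn.
        if any(c > r for c, r in zip(costs, remaining)):
            return total_reward, t
        # Bernoulli reward (an assumption).
        total_reward += 1.0 if rng.random() < mean_rewards[t][arm] else 0.0
        remaining = [r - c for r, c in zip(remaining, costs)]
    return total_reward, horizon


def variation_budget(mean_rewards):
    """Sum over time of the largest per-arm change in mean reward."""
    return sum(
        max(abs(b - a) for a, b in zip(prev, curr))
        for prev, curr in zip(mean_rewards, mean_rewards[1:])
    )
```

For example, with two arms, one resource, a budget of 5, and every arm consuming one unit per play, the loop stops after exactly five rounds regardless of the horizon, which shows how the constraints, and not only the horizon, limit the achievable reward.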

research
01/29/2023

Smooth Non-Stationary Bandits

In many applications of online decision making, the environment is non-s...
research
02/23/2023

A Definition of Non-Stationary Bandits

The subject of non-stationary bandit learning has attracted much recent ...
research
12/24/2020

A Regret bound for Non-stationary Multi-Armed Bandits with Fairness Constraints

The multi-armed bandits' framework is the most common platform to study ...
research
07/20/2013

Non-stationary Stochastic Optimization

We consider a non-stationary variant of a sequential stochastic optimiza...
research
07/10/2023

Online Ad Procurement in Non-stationary Autobidding Worlds

Today's online advertisers procure digital ad impressions through intera...
research
12/13/2020

Online Stochastic Optimization with Wasserstein Based Non-stationarity

We consider a general online stochastic optimization problem with multip...
research
06/24/2023

On Convex Data-Driven Inverse Optimal Control for Nonlinear, Non-stationary and Stochastic Systems

This paper is concerned with a finite-horizon inverse control problem, w...
