Approximately Stationary Bandits with Knapsacks

02/28/2023
by Giannis Fikioris, et al.

Bandits with Knapsacks (BwK), the generalization of Multi-Armed Bandits to settings with budget constraints, has received a lot of attention in recent years and has numerous applications, including dynamic pricing and repeated auctions. Previous work has focused on one of two extremes: Stochastic BwK, where each round's rewards and resource consumptions are sampled i.i.d. from a fixed distribution, and Adversarial BwK, where these values are picked by an adversary. The achievable guarantees in the two cases exhibit a massive gap: no-regret learning is achievable in Stochastic BwK, whereas in Adversarial BwK only competitive-ratio-style guarantees are achievable, with a competitive ratio that depends on the budget. What makes this gap so vast is that in Adversarial BwK the guarantees get worse in the typical case where the budget is more binding. While “best-of-both-worlds” algorithms are known (algorithms that provide the best achievable guarantee in both extreme cases), their guarantees degrade to the adversarial ones as soon as the environment is not fully stochastic. Our work aims to bridge this gap, offering guarantees for workloads that are not exactly stochastic but are also not worst-case. We define a condition, Approximately Stationary BwK, that parameterizes how close an instance is to stochastic or adversarial. Based on these parameters, we study the best competitive ratio attainable in BwK. We present two algorithms that are oblivious to the values of the parameters but guarantee competitive ratios that smoothly transition between the best possible guarantees in the two extreme cases, depending on the values of the parameters. Our guarantees offer a significant improvement over the adversarial guarantee, especially when the available budget is small. We also prove bounds on the achievable guarantee, showing that our results are approximately tight when the budget is small.
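To make the setting concrete, below is a minimal, purely illustrative sketch of the BwK interaction protocol in Python: in each round the learner pulls an arm, observes a reward and a resource consumption, and the process stops as soon as the budget is exhausted. The environment, arm statistics, and the naive ratio-greedy baseline are hypothetical assumptions for illustration only; this is not one of the algorithms analyzed in the paper.

```python
import numpy as np

# Illustrative sketch of the Bandits-with-Knapsacks (BwK) protocol.
# NOT the paper's algorithm: the arm statistics and the greedy baseline
# below are hypothetical, chosen only to show how rewards, resource
# consumption, and a hard budget interact over the time horizon.

rng = np.random.default_rng(0)

K = 3          # number of arms
T = 10_000     # time horizon
B = 500.0      # total budget of a single resource

# Hypothetical stochastic (i.i.d.) instance: each arm has a fixed mean
# reward and mean consumption; realizations are drawn fresh every round.
mean_reward = np.array([0.5, 0.7, 0.3])
mean_cost = np.array([0.2, 0.6, 0.1])

total_reward = 0.0
budget_left = B

for t in range(T):
    if budget_left <= 0:
        break  # the process stops once the budget is exhausted

    # Naive baseline: pull the arm with the best reward-per-cost ratio
    # according to the (here, known) means; a real BwK learner would have
    # to estimate these quantities from bandit feedback.
    arm = int(np.argmax(mean_reward / mean_cost))

    reward = rng.binomial(1, mean_reward[arm])  # realized reward in {0, 1}
    cost = rng.binomial(1, mean_cost[arm])      # realized consumption in {0, 1}

    total_reward += reward
    budget_left -= cost

print(f"collected reward {total_reward:.0f} with {budget_left:.0f} budget remaining")
```

In the stochastic extreme the means above stay fixed across rounds; in the adversarial extreme the per-round rewards and costs may be chosen arbitrarily, which is what drives the gap in guarantees discussed in the abstract.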

Related research

11/28/2018  Adversarial Bandits with Knapsacks
We consider Bandits with Knapsacks (henceforth, BwK), a general model fo...

10/06/2018  Learning to Optimize under Non-Stationarity
We introduce algorithms that achieve state-of-the-art dynamic regret bou...

02/22/2019  Better Algorithms for Stochastic Bandits with Adversarial Corruptions
We study the stochastic multi-armed bandits problem in the presence of a...

02/20/2023  A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Best-of-both-worlds algorithms for online learning which achieve near-op...

06/14/2023  Bandits with Replenishable Knapsacks: the Best of both Worlds
The bandits with knapsack (BwK) framework models online decision-making ...

03/25/2018  Stochastic bandits robust to adversarial corruptions
We introduce a new model of stochastic bandits with adversarial corrupti...

08/18/2023  Greedy-Based Online Fair Allocation with Adversarial Input: Enabling Best-of-Many-Worlds Guarantees
We study an online allocation problem with sequentially arriving items a...
