Advances in Bandits with Knapsacks

"Bandits with Knapsacks" () is a general model for multi-armed bandits under supply/budget constraints. While worst-case regret bounds for are well-understood, we focus on logarithmic instance-dependent regret bounds. We largely resolve them for one limited resource other than time, and for known, deterministic resource consumption. We also bound regret within a given round ("simple regret"). One crucial technique analyzes the sum of the confidence terms of the chosen arms. This technique allows to import the insights from prior work on bandits without resources, which leads to several extensions.

