Bandits with Replenishable Knapsacks: the Best of both Worlds

06/14/2023
by Martino Bernasconi, et al.

The bandits with knapsacks (BwK) framework models online decision-making problems in which an agent makes a sequence of decisions subject to resource consumption constraints. The traditional model assumes that each action consumes a non-negative amount of resources and that the process ends when the initial budgets are fully depleted. We study a natural generalization of the BwK framework which allows non-monotonic resource utilization, i.e., resources can be replenished by a positive amount. We propose a best-of-both-worlds primal-dual template that can handle any online learning problem with replenishment for which a suitable primal regret minimizer exists. In particular, we provide the first positive results for the case of adversarial inputs by showing that our framework guarantees a constant competitive ratio α when the budget B satisfies B = Ω(T), where T is the time horizon, or when the possible per-round replenishment is a positive constant. Moreover, under a stochastic input model, our algorithm yields an instance-independent Õ(T^{1/2}) regret bound, which complements existing instance-dependent bounds for the same setting. Finally, we provide applications of our framework to some economic problems of practical relevance.
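To make the setting concrete, the following is a minimal sketch of a generic Lagrangian primal-dual loop for a single resource whose per-round cost may be negative (replenishment). It is not the algorithm from the paper. The EXP3-style primal learner, the gradient step on the dual variable, the cap `1/rho` on that variable, the zero-cost "void" action at index 0, and all function and parameter names are assumptions made for illustration.

```python
import math
import random


def make_instance(T, seed=0):
    """Hypothetical toy instance with three actions.

    Action 0: void (zero reward, zero cost).
    Action 1: random reward in [0, 1], random cost in [0, 1].
    Action 2: zero reward, cost -0.5 (replenishes the resource).
    """
    rng = random.Random(seed)
    rewards = [[0.0, rng.random(), 0.0] for _ in range(T)]
    costs = [[0.0, rng.random(), -0.5] for _ in range(T)]
    return rewards, costs


def primal_dual_sketch(rewards, costs, budget, eta_dual=0.05, seed=0):
    """Illustrative primal-dual loop; returns (total reward, minimum budget seen)."""
    rng = random.Random(seed)
    T, K = len(rewards), len(rewards[0])
    rho = budget / T                      # per-round budget target
    lam_max = 1.0 / rho                   # assumed cap on the dual variable
    bound = 1.0 + lam_max * (1.0 + rho)   # |Lagrangian payoff| <= bound
    eta_primal = math.sqrt(math.log(K) / (T * K))
    log_w = [0.0] * K                     # log-weights of the primal learner
    lam, remaining = 0.0, float(budget)
    total_reward, min_remaining = 0.0, float(budget)

    for t in range(T):
        # Primal step: sample an action from exponential weights.
        top = max(log_w)
        probs = [math.exp(w - top) for w in log_w]
        norm = sum(probs)
        probs = [p / norm for p in probs]
        a = rng.choices(range(K), weights=probs)[0]

        # Never overdraw: if the sampled action costs more than what is
        # left, play the void action (zero reward, zero cost) instead.
        forced_void = costs[t][a] > remaining
        r, c = (0.0, 0.0) if forced_void else (rewards[t][a], costs[t][a])
        remaining -= c
        total_reward += r
        min_remaining = min(min_remaining, remaining)

        # Bandit feedback for the primal learner: Lagrangian payoff,
        # rescaled to [0, 1] and importance-weighted. The update is
        # skipped on forced-void rounds (a simplification).
        if not forced_void:
            payoff = r + lam * (rho - c)
            scaled = (payoff + bound) / (2.0 * bound)
            log_w[a] += eta_primal * scaled / probs[a]

        # Dual step: raise the price when consumption exceeds rho,
        # lower it otherwise, and project onto [0, lam_max].
        lam = min(lam_max, max(0.0, lam + eta_dual * (c - rho)))

    return total_reward, min_remaining
```

For example, `primal_dual_sketch(*make_instance(200), budget=20)` runs the loop for 200 rounds. The fallback to the void action is what keeps the remaining budget non-negative in this sketch; the dual variable only steers the primal learner toward spending roughly `rho` per round on average.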


research
02/28/2022

Online Learning with Knapsacks: the Best of Both Worlds

We study online learning problems in which a decision maker wants to max...
research
09/24/2022

Non-monotonic Resource Utilization in the Bandits with Knapsacks Problem

Bandits with knapsacks (BwK) is an influential model of sequential decis...
research
02/01/2020

Advances in Bandits with Knapsacks

"Bandits with Knapsacks" is a general model for multi-armed bandits u...
research
02/10/2021

An Efficient Pessimistic-Optimistic Algorithm for Constrained Linear Bandits

This paper considers stochastic linear bandits with general constraints....
research
10/11/2018

Inventory Balancing with Online Learning

We study a general problem of allocating limited resources to heterogene...
research
02/28/2023

Approximately Stationary Bandits with Knapsacks

Bandits with Knapsacks (BwK), the generalization of the Multi-Armed Band...
research
06/01/2023

Last Switch Dependent Bandits with Monotone Payoff Functions

In a recent work, Laforgue et al. introduce the model of last switch dep...
