Online Learning with Knapsacks: the Best of Both Worlds

02/28/2022
by Matteo Castiglioni, et al.

We study online learning problems in which a decision maker wants to maximize their expected reward without violating a finite set of m resource constraints. By casting the learning process over a suitably defined space of strategy mixtures, we recover strong duality on a Lagrangian relaxation of the underlying optimization problem, even for general settings with non-convex reward and resource-consumption functions. We then provide the first best-of-both-worlds framework for this setting, with no-regret guarantees under both stochastic and adversarial inputs. In the stochastic case, our framework yields the same regret guarantees as prior work. In the adversarial case, when budgets grow at least linearly in the time horizon, it provides a constant competitive ratio, which improves over the O(m log T) competitive ratio of Immorlica et al. (2019). Moreover, our framework allows the decision maker to handle non-convex reward and cost functions. We provide two game-theoretic applications of our framework as further evidence of its flexibility.
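
The Lagrangian relaxation mentioned in the abstract can be sketched as below. The notation is assumed for illustration and does not appear in the abstract: \xi is a strategy mixture (a probability distribution over a strategy set X), f is the reward function, c_i is the consumption of resource i, \rho is a hypothetical per-round budget, and \lambda \ge 0 are the dual variables.

```latex
% Sketch under assumed notation; not the authors' exact formulation.
% Lagrangian over strategy mixtures \xi and dual variables \lambda:
\[
  \mathcal{L}(\xi, \lambda)
  = \mathbb{E}_{x \sim \xi}\bigl[f(x)\bigr]
  + \sum_{i=1}^{m} \lambda_i \Bigl(\rho - \mathbb{E}_{x \sim \xi}\bigl[c_i(x)\bigr]\Bigr)
\]
% Strong duality, as claimed in the abstract, would then read:
\[
  \sup_{\xi} \, \inf_{\lambda \ge 0} \, \mathcal{L}(\xi, \lambda)
  = \inf_{\lambda \ge 0} \, \sup_{\xi} \, \mathcal{L}(\xi, \lambda)
\]
```

Because expectations are linear in the distribution \xi, this Lagrangian is linear in \xi and in \lambda even when f and c_i are non-convex in x, which is consistent with the abstract's remark that optimizing over mixtures is what makes duality recoverable in the non-convex case.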


