A Unifying Framework for Online Optimization with Long-Term Constraints

09/15/2022
by Matteo Castiglioni, et al.

We study online learning problems in which a decision maker has to take a sequence of decisions subject to m long-term constraints. The goal of the decision maker is to maximize their total reward while keeping the cumulative constraint violation across the T rounds small. We present the first best-of-both-worlds algorithm for this general class of problems, with no-regret guarantees both when rewards and constraints are drawn from an unknown stochastic model and when they are selected at each round by an adversary. Our algorithm is the first to provide guarantees in the adversarial setting with respect to the optimal fixed strategy that satisfies the long-term constraints. In particular, it guarantees a ρ/(1+ρ) fraction of the optimal reward and sublinear regret, where ρ is a feasibility parameter related to the existence of strictly feasible solutions. Our framework employs traditional regret minimizers as black-box components. Therefore, by instantiating it with an appropriate choice of regret minimizers, it can handle both the full-feedback and the bandit-feedback settings. Moreover, it allows the decision maker to seamlessly handle scenarios with non-convex rewards and constraints. We show how our framework can be applied to budget-management mechanisms for repeated auctions in order to guarantee long-term constraints that are not packing constraints (e.g., ROI constraints).
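
To make the "regret minimizers as black-box components" idea concrete, the following is a minimal sketch of a generic Lagrangian primal-dual loop, not the authors' algorithm, whose details are not given in this abstract. It assumes a finite action set, full feedback, rewards in [0, 1], and constraint costs in [-1, 1] where a cost of at most 0 means the constraint is satisfied. The names `Hedge`, `ProjectedAscent`, and `run_primal_dual`, the learning rates, and the choice to cap each multiplier at 1/ρ are all illustrative assumptions.

```python
import math
import random


class Hedge:
    """Full-feedback regret minimizer over a finite action set (primal player)."""

    def __init__(self, n_actions, lr):
        self.lr = lr
        self.log_w = [0.0] * n_actions

    def distribution(self):
        top = max(self.log_w)  # subtract the max for numerical stability
        w = [math.exp(x - top) for x in self.log_w]
        total = sum(w)
        return [x / total for x in w]

    def update(self, payoffs):
        self.log_w = [lw + self.lr * p for lw, p in zip(self.log_w, payoffs)]


class ProjectedAscent:
    """Gradient ascent on the multipliers, projected onto [0, cap]^m (dual player)."""

    def __init__(self, m, lr, cap):
        self.lr, self.cap = lr, cap
        self.lam = [0.0] * m

    def update(self, costs):
        # Multipliers grow on violated constraints (cost > 0) and shrink otherwise.
        self.lam = [min(self.cap, max(0.0, l + self.lr * c))
                    for l, c in zip(self.lam, costs)]


def run_primal_dual(rewards, costs, rho, seed=0):
    """rewards[t][a] in [0, 1]; costs[t][a][i] in [-1, 1], satisfied when <= 0."""
    rng = random.Random(seed)
    T, K, m = len(rewards), len(rewards[0]), len(costs[0][0])
    cap = 1.0 / rho  # assumption: each multiplier is capped at 1/rho
    primal = Hedge(K, lr=math.sqrt(math.log(K) / T) / (1.0 + m * cap))
    dual = ProjectedAscent(m, lr=cap / math.sqrt(T), cap=cap)
    total_reward, violation = 0.0, [0.0] * m
    for t in range(T):
        x = primal.distribution()
        a = rng.choices(range(K), weights=x)[0]
        total_reward += rewards[t][a]
        violation = [v + c for v, c in zip(violation, costs[t][a])]
        # Lagrangian payoff of every action under the current multipliers.
        lagrangian = [rewards[t][k] - sum(l * c for l, c in zip(dual.lam, costs[t][k]))
                      for k in range(K)]
        primal.update(lagrangian)
        dual.update(costs[t][a])
    return total_reward, violation, dual.lam


if __name__ == "__main__":
    T = 2000
    # Action 0: high reward but violates the constraint; action 1: strictly feasible.
    rewards = [[1.0, 0.5]] * T
    costs = [[[0.5], [-0.5]]] * T
    print(run_primal_dual(rewards, costs, rho=0.5))
```

In this toy instance the second action is strictly feasible with margin 0.5, so ρ = 0.5 is a natural choice, for which the ρ/(1+ρ) fraction from the abstract evaluates to 1/3. Either player could in principle be swapped for another regret minimizer (for example, a bandit-feedback one for the primal player), which is the modularity the abstract describes.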

research
02/28/2022

Online Learning with Knapsacks: the Best of Both Worlds

We study online learning problems in which a decision maker wants to max...
research
02/17/2016

Online optimization and regret guarantees for non-additive long-term constraints

We consider online optimization in the 1-lookahead setting, where the ob...
research
04/27/2023

A Best-of-Both-Worlds Algorithm for Constrained MDPs with Long-Term Constraints

We study online learning in episodic constrained Markov decision process...
research
11/08/2022

A Simple Algorithm for Online Decision Making

Motivated by recent progress on online linear programming (OLP), we stud...
research
07/10/2023

Online Ad Procurement in Non-stationary Autobidding Worlds

Today's online advertisers procure digital ad impressions through intera...
research
11/27/2022

Rectified Pessimistic-Optimistic Learning for Stochastic Continuum-armed Bandit with Constraints

This paper studies the problem of stochastic continuum-armed bandit with...
research
02/02/2023

Constrained Online Two-stage Stochastic Optimization: New Algorithms via Adversarial Learning

We consider an online two-stage stochastic optimization with long-term c...
