Online Convex Optimization with Unbounded Memory

10/18/2022
by Raunak Kumar, et al.

Online convex optimization (OCO) is a widely used framework in online learning. In each round, the learner chooses a decision in some convex set and an adversary chooses a convex loss function, and then the learner suffers the loss associated with their chosen decision. However, in many of the motivating applications the loss of the learner depends not only on the current decision but also on the entire history of decisions made so far. The OCO framework and existing generalizations thereof fail to capture this. In this work we introduce a generalization of the OCO framework, “Online Convex Optimization with Unbounded Memory”, that captures long-term dependence on past decisions. We introduce the notion of p-effective memory capacity, H_p, which quantifies the maximum influence of past decisions on current losses. We prove an O(√(H_1 T)) policy regret bound and, under mild additional assumptions, a stronger O(√(H_p T)) policy regret bound. These bounds are optimal in terms of their dependence on the time horizon T. We demonstrate the broad applicability of our framework by using it to derive regret bounds, and to simplify existing regret bound derivations, for a variety of online learning problems, including an online variant of performative prediction and online linear control.
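The protocol described in the abstract is compact enough to sketch in code. Below is a minimal, illustrative Python simulation, not the paper's algorithm: a toy adversary whose round-t loss depends on the entire history of decisions through a geometrically discounted sum, a learner running plain online gradient descent on the unary losses f_t(x, …, x) (a standard baseline for losses with memory), and policy regret measured against the best fixed decision in hindsight. The discounted linear loss family and all helper names (project, play, make_ogd) are our own assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, lam, eta = 200, 5, 0.5, 0.05  # horizon, dimension, memory decay, step size

def project(x, radius=1.0):
    """Euclidean projection onto the l2 ball, our decision set."""
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

# Toy adversary: the round-t loss depends on the entire history of
# decisions through a geometrically discounted sum,
#   f_t(x_1, ..., x_t) = <theta_t, sum_{s<=t} lam^(t-s) x_s>.
thetas = rng.normal(size=(T, d))

def play(policy):
    """Run the protocol: the learner commits to x_t, then suffers the
    history-dependent loss f_t(x_1, ..., x_t)."""
    losses, state = [], np.zeros(d)  # state = discounted sum of past decisions
    for t in range(T):
        x = policy(t)
        state = lam * state + x
        losses.append(thetas[t] @ state)
    return sum(losses)

# If a fixed x is played every round, f_t collapses to <c_t * theta_t, x>
# with c_t = sum_{s<=t} lam^(t-s) = (1 - lam^(t+1)) / (1 - lam).
coeffs = (1 - lam ** np.arange(1, T + 1)) / (1 - lam)

def make_ogd():
    """Online gradient descent on the unary losses f_t(x, ..., x)."""
    x = np.zeros(d)
    def policy(t):
        nonlocal x
        if t > 0:  # gradient of f_{t-1}(x, ..., x) is c_{t-1} * theta_{t-1}
            x = project(x - eta * coeffs[t - 1] * thetas[t - 1])
        return x.copy()
    return policy

learner_loss = play(make_ogd())

# Policy regret: compare against the best fixed decision in hindsight,
# which for these linear losses is -v / ||v|| with v = sum_t c_t theta_t.
v = coeffs @ thetas
best_fixed = -v / np.linalg.norm(v)
comparator_loss = play(lambda t: best_fixed)
print(f"policy regret over T={T} rounds: {learner_loss - comparator_loss:.2f}")
```

In this toy loss family the decay factor lam controls how quickly past decisions stop influencing current losses; loosely, that is the kind of decay the paper's p-effective memory capacity H_p quantifies in general.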


