Efficient Online Learning with Memory via Frank-Wolfe Optimization: Algorithms with Bounded Dynamic Regret and Applications to Control

01/02/2023
by Hongyu Zhou, et al.
Projection operations are a typical computational bottleneck in online learning. In this paper, we enable projection-free online learning within the framework of Online Convex Optimization with Memory (OCO-M). OCO-M captures how the history of decisions affects the current outcome by allowing the online learning loss functions to depend on both current and past decisions. In particular, we introduce the first projection-free meta-base learning algorithm with memory that minimizes dynamic regret, i.e., the suboptimality against any sequence of time-varying decisions. We are motivated by artificial intelligence applications where autonomous agents need to adapt to time-varying environments in real time, accounting for how past decisions affect the present. Examples of such applications are online control of dynamical systems, statistical arbitrage, and time series prediction. The algorithm builds on the Online Frank-Wolfe (OFW) and Hedge algorithms. We demonstrate how our algorithm can be applied to the online control of linear time-varying systems in the presence of unpredictable process noise. To this end, we develop the first controller with memory and bounded dynamic regret against any optimal time-varying linear feedback control policy. We validate our algorithm in simulated scenarios of online control of linear time-invariant systems.
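
To make the two building blocks named in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It assumes an L1-ball feasible set, a memoryless quadratic tracking loss, three base learners with hand-picked step sizes, and a fixed Hedge learning rate; the function names (`lmo_l1_ball`, `ofw_step`, `hedge_update`, `run_demo`) are all hypothetical. It only illustrates how a Frank-Wolfe step stays feasible without a projection and how Hedge reweights base learners. The memory term of OCO-M is deliberately omitted.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle: argmin of <grad, v> over the L1 ball."""
    v = np.zeros_like(grad, dtype=float)
    i = int(np.argmax(np.abs(grad)))       # coordinate with largest |gradient|
    v[i] = -radius * np.sign(grad[i])      # move to the opposing vertex
    return v

def ofw_step(x, grad, step, radius=1.0):
    """One Frank-Wolfe step: a convex combination of x and an LMO vertex.
    If x is feasible and 0 <= step <= 1, the result is feasible, so no
    projection is needed."""
    return (1.0 - step) * x + step * lmo_l1_ball(grad, radius)

def hedge_update(weights, losses, eta):
    """Hedge (multiplicative weights) update over the base learners."""
    w = weights * np.exp(-eta * (losses - losses.min()))  # shift for stability
    return w / w.sum()

def run_demo(T=200, radius=1.0, eta=1.0):
    steps = np.array([0.05, 0.2, 0.8])     # assumed base-learner step sizes
    xs = np.zeros((len(steps), 3))         # one decision per base learner
    weights = np.full(len(steps), 1.0 / len(steps))
    x_meta = weights @ xs
    for t in range(T):
        x_meta = weights @ xs              # meta decision: weighted average
        # Assumed drifting target that stays inside the L1 ball.
        theta = 0.5 * np.array([np.sin(0.05 * t), np.cos(0.05 * t), 0.0])
        losses = 0.5 * np.sum((xs - theta) ** 2, axis=1)
        grads = xs - theta                 # gradient of the quadratic loss
        weights = hedge_update(weights, losses, eta)
        xs = np.array([ofw_step(x, g, s, radius)
                       for x, g, s in zip(xs, grads, steps)])
    return x_meta, weights

if __name__ == "__main__":
    x, w = run_demo()
    print("meta decision:", x, "weights:", w)
```

Because the meta decision is a convex combination of feasible base decisions, it also remains inside the feasible set, which is why the combined scheme needs no projection in this sketch.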

research
02/02/2021

Strongly Adaptive OCO with Memory

Recent progress in online control has popularized online learning with m...
research
11/14/2022

Implications of Regret on Stability of Linear Dynamical Systems

The setting of an agent making decisions under uncertainty and under dyn...
research
04/29/2021

Stable Online Control of Linear Time-Varying Systems

Linear time-varying (LTV) systems are widely used for modeling real-worl...
research
06/06/2022

Learning to Control under Time-Varying Environment

This paper investigates the problem of regret minimization in linear tim...
research
01/26/2023

Smoothed Online Learning for Prediction in Piecewise Affine Systems

The problem of piecewise affine (PWA) regression and planning is of foun...
research
07/23/2013

Online Optimization in Dynamic Environments

High-velocity streams of high-dimensional data pose significant "big dat...
research
10/17/2019

Optimization and Learning with Information Streams: Time-varying Algorithms and Applications

There is a growing cross-disciplinary effort in the broad domain of opti...