Follow the Clairvoyant: an Imitation Learning Approach to Optimal Control

11/14/2022
by Andrea Martin, et al.

We consider control of dynamical systems through the lens of competitive analysis. Most prior work in this area focuses on minimizing regret, that is, the loss relative to an ideal clairvoyant policy that has noncausal access to past, present, and future disturbances. Motivated by the observation that the optimal cost provides only coarse information about the ideal closed-loop behavior, we instead propose directly minimizing the tracking error relative to the optimal trajectories in hindsight, i.e., imitating the clairvoyant policy. By embracing a system level perspective, we present an efficient optimization-based approach for computing safe follow-the-clairvoyant (FTC) controllers. We prove that these controllers attain minimal regret if no constraints are imposed on the noncausal benchmark. In addition, we present numerical experiments showing that our policy retains the hallmark of competitive algorithms, namely interpolating between classical ℋ_2 and ℋ_∞ control laws, while consistently outperforming regret minimization methods in constrained scenarios thanks to its superior ability to chase the clairvoyant.
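
To make the difference between regret and tracking error concrete, the following is a minimal sketch on a toy problem. Everything in it is an assumption for illustration and is not taken from the paper: a scalar system x_{t+1} = a x_t + b u_t + w_t with x_0 = 0, unit quadratic weights on states and inputs, a static feedback gain k standing in for the causal policy, and the hypothetical helper names `stacked_maps`, `clairvoyant`, `rollout`, and `compare`. It does not implement the system level synthesis described above.

```python
import numpy as np

def stacked_maps(a, b, T):
    """Return F, G such that x = F u + G w for x = (x_1, ..., x_T), x_0 = 0."""
    G = np.zeros((T, T))
    for i in range(T):
        for j in range(i + 1):
            G[i, j] = a ** (i - j)  # effect of w_j (or b*u_j) on x_{i+1}
    return b * G, G

def cost(x, u):
    """Quadratic cost with unit weights (an assumption of this sketch)."""
    return float(x @ x + u @ u)

def clairvoyant(F, G, w):
    """Noncausal optimum: minimize ||F u + G w||^2 + ||u||^2 knowing all of w."""
    T = F.shape[1]
    u = -np.linalg.solve(F.T @ F + np.eye(T), F.T @ (G @ w))
    return F @ u + G @ w, u

def rollout(a, b, k, w):
    """Causal static feedback u_t = -k x_t, simulated step by step."""
    T = len(w)
    x = np.zeros(T + 1)
    u = np.zeros(T)
    for t in range(T):
        u[t] = -k * x[t]
        x[t + 1] = a * x[t] + b * u[t] + w[t]
    return x[1:], u

def compare(a, b, k, w):
    """Return (regret, tracking error) of the causal policy for disturbance w."""
    F, G = stacked_maps(a, b, len(w))
    x_c, u_c = clairvoyant(F, G, w)
    x_p, u_p = rollout(a, b, k, w)
    regret = cost(x_p, u_p) - cost(x_c, u_c)
    tracking = cost(x_p - x_c, u_p - u_c)
    return regret, tracking

if __name__ == "__main__":
    w = np.array([1.0, -0.5, 0.3, 0.0, 0.8])
    print(compare(0.9, 1.0, 0.5, w))
```

In this unconstrained toy setting with identical weights, the two quantities coincide for any input sequence, because the cost is quadratic in u and the clairvoyant input is its unconstrained minimizer. That is consistent with the claim above that FTC controllers attain minimal regret when the benchmark is unconstrained. Once constraints are imposed, the two objectives generally differ.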

research
06/22/2021

Regret-optimal Estimation and Control

We consider estimation and control in linear time-varying dynamical syst...
research
10/20/2020

Regret-optimal control in dynamic environments

We consider the control of linear time-varying dynamical systems from th...
research
11/21/2022

Best of Both Worlds in Online Control: Competitive Ratio and Policy Regret

We consider the fundamental problem of online control of a linear dynami...
research
01/28/2022

A Regret Minimization Approach to Multi-Agent Control

We study the problem of multi-agent control of a dynamical system with k...
research
11/06/2018

A Dynamic Regret Analysis and Adaptive Regularization Algorithm for On-Policy Robot Imitation Learning

On-policy imitation learning algorithms such as Dagger evolve a robot co...
research
07/29/2022

Improved Policy Optimization for Online Imitation Learning

We consider online imitation learning (OIL), where the task is to find a...
research
07/13/2016

Safe Policy Improvement by Minimizing Robust Baseline Regret

An important problem in sequential decision-making under uncertainty is ...
