A Regret Minimization Approach to Iterative Learning Control

by Naman Agarwal, et al.

We consider the setting of iterative learning control, i.e., model-based policy learning in the presence of uncertain, time-varying dynamics. In this setting, we propose a new performance metric, planning regret, which replaces the standard stochastic uncertainty assumptions with a worst-case regret guarantee. Building on recent advances in non-stochastic control, we design a new iterative algorithm for minimizing planning regret that is more robust to model mismatch and uncertainty. We provide theoretical and empirical evidence that the proposed algorithm outperforms existing methods on several benchmarks.
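The core loop the abstract describes, replanning a control sequence across episodes and judging performance against the best plan in hindsight, can be illustrated with a minimal gradient-based iterative learning control sketch. This is only an illustration under strong simplifying assumptions (scalar linear dynamics, quadratic cost, a hand-rolled gradient through a mismatched nominal model), not the paper's planning-regret algorithm; all names and constants here are invented for the example.

```python
import numpy as np

# Minimal iterative learning control (ILC) sketch: the true scalar system is
# x_{t+1} = A*x_t + B*u_t, but the planner only knows a mismatched nominal
# model A_hat. Each ILC iteration rolls out the current input sequence on the
# real system, then improves it with a gradient step computed from the
# nominal model. All constants below are illustrative, not from the paper.

T = 15                 # planning horizon
A, B = 0.7, 1.0        # true dynamics (unknown to the planner)
A_hat = 0.6            # planner's mismatched estimate of A
x_target = 1.0         # state we want to track
lam = 0.01             # input penalty weight

def rollout(u):
    """Run the input sequence on the TRUE system; return states and cost."""
    x, xs, cost = 0.0, [], 0.0
    for t in range(T):
        x = A * x + B * u[t]
        xs.append(x)
        cost += (x - x_target) ** 2 + lam * u[t] ** 2
    return np.array(xs), cost

def nominal_grad(u, xs):
    """Gradient of the cost w.r.t. u, computed with the NOMINAL model A_hat.

    u[t] influences x[s] (s >= t) through the sensitivity A_hat**(s-t) * B,
    so the chain rule sums the tracking errors it can still affect."""
    g = 2.0 * lam * u.copy()
    for t in range(T):
        for s in range(t, T):
            g[t] += 2.0 * (xs[s] - x_target) * (A_hat ** (s - t)) * B
    return g

u = np.zeros(T)        # start from the zero plan
costs = []
for _ in range(50):    # ILC iterations (episodes)
    xs, c = rollout(u)
    costs.append(c)
    u -= 0.02 * nominal_grad(u, xs)   # replan: gradient step on the inputs

# Despite the model mismatch, the realized rollout cost drops across
# iterations, which is the kind of behavior a small planning regret certifies.
```

Because the gradient comes from the nominal model, the plan converges near, but not exactly at, the true optimum; the gap between the realized cost and the best plan in hindsight is what a planning-regret bound would control.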



Related papers:

- Regret-optimal Estimation and Control
- Adaptive Regret for Control of Time-Varying Dynamics
- Learning to Control under Time-Varying Environment
- Safe Policy Improvement by Minimizing Robust Baseline Regret
- To Explore or Not to Explore: Regret-Based LTL Planning in Partially-Known Environments
- Scalable regret for learning to control network-coupled subsystems with unknown dynamics
- Differentiable Robust LQR Layers