Certainty Equivalent Control of LQR is Efficient

by Horia Mania, et al.

We study the performance of the certainty equivalent controller on the Linear Quadratic Regulator (LQR) with unknown transition dynamics. We show that the sub-optimality gap between the cost incurred by playing the certainty equivalent controller on the true system and the cost incurred by using the optimal LQR controller enjoys a fast statistical rate, scaling as the square of the parameter error. Our result improves upon recent work by Dean et al. (2017), who present an algorithm achieving a sub-optimality gap linear in the parameter error. A key part of our analysis relies on perturbation bounds for discrete Riccati equations. We provide two new perturbation bounds, one that expands on an existing result from Konstantinov et al. (1993), and another based on a new elementary proof strategy. Our results show that certainty equivalent control with ε-greedy exploration achieves Õ(√(T)) regret in the adaptive LQR setting, yielding the first guarantee of a computationally tractable algorithm that achieves nearly optimal regret for adaptive LQR.
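To make the quadratic scaling concrete, the following is a minimal numerical sketch rather than the authors' code. It assumes the standard discrete-time formulation x_{t+1} = A x_t + B u_t + w_t with identity noise covariance and cost matrices Q and R; the 3×3 system, the random perturbation direction (a stand-in for estimation error), and the helper names `lqr_gain`, `average_cost`, and `suboptimality_gap` are all illustrative assumptions that do not appear in the abstract. The certainty equivalent controller is obtained by solving the Riccati equation for the perturbed model, and its cost is then evaluated on the true system.

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

# Illustrative "true" system and costs (assumed for this sketch only).
A_TRUE = np.array([[1.01, 0.01, 0.00],
                   [0.01, 1.01, 0.01],
                   [0.00, 0.01, 1.01]])
B_TRUE = np.eye(3)
Q = np.eye(3)
R = np.eye(3)


def lqr_gain(A, B, Q, R):
    """Optimal static gain K (with u_t = K x_t) for the model (A, B)."""
    P = solve_discrete_are(A, B, Q, R)
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def average_cost(A, B, Q, R, K):
    """Steady-state cost of u_t = K x_t on the system (A, B), noise covariance I."""
    L = A + B @ K
    if np.max(np.abs(np.linalg.eigvals(L))) >= 1.0:
        return np.inf  # K does not stabilize the system
    # P_K solves P_K = L^T P_K L + Q + K^T R K.
    P_K = solve_discrete_lyapunov(L.T, Q + K.T @ R @ K)
    return float(np.trace(P_K))


def suboptimality_gap(eps, seed=0):
    """Cost gap of the certainty equivalent gain when the model error has size eps."""
    rng = np.random.default_rng(seed)
    dA = rng.standard_normal(A_TRUE.shape)
    dB = rng.standard_normal(B_TRUE.shape)
    dA /= np.linalg.norm(dA, 2)  # unit spectral norm, so eps sets the error size
    dB /= np.linalg.norm(dB, 2)
    # Certainty equivalence: design the controller as if the estimate were exact.
    K_hat = lqr_gain(A_TRUE + eps * dA, B_TRUE + eps * dB, Q, R)
    K_star = lqr_gain(A_TRUE, B_TRUE, Q, R)
    return (average_cost(A_TRUE, B_TRUE, Q, R, K_hat)
            - average_cost(A_TRUE, B_TRUE, Q, R, K_star))


if __name__ == "__main__":
    for eps in (0.02, 0.01, 0.005):
        print(f"eps = {eps:.3f}  gap = {suboptimality_gap(eps):.3e}")
```

If the gap scales as the square of the parameter error, halving `eps` should shrink the printed gap by roughly a factor of four. A linear rate would only halve it.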






