Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently

02/19/2020
by   Asaf Cassel, et al.
4

We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown. Recent results in this setting have demonstrated efficient learning algorithms with regret growing with the square root of the number of decision steps. We present new efficient algorithms that achieve, perhaps surprisingly, regret that scales only (poly)logarithmically with the number of steps in two scenarios: when only the state transition matrix A is unknown, and when only the state-action transition matrix B is unknown and the optimal policy satisfies a certain non-degeneracy condition. On the other hand, we give a lower bound that shows that when the latter condition is violated, square root regret is unavoidable.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/29/2021

Minimal Expected Regret in Linear Quadratic Control

We consider the problem of online learning in Linear Quadratic Control s...
research
02/17/2019

Learning Linear-Quadratic Regulators Efficiently with only √(T) Regret

We present the first computationally-efficient algorithm with O(√(T)) r...
research
11/18/2020

On Uninformative Optimal Policies in Adaptive LQR with Unknown B-Matrix

This paper presents local asymptotic minimax regret lower bounds for ada...
research
08/18/2021

Scalable regret for learning to control network-coupled subsystems with unknown dynamics

We consider the problem of controlling an unknown linear quadratic Gauss...
research
05/05/2019

Learning to Control in Metric Space with Optimal Regret

We study online reinforcement learning for finite-horizon deterministic ...
research
09/16/2021

Adaptive Control of Quadratic Costs in Linear Stochastic Differential Equations

We study a canonical problem in adaptive control; design and analysis of...
research
01/05/2022

Regret Lower Bounds for Learning Linear Quadratic Gaussian Systems

This paper presents local minimax regret lower bounds for adaptively con...

Please sign up or login with your details

Forgot password? Click here to reset