Rate-matching the regret lower-bound in the linear quadratic regulator with unknown dynamics

02/11/2022
by Feicheng Wang, et al.

The theory of reinforcement learning currently suffers from a mismatch between the field's empirical performance and the theoretical characterization of that performance, with consequences for, e.g., the understanding of sample efficiency, safety, and robustness. The linear quadratic regulator with unknown dynamics is a fundamental reinforcement learning setting with significant structure in its dynamics and cost function, yet even in this setting there is a gap between the best known regret lower bound of Ω_p(√T) and the best known upper bound of O_p(√T polylog(T)). The contribution of this paper is to close that gap by establishing a novel regret upper bound of O_p(√T). Our proof is constructive in that it analyzes the regret of a concrete algorithm, and it simultaneously establishes an estimation error bound on the dynamics of O_p(T^(-1/4)), which is also the first to match the rate of a known lower bound. The two keys to our improved proof technique are (1) more precise upper and lower bounds on the system Gram matrix and (2) a self-bounding argument for the expected estimation error of the optimal controller.
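For readers unfamiliar with the setting, the block below sketches the standard formulation of the linear quadratic regulator and of regret that the stated rates refer to. The symbols (A, B, Q, R, K, P, J^*, \mathcal{R}_T, the hatted estimates, and the choice of norm) follow the usual textbook conventions and are assumptions here; the abstract does not fix notation, and the paper's own definitions may differ in detail.

```latex
% Standard LQR setup (notation assumed, not taken from the abstract).
% Dynamics with unknown (A, B) and i.i.d. noise \varepsilon_t:
\[
  x_{t+1} = A x_t + B u_t + \varepsilon_t .
\]
% Per-step quadratic cost with known Q \succeq 0, R \succ 0:
\[
  c_t = x_t^\top Q x_t + u_t^\top R u_t .
\]
% With known dynamics, the optimal controller is linear, u_t = K x_t, where
\[
  K = -\left(R + B^\top P B\right)^{-1} B^\top P A ,
\]
% and P solves the discrete algebraic Riccati equation
\[
  P = Q + A^\top P A - A^\top P B \left(R + B^\top P B\right)^{-1} B^\top P A .
\]
% Regret compares the incurred cost with the optimal average cost J^*:
\[
  \mathcal{R}_T = \sum_{t=1}^{T} c_t - T J^{*} .
\]
% Rates stated in the abstract (the norm is an assumed choice):
\[
  \mathcal{R}_T = O_p\!\left(\sqrt{T}\right), \qquad
  \bigl\| [\hat{A}_T, \hat{B}_T] - [A, B] \bigr\| = O_p\!\left(T^{-1/4}\right).
\]
```

Under this reading, the regret upper bound matches the Ω_p(√T) lower bound with no polylogarithmic factor left over.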


