Robust exploration in linear quadratic reinforcement learning

06/04/2019
by Jack Umenberger, et al.

This paper concerns the problem of learning control policies for an unknown linear dynamical system so as to minimize a quadratic cost function. We present a method, based on convex optimization, that accomplishes this task robustly: i.e., we minimize the worst-case cost, accounting for the system uncertainty implied by the observed data. The method balances exploitation and exploration, exciting the system so as to reduce uncertainty in the model parameters to which the worst-case cost is most sensitive. Numerical simulations and application to a hardware-in-the-loop servo-mechanism demonstrate the approach, with appreciable performance and robustness gains over alternative methods observed in both settings.
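To make the worst-case objective concrete, here is a minimal, hypothetical sketch in Python. It approximates the posterior uncertainty over the dynamics with a finite set of sampled "scenario" models and directly searches for a state-feedback gain that minimizes the maximum infinite-horizon LQR cost across those scenarios. The double-integrator model, the noise scale, and the Nelder-Mead direct search are all illustrative assumptions; the paper instead bounds this objective with a convex program.

```python
# Hypothetical sketch: scenario-based worst-case LQR.
# Not the paper's convex-optimization method; the toy system and
# sampled-model approximation are assumptions for illustration only.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Nominal discrete-time model (double-integrator-like) and cost weights.
A_nom = np.array([[1.0, 0.1], [0.0, 1.0]])
B_nom = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)

# Scenario set standing in for posterior uncertainty in (A, B)
# given observed data (Gaussian perturbations assumed here).
scenarios = [
    (A_nom + 0.02 * rng.standard_normal((2, 2)),
     B_nom + 0.02 * rng.standard_normal((2, 1)))
    for _ in range(20)
]

def lqr_cost(K, A, B):
    """Infinite-horizon quadratic cost of u = K x on model (A, B)."""
    A_cl = A + B @ K
    if np.max(np.abs(np.linalg.eigvals(A_cl))) >= 1.0:
        return 1e9  # large penalty for destabilizing gains
    # P solves A_cl' P A_cl - P + (Q + K' R K) = 0; trace(P) is the
    # cost averaged over unit-covariance initial states.
    P = solve_discrete_lyapunov(A_cl.T, Q + K.T @ R @ K)
    return np.trace(P)

def worst_case(theta):
    """Worst-case cost of the gain across all sampled scenarios."""
    K = theta.reshape(1, 2)
    return max(lqr_cost(K, A, B) for A, B in scenarios)

# Naive direct search over K; the paper's contribution is a convex
# upper bound on this non-convex min-max problem.
res = minimize(worst_case, x0=np.array([-2.0, -4.0]), method="Nelder-Mead")
print("robust gain:", res.x.reshape(1, 2))
print("worst-case cost:", res.fun)
```

Note that this sketch captures only the robustness half of the abstract: minimizing the worst case over plausible models. It does not implement the exploration component, which excites the system to shrink uncertainty in the parameters that the worst-case cost is most sensitive to.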

Related research

Learning convex bounds for linear quadratic control policy synthesis (06/01/2018)
Learning to make decisions from observed data in dynamic environments re...

Differentiable Robust LQR Layers (06/10/2021)
This paper proposes a differentiable robust LQR layer for reinforcement ...

Worst-case sensitivity (10/21/2020)
We introduce the notion of Worst-Case Sensitivity, defined as the worst-...

Optimistic robust linear quadratic dual control (12/31/2019)
Recent work by Mania et al. has proved that certainty equivalent control...

Worst case tractability of linear problems in the presence of noise: linear information (03/28/2023)
We study the worst case tractability of multivariate linear problems def...

Learning Robust Options by Conditional Value at Risk Optimization (05/22/2019)
Options are generally learned by using an inaccurate environment model (...

Worst-Case Control and Learning Using Partial Observations Over an Infinite Time-Horizon (03/28/2023)
Safety-critical cyber-physical systems require control strategies whose ...
