Convergence Guarantees of Policy Optimization Methods for Markovian Jump Linear Systems

02/10/2020
by Joao Paulo Jansch-Porto, et al.

Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning. In this paper, we investigate the convergence of policy optimization for quadratic control of Markovian jump linear systems (MJLS). First, we study the optimization landscape of direct policy optimization for MJLS and, in particular, show that despite the non-convexity of the resulting problem, the unique stationary point is the globally optimal solution. Next, we prove that the Gauss-Newton method and the natural policy gradient method converge to the optimal state feedback controller for MJLS at a linear rate if initialized at a controller that stabilizes the closed-loop dynamics in the mean square sense. We propose a novel Lyapunov argument to fix a key stability issue in the convergence proof. Finally, we present a numerical example to support our theory. Our work offers new insight into the performance of policy learning methods for controlling unknown MJLS.
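
As a rough illustration of the kind of iteration the abstract refers to, the following minimal sketch runs a Gauss-Newton-style policy update on a scalar two-mode MJLS. It is not taken from the paper: the system parameters `a`, `b`, `q`, `r`, the transition matrix `P`, the initial gains, the helper names, and the use of the sum of the mode-wise value coefficients as a cost proxy are all assumptions chosen for illustration. The policy is evaluated by fixed-point iteration on the coupled Lyapunov equations, which is assumed to converge because the initial gains are mean-square stabilizing.

```python
# Hypothetical scalar two-mode MJLS (all numbers are illustrative assumptions):
#   x_{t+1} = a[i] * x_t + b[i] * u_t,  u_t = -k[i] * x_t,  i = current mode
a = [1.2, 0.8]
b = [1.0, 0.5]
q = [1.0, 1.0]                   # state cost weight per mode
r = [1.0, 1.0]                   # input cost weight per mode
P = [[0.7, 0.3], [0.4, 0.6]]     # mode transition probabilities

def coupled(p, i):
    """Expected next-mode value coefficient: sum_j P[i][j] * p[j]."""
    return sum(P[i][j] * p[j] for j in range(2))

def evaluate(k, iters=2000):
    """Solve the coupled Lyapunov equations for gains k by fixed-point
    iteration (assumes k is mean-square stabilizing)."""
    p = [0.0, 0.0]
    for _ in range(iters):
        p = [q[i] + r[i] * k[i] ** 2
             + (a[i] - b[i] * k[i]) ** 2 * coupled(p, i)
             for i in range(2)]
    return p

def residual(k, p, i):
    """Stationarity residual for mode i; zero at a stationary gain."""
    e = coupled(p, i)
    return (r[i] + b[i] ** 2 * e) * k[i] - b[i] * e * a[i]

def gauss_newton_step(k, eta=0.5):
    """One Gauss-Newton-style update; eta = 0.5 reduces to policy iteration."""
    p = evaluate(k)
    new_k = []
    for i in range(2):
        h = r[i] + b[i] ** 2 * coupled(p, i)
        new_k.append(k[i] - 2.0 * eta * residual(k, p, i) / h)
    return new_k

k = [1.0, 0.5]                   # assumed stabilizing initial gains
costs = []
for _ in range(20):
    costs.append(sum(evaluate(k)))   # cost proxy: sum of value coefficients
    k = gauss_newton_step(k)

final_p = evaluate(k)
final_residuals = [residual(k, final_p, i) for i in range(2)]
```

In this sketch the step size 0.5 makes each update equal to `b[i] * e * a[i] / h`, that is, a policy-iteration step, so the recorded `costs` should be non-increasing and `final_residuals` should be close to zero. Smaller step sizes would give a damped version of the same update.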

research
11/24/2020

Policy Optimization for Markovian Jump Linear Quadratic Control: Gradient-Based Methods and Global Convergence

Recently, policy optimization for control purposes has received renewed ...
research
09/12/2022

On the Optimization Landscape of Dynamic Output Feedback: A Case Study for Linear Quadratic Regulator

The convergence of policy gradient algorithms in reinforcement learning ...
research
11/20/2020

Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon

We explore reinforcement learning methods for finding the optimal policy...
research
09/12/2023

Convergence of Gradient-based MAML in LQR

The main objective of this research paper is to investigate the local co...
research
11/15/2019

A System Theoretical Perspective to Gradient-Tracking Algorithms for Distributed Quadratic Optimization

In this paper we consider a recently developed distributed optimization ...
research
05/29/2020

Online Regulation of Unstable LTI Systems from a Single Trajectory

Recently, data-driven methods for control of dynamic systems have receiv...