On the Optimization Landscape of Dynamic Output Feedback: A Case Study for Linear Quadratic Regulator

09/12/2022
by Jingliang Duan, et al.

The convergence of policy gradient algorithms in reinforcement learning hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these algorithms can often be gained by analyzing the landscape of linear quadratic control. However, most of the existing literature considers only the optimization landscape for static full-state or output-feedback policies (controllers). We investigate the more challenging case of dynamic output-feedback policies for linear quadratic regulation (abbreviated as dLQR), which is prevalent in practice but has a rather complicated optimization landscape. We first show how the dLQR cost varies with the coordinate transformation of the dynamic controller and then derive the optimal transformation for a given observable stabilizing controller. At the core of our results is the uniqueness of the stationary point of dLQR when it is observable, which takes the concise form of an observer-based controller with the optimal similarity transformation. These results shed light on designing efficient algorithms for general decision-making problems with partially observed information.
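
To make the dependence of the cost on the controller's coordinates concrete, here is a minimal sketch in pure Python. It is not the authors' formulation: the scalar plant, the controller parameters, the helper names (`transform`, `dlqr_cost`), and in particular the assumption that the joint initial state (x_0, xi_0) has a fixed covariance `X0` are all illustrative choices. Under that assumption, rescaling the controller state by T leaves the input-output behavior unchanged but can change the expected cost whenever the controller's initial state is not zero.

```python
# Illustrative sketch only: scalar plant x+ = a*x + b*u, y = c*x, and scalar
# dynamic controller xi+ = a_k*xi + b_k*y, u = c_k*xi.
# All numbers and helper names are hypothetical, not taken from the paper.

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    """Transpose of a 2x2 matrix stored as nested lists."""
    return [[A[j][i] for j in range(2)] for i in range(2)]

def transform(a_k, b_k, c_k, T):
    """Change controller coordinates xi' = T * xi (T must be nonzero)."""
    return a_k, T * b_k, c_k / T

def dlqr_cost(plant, controller, q, r, X0, iters=500):
    """Expected cost E[sum_t q*x_t^2 + r*u_t^2], ASSUMING E[z0 z0^T] = X0
    for the stacked closed-loop state z = (x, xi). Equals trace(P @ X0)."""
    a, b, c = plant
    a_k, b_k, c_k = controller
    A_cl = [[a, b * c_k], [b_k * c, a_k]]    # closed-loop dynamics of z
    Q_cl = [[q, 0.0], [0.0, r * c_k ** 2]]   # stage cost as a quadratic in z
    P = [[0.0, 0.0], [0.0, 0.0]]
    # Fixed-point iteration for the Lyapunov equation P = Q_cl + A_cl^T P A_cl;
    # it converges because the closed loop chosen below is stable.
    for _ in range(iters):
        AtPA = matmul(transpose(A_cl), matmul(P, A_cl))
        P = [[Q_cl[i][j] + AtPA[i][j] for j in range(2)] for i in range(2)]
    return sum(P[i][j] * X0[j][i] for i in range(2) for j in range(2))

plant = (0.9, 1.0, 1.0)            # (a, b, c), hypothetical values
K = (0.5, 0.2, -0.5)               # (a_k, b_k, c_k), stabilizing for this plant
K_T = transform(*K, T=2.0)         # same controller in rescaled coordinates

zero_init = [[1.0, 0.0], [0.0, 0.0]]   # controller always starts at xi_0 = 0
fixed_init = [[1.0, 0.0], [0.0, 1.0]]  # xi_0 has unit variance in any coordinates

J_zero, J_zero_T = (dlqr_cost(plant, k, 1.0, 1.0, zero_init) for k in (K, K_T))
J_fixed, J_fixed_T = (dlqr_cost(plant, k, 1.0, 1.0, fixed_init) for k in (K, K_T))
print(J_zero, J_zero_T)    # agree up to rounding: coordinates do not matter
print(J_fixed, J_fixed_T)  # differ: the cost depends on the transformation T
```

In this toy setting the two controllers realize the same map from outputs to inputs, so any difference in cost comes purely from how the fixed initial covariance interacts with the chosen coordinates. That is the kind of degree of freedom an optimal similarity transformation would exploit.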

research
11/24/2020

Policy Optimization for Markovian Jump Linear Quadratic Control: Gradient-Based Methods and Global Convergence

Recently, policy optimization for control purposes has received renewed ...
research
02/10/2020

Convergence Guarantees of Policy Optimization Methods for Markovian Jump Linear Systems

Recently, policy optimization for control purposes has received renewed ...
research
10/10/2022

Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies

Gradient-based methods have been widely used for system design and optim...
research
07/07/2023

Accelerated Optimization Landscape of Linear-Quadratic Regulator

Linear-quadratic regulator (LQR) is a landmark problem in the field of o...
research
12/19/2019

Distributed Reinforcement Learning for Decentralized Linear Quadratic Control: A Derivative-Free Policy Optimization Approach

This paper considers a distributed reinforcement learning problem for de...
research
03/15/2023

Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators

Nonlinear control systems with partial information to the decision maker...
research
07/08/2020

On Entropic Optimization and Path Integral Control

This article is motivated by the question whether it is possible to solv...
