Global Convergence of Direct Policy Search for State-Feedback ℋ_∞ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential

10/20/2022
by   Xingang Guo, et al.

Direct policy search has been widely applied in modern reinforcement learning and continuous control. However, the theoretical properties of direct policy search on nonsmooth robust control synthesis are not yet fully understood. The optimal ℋ_∞ control framework aims to design a policy that minimizes the closed-loop ℋ_∞ norm, and is arguably the most fundamental robust control paradigm. In this work, we show that direct policy search is guaranteed to find the global solution of the robust ℋ_∞ state-feedback control design problem. Policy search for optimal ℋ_∞ control leads to a constrained nonconvex nonsmooth optimization problem, where the nonconvex feasible set consists of all the policies stabilizing the closed-loop dynamics. We show that for this nonsmooth optimization problem, all Clarke stationary points are global minima. Next, we identify the coerciveness of the closed-loop ℋ_∞ objective function, and prove that all the sublevel sets of the resultant policy search problem are compact. Based on these properties, we show that Goldstein's subgradient method and its implementable variants are guaranteed to stay in the nonconvex feasible set and eventually find the globally optimal solution of the ℋ_∞ state-feedback synthesis problem. Our work builds a new connection between nonconvex nonsmooth optimization theory and robust control, leading to a global convergence result for direct policy search on optimal ℋ_∞ synthesis.
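To make the objective and the feasible set concrete, the following is a minimal sketch under assumptions not stated in the abstract: a discrete-time plant x_{t+1} = A x_t + B u_t + w_t, a static policy u_t = -K x_t, and a performance output weighted by matrices Q and R with Q + KᵀRK positive definite. The function names `is_stabilizing` and `hinf_cost` are hypothetical, and the frequency grid only gives a lower-bound estimate of the closed-loop ℋ_∞ norm; it is not the paper's algorithm.

```python
import numpy as np

def is_stabilizing(A, B, K):
    """Feasibility check (assumed discrete-time): spectral radius of A - B K below 1."""
    return np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0

def hinf_cost(A, B, K, Q, R, n_grid=2000):
    """Grid-based estimate of the closed-loop H-infinity norm from w to z.

    Returns +inf outside the feasible set, mirroring the constrained problem.
    The grid maximum is a lower bound on the true supremum over frequency.
    """
    if not is_stabilizing(A, B, K):
        return np.inf
    n = A.shape[0]
    A_cl = A - B @ K
    # Output matrix C with C^T C = Q + K^T R K (assumed positive definite).
    C = np.linalg.cholesky(Q + K.T @ R @ K).T
    worst = 0.0
    for omega in np.linspace(0.0, np.pi, n_grid):
        # Transfer matrix C (e^{j omega} I - A_cl)^{-1} at this frequency.
        G = C @ np.linalg.inv(np.exp(1j * omega) * np.eye(n) - A_cl)
        worst = max(worst, np.linalg.norm(G, 2))  # largest singular value
    return worst

# Scalar illustration: the cost is finite inside the feasible set and
# infinite once K destabilizes the closed loop.
A = np.array([[0.5]])
B = np.array([[1.0]])
Q = np.eye(1)
R = np.eye(1)
for k in (0.0, 0.5, 1.4, 2.0):
    print(k, hinf_cost(A, B, np.array([[k]]), Q, R))
```

In this scalar case the cost grows as K approaches the boundary of the stabilizing set, which is the behavior the coerciveness property describes. The pointwise maximum over frequency is also what makes the objective nonsmooth in K.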


research
02/10/2020

Convergence Guarantees of Policy Optimization Methods for Markovian Jump Linear Systems

Recently, policy optimization for control purposes has received renewed ...
research
10/26/2019

Convergent Policy Optimization for Safe Reinforcement Learning

We study the safe reinforcement learning problem with nonlinear function...
research
10/10/2022

Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies

Gradient-based methods have been widely used for system design and optim...
research
04/06/2021

Adaptive Variants of Optimal Feedback Policies

We combine adaptive control directly with optimal or near-optimal value ...
research
09/12/2020

Guided Policy Search Based Control of a High Dimensional Advanced Manufacturing Process

In this paper we apply guided policy search (GPS) based reinforcement le...
research
11/21/2022

CONFIG: Constrained Efficient Global Optimization for Closed-Loop Control System Optimization with Unmodeled Constraints

In this paper, the CONFIG algorithm, a simple and provably efficient con...
research
10/04/2017

On the Sample Complexity of the Linear Quadratic Regulator

This paper addresses the optimal control problem known as the Linear Qua...
