Global Convergence Using Policy Gradient Methods for Model-free Markovian Jump Linear Quadratic Control

11/30/2021
by   Santanu Rathod, et al.
0

Owing to the growth of interest in Reinforcement Learning in the last few years, gradient based policy control methods have been gaining popularity for Control problems as well. And rightly so, since gradient policy methods have the advantage of optimizing a metric of interest in an end-to-end manner, along with being relatively easy to implement without complete knowledge of the underlying system. In this paper, we study the global convergence of gradient-based policy optimization methods for quadratic control of discrete-time and model-free Markovian jump linear systems (MJLS). We surmount myriad challenges that arise because of more than one states coupled with lack of knowledge of the system dynamics and show global convergence of the policy using gradient descent and natural policy gradient methods. We also provide simulation studies to corroborate our claims.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/24/2020

Policy Optimization for Markovian Jump Linear Quadratic Control: Gradient-Based Methods and Global Convergence

Recently, policy optimization for control purposes has received renewed ...
research
01/15/2018

Global Convergence of Policy Gradient Methods for Linearized Control Problems

Direct policy gradient methods for reinforcement learning and continuous...
research
12/26/2019

Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem

Model-free reinforcement learning attempts to find an optimal control ac...
research
06/30/2021

Inverse Design of Grating Couplers Using the Policy Gradient Method from Reinforcement Learning

We present a proof-of-concept technique for the inverse design of electr...
research
11/15/2018

Reward-estimation variance elimination in sequential decision processes

Policy gradient methods are very attractive in reinforcement learning du...
research
08/14/2023

Dialogue for Prompting: a Policy-Gradient-Based Discrete Prompt Optimization for Few-shot Learning

Prompt-based pre-trained language models (PLMs) paradigm have succeeded ...
research
10/09/2019

Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods

We investigate reinforcement learning for mean field control problems in...

Please sign up or login with your details

Forgot password? Click here to reset