A Note on the Linear Convergence of Policy Gradient Methods

07/21/2020
by Jalaj Bhandari, et al.

We revisit the finite-time analysis of policy gradient methods in the simplest setting: finite state and action problems with a policy class consisting of all stochastic policies and with exact gradient evaluations. Some recent works have viewed these problems as instances of smooth nonlinear optimization, suggesting small stepsizes and showing sublinear convergence rates. This note instead takes a policy iteration perspective and highlights that many versions of policy gradient succeed with extremely large stepsizes and attain a linear rate of convergence.
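To make the contrast concrete, here is a minimal sketch in the paper's setting (exact gradients, all stochastic policies on a finite MDP), though the random MDP, the stepsize eta = 10.0, and all names below are illustrative assumptions, not the paper's experiments. For softmax policies, one exact natural policy gradient step is the multiplicative-weights update pi(a|s) proportional to pi(a|s) * exp(eta * Q^pi(s,a)); as eta grows, the step approaches a greedy policy iteration update, and the optimality gap shrinks geometrically.

```python
import numpy as np

# Illustrative sketch only: a small random MDP, not an example from the paper.
rng = np.random.default_rng(0)
S, A, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] = next-state distribution
R = rng.random((S, A))                      # rewards in [0, 1]

def q_values(pi):
    """Exact Q^pi: solve the Bellman equation directly (finite MDP)."""
    r_pi = np.einsum("sa,sa->s", pi, R)                  # expected reward under pi
    P_pi = np.einsum("sa,saz->sz", pi, P)                # state transitions under pi
    v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)  # V^pi
    return R + gamma * P @ v                             # Q^pi[s, a]

# V* via value iteration, used only to measure the optimality gap.
v_star = np.zeros(S)
for _ in range(5_000):
    v_star = (R + gamma * P @ v_star).max(axis=1)

pi = np.full((S, A), 1.0 / A)  # uniform initial stochastic policy
eta = 10.0                     # an extremely large stepsize, deliberately

for t in range(15):
    q = q_values(pi)
    gap = (v_star - np.einsum("sa,sa->s", pi, q)).max()
    print(f"iter {t:2d}   optimality gap {gap:.3e}")
    # Exact NPG step for softmax policies = exponentiated-gradient update;
    # subtracting the per-state max stabilizes exp without changing the policy.
    pi = pi * np.exp(eta * (q - q.max(axis=1, keepdims=True)))
    pi /= pi.sum(axis=1, keepdims=True)
```

Run as written, the printed gap contracts at least geometrically: with a stepsize this large the update is nearly greedy, so the iterates behave like policy iteration rather than like small-step gradient descent.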


Related research

10/30/2021 · Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings
Policy gradient methods have been frequently applied to problems in cont...

10/04/2022 · Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies
We consider infinite-horizon discounted Markov decision processes and st...

06/30/2020 · Policy Gradient Optimization of Thompson Sampling Policies
We study the use of policy gradient algorithms to optimize over a class ...

06/08/2021 · Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation
Natural policy gradient (NPG) methods with function approximation achiev...

06/14/2022 · How are policy gradient methods affected by the limits of control?
We study stochastic policy gradient methods from the perspective of cont...

05/17/2022 · On the Convergence of Policy in Unregularized Policy Mirror Descent
In this short note, we give the convergence analysis of the policy in th...

10/22/2020 · Sample Efficient Reinforcement Learning with REINFORCE
Policy gradient methods are among the most effective methods for large-s...
