A Note on the Linear Convergence of Policy Gradient Methods

07/21/2020

We revisit the finite-time analysis of policy gradient methods in the simplest setting: finite state and action problems with a policy class consisting of all stochastic policies and with exact gradient evaluations. Some recent works have viewed these problems as instances of smooth nonlinear optimization problems, suggesting small stepsizes and showing sublinear convergence rates. This note instead takes a policy iteration perspective and highlights that many versions of policy gradient succeed with extremely large stepsizes and attain a linear rate of convergence.
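To make the policy iteration perspective concrete, here is a minimal sketch of exact policy iteration on a small random tabular MDP. It is not the note's algorithm or analysis. It only illustrates the classical fact that greedy policy improvement with exact evaluation shrinks the max-norm optimality gap by at least a factor of the discount `gamma` per step, which is the kind of linear rate the abstract refers to. The helper names (`random_mdp`, `evaluate`, `optimal_value`, `policy_iteration_gaps`), the random problem instance, and the discount value are all assumptions chosen for illustration.

```python
import numpy as np

def random_mdp(n_states, n_actions, seed=0):
    """Hypothetical test problem: random transitions P[s, a, s'] and rewards r[s, a]."""
    rng = np.random.default_rng(seed)
    P = rng.random((n_states, n_actions, n_states))
    P /= P.sum(axis=2, keepdims=True)  # normalize each row into a distribution
    r = rng.random((n_states, n_actions))
    return P, r

def evaluate(P, r, pi, gamma):
    """Exact policy evaluation: solve (I - gamma * P_pi) V = r_pi, then form Q."""
    n_states = r.shape[0]
    P_pi = np.einsum("sa,sat->st", pi, P)  # state-to-state transitions under pi
    r_pi = (pi * r).sum(axis=1)            # expected one-step reward under pi
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
    Q = r + gamma * (P @ V)                # shape (n_states, n_actions)
    return V, Q

def optimal_value(P, r, gamma, sweeps=2000):
    """Reference V* from value iteration, computed independently of the policy updates."""
    V = np.zeros(r.shape[0])
    for _ in range(sweeps):
        V = (r + gamma * (P @ V)).max(axis=1)
    return V

def policy_iteration_gaps(P, r, gamma, iters=10):
    """Run policy iteration from the uniform policy; return max-norm gaps to V*."""
    n_states, n_actions = r.shape
    pi = np.full((n_states, n_actions), 1.0 / n_actions)
    V_star = optimal_value(P, r, gamma)
    gaps = []
    for _ in range(iters):
        V, Q = evaluate(P, r, pi, gamma)
        gaps.append(float(np.max(np.abs(V_star - V))))
        pi = np.eye(n_actions)[Q.argmax(axis=1)]  # greedy (deterministic) improvement
    return gaps

if __name__ == "__main__":
    P, r = random_mdp(n_states=5, n_actions=3, seed=0)
    for k, gap in enumerate(policy_iteration_gaps(P, r, gamma=0.9)):
        print(k, gap)
```

Running the script prints the optimality gap at each iteration. Each gap should be no larger than `gamma` times the previous one, up to numerical tolerance.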
