Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation

06/08/2021
by Semih Cayci, et al.
Natural policy gradient (NPG) methods with function approximation have achieved impressive empirical success in reinforcement learning problems with large state-action spaces. However, theoretical understanding of their convergence behavior remains limited in the function approximation setting. In this paper, we perform a finite-time analysis of NPG with linear function approximation and softmax parameterization, and prove for the first time that the widely used entropy regularization method, which encourages exploration, leads to a linear convergence rate. We adopt a Lyapunov drift analysis to prove the convergence results and to explain the effectiveness of entropy regularization in improving the convergence rates.
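To make the setting concrete, the following is a minimal sketch of an entropy-regularized NPG step with a softmax policy over linear features. It is not the algorithm analyzed in the paper: it assumes a single-state problem (a bandit) with exactly known rewards, and the names `Phi`, `r`, `eta`, `lam`, `softmax_policy`, and `npg_step` are hypothetical choices made for illustration. The update direction is obtained by a policy-weighted least-squares fit of the regularized advantage onto the centered features, which is one standard way to compute a natural gradient direction under linear function approximation.

```python
import numpy as np

def softmax_policy(theta, Phi):
    """Softmax policy with linear logits Phi @ theta (one row of Phi per action)."""
    logits = Phi @ theta
    logits = logits - logits.max()  # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def npg_step(theta, Phi, r, eta, lam):
    """One entropy-regularized NPG step for a single-state problem (assumption)."""
    pi = softmax_policy(theta, Phi)
    # Regularized action values and advantage: reward minus lam * log-probability.
    q = r - lam * np.log(pi)
    adv = q - pi @ q
    # Centered features equal the gradient of log pi with respect to theta.
    Phi_c = Phi - pi @ Phi
    # Policy-weighted least squares: fit the advantage with the centered features.
    sw = np.sqrt(pi)
    w, *_ = np.linalg.lstsq(sw[:, None] * Phi_c, sw * adv, rcond=None)
    return theta + eta * w

# Toy instance (all values are illustrative assumptions).
Phi = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # 3 actions, 2 features
r = np.array([1.0, 0.5, 0.0])                         # rewards, realizable by Phi
eta, lam = 0.5, 0.5                                   # step size, regularization weight

theta = np.zeros(2)
for _ in range(200):
    theta = npg_step(theta, Phi, r, eta, lam)

pi = softmax_policy(theta, Phi)
target = np.exp(r / lam) / np.exp(r / lam).sum()      # softmax(r / lam)
print(pi, target)
```

In this toy case the rewards lie in the span of the features, so the policy should approach the regularized optimum softmax(r / lam), with the logits contracting geometrically toward r / lam up to an additive constant. This only illustrates why regularization can yield a geometric rate in a trivial setting; the paper's result concerns the general function approximation setting and relies on its own assumptions.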
