
A Note on the Linear Convergence of Policy Gradient Methods
We revisit the finite time analysis of policy gradient methods in the si...

Global Optimality Guarantees For Policy Gradient Methods
Policy gradient methods are perhaps the most widely used class of reinf...

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
Temporal difference learning (TD) is a simple iterative algorithm used t...
Jalaj Bhandari