On Policy Gradients

11/12/2019
by Mattis Manfred Kämmerer, et al.

The goal of policy gradient approaches is to find, within a given class of policies, a policy that maximizes the expected return. Given a differentiable model of the policy, we want to apply a gradient-ascent technique to reach a local optimum. We mainly use gradient ascent because it is theoretically well researched. The main issue is that the gradient of the expected return with respect to the policy parameters is not directly available, so we need to estimate it. Because policy gradient algorithms also tend to require on-policy data for the gradient estimate, their biggest weakness is sample efficiency. For this reason, most research focuses on finding algorithms with improved sample efficiency. This paper provides a formal introduction to policy gradient methods that traces the development of policy gradient approaches and should enable the reader to follow current research on the topic.
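
To make the idea of estimating the gradient from on-policy samples concrete, here is a minimal sketch of a likelihood-ratio (REINFORCE-style) estimator followed by plain gradient ascent. It is not taken from the paper: the two-armed bandit, the softmax policy, and every name and hyperparameter (`estimate_gradient`, `train`, `reward_means`, `step_size`, and so on) are assumptions chosen only for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_gradient(theta, reward_means, batch_size, rng):
    """Likelihood-ratio estimate of the gradient of the expected reward.

    Actions are sampled from the current policy (on-policy data) and the
    estimate averages reward * grad log pi(action). For a softmax policy,
    d/d theta_k log pi(a) = 1[a == k] - pi(k).
    """
    probs = softmax(theta)
    grad = [0.0] * len(theta)
    for _ in range(batch_size):
        action = rng.choices(range(len(theta)), weights=probs)[0]
        # Hypothetical environment: unit-variance Gaussian reward per arm.
        reward = rng.gauss(reward_means[action], 1.0)
        for k in range(len(theta)):
            indicator = 1.0 if k == action else 0.0
            grad[k] += reward * (indicator - probs[k]) / batch_size
    return grad


def train(reward_means, steps=500, batch_size=20, step_size=0.1, seed=0):
    """Gradient ascent on the policy parameters using the noisy estimate."""
    rng = random.Random(seed)
    theta = [0.0] * len(reward_means)
    for _ in range(steps):
        # A fresh batch is drawn every step because the samples must come
        # from the current policy; this is the sample-efficiency cost.
        grad = estimate_gradient(theta, reward_means, batch_size, rng)
        theta = [t + step_size * g for t, g in zip(theta, grad)]
    return softmax(theta)


final_policy = train(reward_means=[0.0, 1.0])
```

In this toy setting the learned policy should place most of its probability on the arm with the higher mean reward. Each update discards the previous batch, which is one way to see why on-policy estimation costs so many samples.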

research
06/17/2019

Is the Policy Gradient a Gradient?

The policy gradient theorem describes the gradient of the expected disco...
research
05/11/2023

Policy Gradient Algorithms Implicitly Optimize by Continuation

Direct policy optimization in reinforcement learning is usually solved w...
research
03/05/2018

Learning Sample-Efficient Target Reaching for Mobile Robots

In this paper, we propose a novel architecture and a self-supervised pol...
research
11/16/2017

Hindsight policy gradients

Goal-conditional policies allow reinforcement learning agents to pursue ...
research
06/13/2012

Improving Gradient Estimation by Incorporating Sensor Data

An efficient policy search algorithm should estimate the local gradient ...
research
11/03/2020

A Study of Policy Gradient on a Class of Exactly Solvable Models

Policy gradient methods are extensively used in reinforcement learning a...
research
06/30/2020

Policy Gradient Optimization of Thompson Sampling Policies

We study the use of policy gradient algorithms to optimize over a class ...
