Hindsight policy gradients

11/16/2017
by Paulo Rauber, et al.

Goal-conditional policies allow reinforcement learning agents to pursue specific goals during different episodes. In addition to their potential to generalize desired behavior to unseen goals, such policies may also help in defining options for arbitrary subgoals, enabling higher-level planning. While trying to achieve a specific goal, an agent may also be able to exploit information about the degree to which it has achieved alternative goals. Reinforcement learning agents have only recently been endowed with such capacity for hindsight, which is highly valuable in environments with sparse rewards. In this paper, we show how hindsight can be introduced to likelihood-ratio policy gradient methods, generalizing this capacity to an entire class of highly successful algorithms. Our preliminary experiments suggest that hindsight may increase the sample efficiency of policy gradient methods.
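The abstract does not spell out the estimator, so the following is only a minimal sketch of how hindsight might enter a likelihood-ratio policy gradient, under stated assumptions. It assumes a tabular softmax policy with parameters `theta[goal, state, action]`, a sparse reward supplied by the caller, and per-step importance weights that correct for the episode having been collected under a different goal. The names `policy`, `grad_log_policy`, `hindsight_gradient`, and `reward` are hypothetical and are not taken from the paper.

```python
import numpy as np

def policy(theta, state, goal):
    """Softmax action probabilities of a tabular goal-conditional policy (assumed form)."""
    logits = theta[goal, state]
    z = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return z / z.sum()

def grad_log_policy(theta, state, goal, action):
    """Gradient of log pi(action | state, goal) with respect to theta."""
    grad = np.zeros_like(theta)
    grad[goal, state] = -policy(theta, state, goal)
    grad[goal, state, action] += 1.0
    return grad

def hindsight_gradient(theta, states, actions, original_goal, alt_goal, reward):
    """Gradient estimate for alt_goal from an episode that pursued original_goal.

    states has len(actions) + 1 entries. reward(next_state, goal) is a
    hypothetical caller-supplied function. Rewards are re-evaluated for
    alt_goal and weighted by cumulative likelihood ratios (an assumption of
    this sketch, not necessarily the paper's exact estimator).
    """
    T = len(actions)
    # ratios[t] = product over k <= t of pi(a_k | s_k, alt) / pi(a_k | s_k, original)
    ratios = np.cumprod([
        policy(theta, s, alt_goal)[a] / policy(theta, s, original_goal)[a]
        for s, a in zip(states[:-1], actions)
    ])
    grad = np.zeros_like(theta)
    for t in range(T):
        # Importance-weighted reward-to-go, evaluated for the alternative goal.
        weighted_return = sum(
            ratios[k] * reward(states[k + 1], alt_goal) for k in range(t, T)
        )
        grad += grad_log_policy(theta, states[t], alt_goal, actions[t]) * weighted_return
    return grad

# Usage: an episode that pursued goal 2 also passed through state 1,
# so it carries a learning signal for goal 1.
theta = np.zeros((3, 3, 2))  # 3 goals, 3 states, 2 actions
sparse_reward = lambda next_state, goal: 1.0 if next_state == goal else 0.0
grad = hindsight_gradient(theta, [0, 1, 2], [1, 1], 2, 1, sparse_reward)
```

When `alt_goal` equals `original_goal`, every ratio is 1 and the function reduces to an ordinary reward-to-go likelihood-ratio estimate for a single episode.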


