Visual Hindsight Experience Replay

01/31/2019
by Himanshu Sahni, et al.

Reinforcement learning algorithms typically require millions of environment interactions to learn successful policies in sparse reward settings. Hindsight Experience Replay (HER) was introduced as a technique to increase sample efficiency by re-imagining unsuccessful trajectories as successful ones, replacing the originally intended goals. However, this method is not applicable to visual domains where the goal configuration is unknown and must be inferred from observation. In this work, we show how unsuccessful visual trajectories can be hallucinated to be successful using a generative model trained on relatively few snapshots of the goal. As far as we are aware, this is the first work that does so with the agent policy conditioned solely on its state. We then apply this model to training reinforcement learning agents in discrete and continuous settings. We show results on a navigation task and a pick-and-place task in a 3D environment and on a simulated robotics application. Our method shows marked improvement over standard RL algorithms and baselines derived from prior work.
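
The goal replacement that HER performs can be illustrated with a minimal sketch. It assumes a toy setting in which the goal is an explicit grid coordinate, which is the original HER setting rather than the visual case addressed in this work, where the goal would instead have to be hallucinated by a generative model. The names `Transition`, `sparse_reward`, and `relabel_with_final_state` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Goal = Tuple[int, int]  # assumption: states and goals are grid coordinates


@dataclass(frozen=True)
class Transition:
    state: Goal
    action: int
    next_state: Goal
    goal: Goal
    reward: float


def sparse_reward(achieved: Goal, goal: Goal) -> float:
    # Sparse signal: 0 when the goal is reached, -1 otherwise.
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    # Treat the state actually reached at the end of the episode as if it
    # had been the intended goal, and recompute every reward against it.
    if not episode:
        return []
    new_goal = episode[-1].next_state
    return [
        replace(t, goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]


# A failed episode: the agent aimed for (5, 5) but only reached (0, 2).
episode = [
    Transition((0, 0), 1, (0, 1), (5, 5), -1.0),
    Transition((0, 1), 1, (0, 2), (5, 5), -1.0),
]
relabeled = relabel_with_final_state(episode)
```

In a replay buffer, both the original and the relabeled transitions would typically be stored, so the agent receives a non-negative reward signal even from episodes that missed the intended goal.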


