MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks

05/13/2021
by Menghui Zhu, et al.

In goal-oriented reinforcement learning, relabeling the raw goals in past experience to give agents hindsight ability is a major solution to the reward sparsity problem. In this paper, to enhance the diversity of relabeled goals, we develop FGI (Foresight Goal Inference), a new relabeling strategy that relabels goals by looking into the future with a learned dynamics model. In addition, to improve sample efficiency, we use the dynamics model to generate simulated trajectories for policy training. Integrating these two improvements, we introduce the MapGo framework (Model-Assisted Policy Optimization for Goal-oriented tasks). In our experiments, we first show the effectiveness of the FGI strategy compared with hindsight relabeling, and then show that MapGo achieves higher sample efficiency than model-free baselines on a set of complicated tasks.
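
The foresight relabeling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says that goals are relabeled by looking into the future with a learned dynamics model, so the rollout procedure, the fixed horizon, the 0/-1 sparse reward convention, and every name below (foresight_relabel, sparse_reward, and the toy_* functions standing in for a learned model and policy) are assumptions made for illustration.

```python
from typing import Callable, Tuple

State = Tuple[float, ...]
Goal = Tuple[float, ...]
Action = Tuple[float, ...]


def foresight_relabel(
    state: State,
    original_goal: Goal,
    policy: Callable[[State, Goal], Action],
    model: Callable[[State, Action], State],
    achieved_goal: Callable[[State], Goal],
    horizon: int,
) -> Goal:
    """Roll the (assumed) learned model forward and return the goal it reaches.

    Hindsight relabeling picks a goal from states already visited in the
    stored trajectory; here the goal comes from a simulated future instead.
    """
    s = state
    for _ in range(horizon):
        a = policy(s, original_goal)  # act as if still pursuing the raw goal
        s = model(s, a)               # one-step prediction of the next state
    return achieved_goal(s)


def sparse_reward(next_state: State, goal: Goal,
                  achieved_goal: Callable[[State], Goal],
                  tol: float = 1e-6) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    reached = all(abs(x - y) <= tol
                  for x, y in zip(achieved_goal(next_state), goal))
    return 0.0 if reached else -1.0


# Toy stand-ins (hypothetical): a 1-D point that moves at most 1 unit per step.
def toy_policy(s: State, g: Goal) -> Action:
    return (max(-1.0, min(1.0, g[0] - s[0])),)


def toy_model(s: State, a: Action) -> State:
    return (s[0] + a[0],)


def toy_achieved_goal(s: State) -> Goal:
    return s


new_goal = foresight_relabel((0.0,), (5.0,), toy_policy, toy_model,
                             toy_achieved_goal, horizon=3)
```

In this toy setting the raw goal at 5.0 is out of reach within three steps, so a transition stored with it would receive only the failure reward. Recomputing the reward against the relabeled goal gives the transition that reaches it a non-failure signal, which is the general motivation for relabeling under sparse rewards.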


research
09/25/2018

Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals

Consider multi-goal tasks that involve static environments and dynamic g...
research
06/01/2020

PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Learning with sparse rewards remains a significant challenge in reinforc...
research
06/28/2023

RoMo-HER: Robust Model-based Hindsight Experience Replay

Sparse rewards are one of the factors leading to low sample efficiency i...
research
06/10/2019

Exploration via Hindsight Goal Generation

Goal-oriented reinforcement learning has recently been a practical frame...
research
07/06/2020

Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning

What goals should a multi-goal reinforcement learning agent pursue durin...
research
05/04/2019

Hierarchical Policy Learning is Sensitive to Goal Space Design

Hierarchy in reinforcement learning agents allows for control at multipl...
research
07/12/2022

DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization

Recent algorithms designed for reinforcement learning tasks focus on fin...
