Planning Goals for Exploration

03/23/2023
by Edward S. Hu, et al.

Dropped into an unknown environment, what should an agent do to quickly learn about the environment and how to accomplish diverse tasks within it? We address this question within the goal-conditioned reinforcement learning paradigm, by identifying how the agent should set its goals at training time to maximize exploration. We propose "Planning Exploratory Goals" (PEG), a method that sets goals for each training episode to directly optimize an intrinsic exploration reward. PEG first chooses goal commands such that the agent's goal-conditioned policy, at its current level of training, will end up in states with high exploration potential. It then launches an exploration policy starting at those promising states. To enable this direct optimization, PEG learns world models and adapts sampling-based planning algorithms to "plan goal commands". In challenging simulated robotics environments including a multi-legged ant robot in a maze, and a robot arm on a cluttered tabletop, PEG exploration enables more efficient and effective training of goal-conditioned policies relative to baselines and ablations. Our ant successfully navigates a long maze, and the robot arm successfully builds a stack of three blocks upon command. Website: https://penn-pal-lab.github.io/peg/
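
The goal-planning step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that PEG adapts sampling-based planning inside a learned world model, so the cross-entropy-style planner here is a stand-in, and every name (`plan_exploratory_goal`, `toy_final_state`, `toy_exploration_value`) is hypothetical. The learned world model and goal-conditioned policy are replaced by a toy function in which the agent can only get within a fixed radius of its start, and the exploration reward is replaced by a toy score that grows along the first coordinate.

```python
import numpy as np


def toy_final_state(goal, reach=1.0):
    """Hypothetical stand-in for rolling out the goal-conditioned policy
    in a learned world model: the agent reaches the goal only if it lies
    within `reach` of the start (the origin); otherwise it stops on the
    boundary in the goal's direction."""
    norm = np.linalg.norm(goal)
    return goal if norm <= reach else goal * (reach / norm)


def toy_exploration_value(state):
    """Hypothetical stand-in for the intrinsic exploration reward:
    states further along the first axis count as more novel."""
    return float(state[0])


def plan_exploratory_goal(final_state_fn, value_fn, goal_dim,
                          n_candidates=64, n_elites=8, n_iters=5, seed=0):
    """Search over goal commands (not actions) with a cross-entropy-style
    sampler, scoring each goal by the exploration value of the state the
    current policy is predicted to end up in."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(goal_dim), np.ones(goal_dim)
    best_goal, best_score = mean.copy(), -np.inf
    for _ in range(n_iters):
        # Sample candidate goal commands around the current distribution.
        goals = mean + std * rng.standard_normal((n_candidates, goal_dim))
        # Score each goal by where the policy is predicted to end up.
        scores = np.array([value_fn(final_state_fn(g)) for g in goals])
        top = int(np.argmax(scores))
        if scores[top] > best_score:
            best_goal, best_score = goals[top].copy(), float(scores[top])
        # Refit the sampling distribution to the highest-scoring goals.
        elites = goals[np.argsort(scores)[-n_elites:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return best_goal


goal = plan_exploratory_goal(toy_final_state, toy_exploration_value, goal_dim=2)
print("planned goal:", goal)
print("predicted final state:", toy_final_state(goal))
```

The point the sketch tries to capture is that a goal is scored by the state the current policy is predicted to reach when commanded with it, not by the goal itself, so an unreachable goal earns no more credit than the reachable state the agent would actually stop at. In the method as described, an exploration policy would then be launched from that state; the sketch omits this step.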


research
10/28/2022

Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning

Reinforcement learning (RL) often struggles to accomplish a sparse-rewar...
research
11/18/2021

Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning

Operating in the real-world often requires agents to learn about a compl...
research
05/21/2020

LEAF: Latent Exploration Along the Frontier

Self-supervised goal proposal and reaching is a key component for explor...
research
03/26/2023

Learning Generative Models with Goal-conditioned Reinforcement Learning

We present a novel, alternative framework for learning generative models...
research
01/30/2019

InfoBot: Transfer and Exploration via the Information Bottleneck

A central challenge in reinforcement learning is discovering effective p...
research
01/22/2020

GLIB: Exploration via Goal-Literal Babbling for Lifted Operator Learning

We address the problem of efficient exploration for learning lifted oper...
research
05/29/2019

Learning Navigation Subroutines by Watching Videos

Hierarchies are an effective way to boost sample efficiency in reinforce...
