Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward

04/30/2023
by Zihan Zhou, et al.

We propose Structured Exploration with Achievements (SEA), a multi-stage reinforcement learning algorithm designed for achievement-based environments, a type of environment with an internal achievement set. SEA proceeds in three stages. First, it uses offline data to learn a representation of the known achievements with a determinant loss function. Second, it recovers the dependency graph of the learned achievements with a heuristic algorithm. Finally, it interacts with the environment online to learn policies that master known achievements and explore new ones, using a controller built from the recovered dependency graph. We empirically demonstrate that SEA recovers the achievement structure accurately and improves exploration in hard domains such as Crafter, which are procedurally generated and have high-dimensional observations such as images.
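To make the graph-recovery stage more concrete, the following is a minimal sketch of one possible ordering-based heuristic. The abstract does not specify SEA's heuristic, so the rule used here is an assumption for illustration only: achievement A is treated as a prerequisite of B if A was unlocked before B in every trajectory that unlocks B. The function name `recover_dependencies` and the toy trajectories are hypothetical, with achievement names chosen in the style of Crafter.

```python
from itertools import chain


def recover_dependencies(trajectories):
    """Map each achievement to its inferred direct prerequisites.

    Assumed heuristic (not taken from the paper): A is a prerequisite
    of B if A is unlocked before B in every trajectory that unlocks B.
    Each trajectory is a list of achievement names in unlock order.
    """
    achievements = set(chain.from_iterable(trajectories))

    # Step 1: all inferred ancestors of each achievement.
    ancestors = {}
    for b in achievements:
        # Achievements unlocked before b, one set per trajectory containing b.
        before = [set(t[: t.index(b)]) for t in trajectories if b in t]
        ancestors[b] = set.intersection(*before)

    # Step 2: keep only direct edges by dropping any ancestor that is
    # already implied through another ancestor (transitive reduction).
    direct = {}
    for b, anc in ancestors.items():
        implied = set().union(*(ancestors[a] for a in anc))
        direct[b] = anc - implied
    return direct


if __name__ == "__main__":
    # Hypothetical offline data: unlock orderings from three episodes.
    toy_trajectories = [
        ["collect_wood", "place_table", "make_wood_pickaxe", "collect_stone"],
        ["collect_wood", "collect_sapling", "place_table", "make_wood_pickaxe"],
        ["collect_sapling", "collect_wood", "place_table"],
    ]
    for achievement, prereqs in sorted(recover_dependencies(toy_trajectories).items()):
        print(achievement, "<-", sorted(prereqs))
```

A rule like this can only rule dependencies out: with few trajectories it may report spurious edges that would disappear with more data. A controller of the kind the abstract describes could then favor achievements whose inferred prerequisites are already mastered.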

