Efficient Hierarchical Exploration with Stable Subgoal Representation Learning

05/31/2021
by Siyuan Li, et al.

Goal-conditioned hierarchical reinforcement learning (HRL) serves as a successful approach to solving complex and temporally extended tasks. Recently, its success has been extended to more general settings by concurrently learning hierarchical policies and subgoal representations. However, online subgoal representation learning exacerbates the non-stationarity issue of HRL and introduces challenges for exploration in high-level policy learning. In this paper, we propose a state-specific regularization that stabilizes subgoal embeddings in well-explored areas while allowing representation updates in less explored state regions. Benefiting from this stable representation, we design measures of novelty and potential for subgoals, and develop an efficient hierarchical exploration strategy that actively seeks out new promising subgoals and states. Experimental results show that our method significantly outperforms state-of-the-art baselines in continuous control tasks with sparse rewards. They further demonstrate the stability and efficiency of our subgoal representation learning, which promotes superior policy learning.

06/18/2019

Directed Exploration for Reinforcement Learning

Efficient exploration is necessary to achieve good sample efficiency for...
10/02/2018

Near-Optimal Representation Learning for Hierarchical Reinforcement Learning

We study the problem of representation learning in goal-conditioned hier...
10/26/2021

Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning

Goal-conditioned hierarchical reinforcement learning (HRL) has shown pro...
11/19/2018

Learning Actionable Representations with Goal-Conditioned Policies

Representation learning is a central challenge across a range of machine...
11/30/2018

Modulated Policy Hierarchies

Solving tasks with sparse rewards is a main challenge in reinforcement l...
09/28/2021

Exploratory State Representation Learning

Not having access to compact and meaningful representations is known to ...
01/21/2023

Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction

We study the problem of learning goal-conditioned policies in Minecraft,...