Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs

07/22/2023
by Qingyang Zhang, et al.

Goal-Conditioned Hierarchical Reinforcement Learning (GCHRL) is a promising paradigm for addressing the exploration-exploitation dilemma in reinforcement learning. It decomposes the source task into subgoal-conditioned subtasks and conducts exploration and exploitation in the subgoal space. The effectiveness of GCHRL relies heavily on the subgoal representation function and the subgoal selection strategy. However, existing works often overlook temporal coherence in GCHRL when learning latent subgoal representations, and they lack an efficient subgoal selection strategy that balances exploration and exploitation. This paper proposes HIerarchical reinforcement learning via dynamically building Latent Landmark graphs (HILL) to overcome these limitations. HILL learns latent subgoal representations that satisfy temporal coherence using a contrastive representation learning objective. Based on these representations, HILL dynamically builds latent landmark graphs and employs a novelty measure on nodes and a utility measure on edges. Finally, HILL develops a subgoal selection strategy that balances exploration and exploitation by jointly considering both measures. Experimental results demonstrate that HILL outperforms state-of-the-art baselines on continuous control tasks with sparse rewards in both sample efficiency and asymptotic performance. Our code is available at https://github.com/papercode2022/HILL.
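
To make the idea of jointly considering a node novelty measure and an edge utility measure more concrete, here is a minimal Python sketch. It is not HILL's actual algorithm: the count-based novelty formula, the additive score, the weight `beta`, and the names `novelty` and `select_subgoal` are all assumptions introduced for illustration, since the abstract does not define the measures or how they are combined.

```python
import math


def novelty(visit_count):
    """Assumed count-based novelty: rarely visited landmarks score higher."""
    return 1.0 / math.sqrt(visit_count + 1)


def select_subgoal(current, edges, visits, beta=1.0):
    """Pick the neighbouring landmark that maximises utility + beta * novelty.

    edges:  dict mapping (src, dst) -> utility estimate for that edge
    visits: dict mapping landmark -> visit count
    beta:   hypothetical trade-off weight; 0 means pure exploitation
    """
    # Keep only edges that leave the current landmark.
    candidates = {dst: u for (src, dst), u in edges.items() if src == current}
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda node: candidates[node] + beta * novelty(visits.get(node, 0)),
    )


# Toy graph: "b" has high utility but is well explored, "c" is unvisited.
edges = {("a", "b"): 0.9, ("a", "c"): 0.4}
visits = {"a": 10, "b": 24, "c": 0}

greedy = select_subgoal("a", edges, visits, beta=0.0)   # utility only
curious = select_subgoal("a", edges, visits, beta=1.0)  # utility plus novelty
```

With `beta=0.0` the selection follows edge utility alone, while a larger `beta` shifts it toward landmarks that have been visited less often. The released code at the URL above is the place to check the definitions the authors actually use.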


research
05/31/2021

Efficient Hierarchical Exploration with Stable Subgoal Representation Learning

Goal-conditioned hierarchical reinforcement learning (HRL) serves as a s...
research
10/26/2021

Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning

Goal-conditioned hierarchical reinforcement learning (HRL) has shown pro...
research
10/24/2022

MEET: A Monte Carlo Exploration-Exploitation Trade-off for Buffer Sampling

Data selection is essential for any data-based optimization technique, s...
research
05/31/2023

Representation-Driven Reinforcement Learning

We present a representation-driven framework for reinforcement learning....
research
08/21/2023

Diffusion Model as Representation Learner

Diffusion Probabilistic Models (DPMs) have recently demonstrated impress...
research
08/31/2022

Cell-Free Latent Go-Explore

In this paper, we introduce Latent Go-Explore (LGE), a simple and genera...
research
09/23/2022

Multi-Agent Exploration of an Unknown Sparse Landmark Complex via Deep Reinforcement Learning

In recent years Landmark Complexes have been successfully employed for l...
