Adversarially Guided Subgoal Generation for Hierarchical Reinforcement Learning

by Vivienne Huiling Wang, et al.

Hierarchical reinforcement learning (HRL) proposes to solve difficult tasks by performing decision-making and control at successively higher levels of temporal abstraction. However, off-policy training in HRL often suffers from non-stationary high-level decision making, because the low-level policy is constantly changing. In this paper, we propose a novel HRL approach that mitigates this non-stationarity by adversarially training the high-level policy to generate subgoals compatible with the current instantiation of the low-level policy. In practice, the adversarial learning can be implemented by training, concurrently with the high-level policy, a simple discriminator network that determines the compatibility level of subgoals. Experiments with state-of-the-art algorithms show that our approach significantly improves the learning efficiency and overall performance of HRL in various challenging continuous control tasks.
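
The discriminator idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the 1-D subgoal displacement, the logistic model, the choice of "displacements the low-level policy actually achieved" as positive examples, and every name below (`SubgoalDiscriminator`, `train_discriminator`, `adversarial_loss`) are assumptions made only for illustration. The sketch trains a discriminator to score subgoals, then shows the extra loss term a high-level policy could minimize so that it prefers subgoals the discriminator rates as compatible.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class SubgoalDiscriminator:
    """Hypothetical logistic model D(g) = sigmoid(w * g^2 + b).

    g is a 1-D subgoal displacement (assumption for this toy example).
    """

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def score(self, g):
        """Estimated probability that subgoal g is compatible."""
        return sigmoid(self.w * g * g + self.b)

    def update(self, g, label, lr=0.05):
        """One SGD step on binary cross-entropy for a single example."""
        err = self.score(g) - label  # gradient of the loss w.r.t. the logit
        self.w -= lr * err * g * g
        self.b -= lr * err


def train_discriminator(disc, reached, proposed, epochs=200, lr=0.05):
    """Label achieved displacements 1 and proposed subgoals 0 (assumed setup)."""
    for _ in range(epochs):
        for g in reached:
            disc.update(g, 1.0, lr)
        for g in proposed:
            disc.update(g, 0.0, lr)


def adversarial_loss(disc, g):
    """Non-saturating generator-style loss: small when D rates g as compatible."""
    return -math.log(max(disc.score(g), 1e-12))


# Toy data: the low-level policy only ever moved about one unit per
# high-level step, while the high-level policy proposed much larger jumps.
reached = [-1.0, -0.5, 0.0, 0.5, 1.0]
proposed = [3.0, -3.0, 4.0, -4.0]

disc = SubgoalDiscriminator()
train_discriminator(disc, reached, proposed)

for g in (0.5, 3.5):
    print(f"subgoal {g}: score={disc.score(g):.3f}, "
          f"adversarial loss={adversarial_loss(disc, g):.3f}")
```

In a full HRL agent, a term like `adversarial_loss` would be added to the high-level policy objective, and the discriminator would be retrained as the low-level policy changes; both details are left out of this sketch.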




SHIRO: Soft Hierarchical Reinforcement Learning

Hierarchical Reinforcement Learning (HRL) algorithms have been demonstra...

DHRL: A Graph-Based Approach for Long-Horizon and Sparse Hierarchical Reinforcement Learning

Hierarchical Reinforcement Learning (HRL) has made notable progress in c...

Temporal-adaptive Hierarchical Reinforcement Learning

Hierarchical reinforcement learning (HRL) helps address large-scale and ...

Hierarchical control and learning of a foraging CyberOctopus

Inspired by the unique neurophysiology of the octopus, we propose a hier...

Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning

Goal-conditioned hierarchical reinforcement learning (HRL) is a promisin...

Adjacency constraint for efficient hierarchical reinforcement learning

Goal-conditioned Hierarchical Reinforcement Learning (HRL) is a promisin...

Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control

In this paper, we propose a decision making algorithm intended for autom...