Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization

01/05/2019
by Takayuki Osa, et al.

Real-world tasks are often highly structured. Hierarchical reinforcement learning (HRL) has attracted research interest as an approach for leveraging the hierarchical structure of a given task in reinforcement learning (RL). However, identifying the hierarchical policy structure that enhances the performance of RL is not a trivial task. In this paper, we propose an HRL method that learns a latent variable of a hierarchical policy using mutual information maximization. Our approach can be interpreted as a way to learn a discrete and latent representation of the state-action space. To learn option policies that correspond to modes of the advantage function, we introduce advantage-weighted importance sampling. In our HRL method, the gating policy learns to select option policies based on an option-value function, and these option policies are optimized based on the deterministic policy gradient method. This framework is derived by leveraging the analogy between a monolithic policy in standard RL and a hierarchical policy in HRL by using a deterministic option policy. Experimental results indicate that our HRL approach can learn a diversity of options and that it can enhance the performance of RL in continuous control tasks.
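The hierarchy described above, in which a gating policy picks among deterministic option policies according to an option-value function, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two linear option policies, the stand-in `critic`, the choice to score an option by evaluating the critic at that option's action, and the softmax form of the gating distribution.

```python
import math

# Hypothetical toy setup: scalar state and action, two deterministic options.
# None of these functions come from the paper; they only illustrate the structure.

def option_policies():
    # Each option maps a state to exactly one action (deterministic option policy).
    return [
        lambda s: 0.5 * s + 1.0,    # option 0
        lambda s: -0.5 * s - 1.0,   # option 1
    ]

def critic(s, a):
    # Stand-in action-value function with two modes, near a = +1 and a = -1.
    return max(-(a - 1.0) ** 2, -(a + 1.0) ** 2 - 0.5) + 0.1 * s * a

def option_values(s, options, q):
    # Assumed option-value: the critic evaluated at each option's action.
    return [q(s, mu(s)) for mu in options]

def gating_probabilities(values, temperature=1.0):
    # Assumed gating rule: numerically stable softmax over option values.
    m = max(values)
    exps = [math.exp((v - m) / temperature) for v in values]
    z = sum(exps)
    return [e / z for e in exps]

def act(s, options, q, temperature=1.0):
    # Greedy selection: take the most probable option, then its action.
    probs = gating_probabilities(option_values(s, options, q), temperature)
    o = max(range(len(options)), key=lambda i: probs[i])
    return o, options[o](s)

if __name__ == "__main__":
    opts = option_policies()
    print(option_values(0.0, opts, critic))
    print(act(0.0, opts, critic))
```

With a deterministic option policy, scoring an option reduces to a single critic evaluation per option, which is one way to read the analogy the abstract draws between a monolithic policy and a hierarchical one.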

research
11/28/2017

Hierarchical Policy Search via Return-Weighted Density Estimation

Learning an optimal policy from a multi-modal reward function is a chall...
research
12/06/2021

Flexible Option Learning

Temporal abstraction in reinforcement learning (RL) offers the promise ...
research
10/06/2021

Nested Policy Reinforcement Learning

Off-policy reinforcement learning (RL) has proven to be a powerful frame...
research
03/12/2021

Discovering Diverse Solutions in Deep Reinforcement Learning

Reinforcement learning (RL) algorithms are typically limited to learning...
research
06/04/2019

Options as responses: Grounding behavioural hierarchies in multi-agent RL

We propose a novel hierarchical agent architecture for multi-agent reinf...
research
02/15/2019

Reinforcement Learning Without Backpropagation or a Clock

In this paper we introduce a reinforcement learning (RL) approach for tr...
research
10/01/2019

Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning

In this paper, we aim to develop a simple and scalable reinforcement lea...
