Exploiting Hierarchy for Learning and Transfer in KL-regularized RL

03/18/2019
by   Dhruva Tirumala, et al.
20

As reinforcement learning agents are tasked with solving more challenging and diverse tasks, the ability to incorporate prior knowledge into the learning system and to exploit reusable structure in solution space is likely to become increasingly important. The KL-regularized expected reward objective constitutes one possible tool to this end. It introduces an additional component, a default or prior behavior, which can be learned alongside the policy and as such partially transforms the reinforcement learning problem into one of behavior modelling. In this work we consider the implications of this framework in cases where both the policy and default behavior are augmented with latent variables. We discuss how the resulting hierarchical structures can be used to implement different inductive biases and how their modularity can benefit transfer. Empirically we find that they can lead to faster learning and transfer on a range of continuous control tasks.

READ FULL TEXT

page 6

page 20

research
05/03/2019

Information asymmetry in KL-regularized RL

Many real world tasks exhibit rich structure that is repeated across dif...
research
10/27/2020

Behavior Priors for Efficient Reinforcement Learning

As we deploy reinforcement learning agents to solve increasingly challen...
research
01/20/2022

Priors, Hierarchy, and Information Asymmetry for Skill Transfer in Reinforcement Learning

The ability to discover behaviours from past experience and transfer the...
research
12/02/2022

Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning

In reinforcement learning (RL), the ability to utilize prior knowledge f...
research
04/09/2018

Latent Space Policies for Hierarchical Reinforcement Learning

We address the problem of learning hierarchical deep neural network poli...
research
11/04/2021

Towards an Understanding of Default Policies in Multitask Policy Optimization

Much of the recent success of deep reinforcement learning has been drive...
research
09/10/2020

Importance Weighted Policy Learning and Adaption

The ability to exploit prior experience to solve novel problems rapidly ...

Please sign up or login with your details

Forgot password? Click here to reset