Disentangling causal effects for hierarchical reinforcement learning

10/03/2020
by Oriol Corcoll, et al.
Exploration and credit assignment under sparse rewards remain challenging problems. We argue that these challenges arise in part from the intrinsic rigidity of operating at the level of actions. Actions can precisely define how to perform an activity but are ill-suited to describe what activity to perform. Causal effects, in contrast, are inherently composable and temporally abstract, making them ideal for descriptive tasks. By leveraging a hierarchy of causal effects, this study aims to expedite the learning of task-specific behavior and to aid exploration. Borrowing counterfactual and normality measures from the causality literature, we disentangle controllable effects from effects caused by other dynamics of the environment. We propose CEHRL, a hierarchical method that models the distribution of controllable effects using a Variational Autoencoder. A high-level policy uses this distribution to (1) explore the environment via random effect exploration, so that novel effects are continuously discovered and learned, and (2) learn task-specific behavior by prioritizing the effects that maximize a given reward function. Experimental results show that random effect exploration is a more efficient mechanism than exploring with random actions, and that by assigning credit to a few effects rather than many actions, CEHRL learns tasks more rapidly.
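To make the idea of separating controllable effects from other environment dynamics concrete, here is a minimal sketch in a toy setting. It is not CEHRL's implementation: the two-dimensional state, the `step` dynamics, the action set, and the choice of the mean counterfactual effect as the "normal" baseline are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy action set: each action shifts the agent along one axis.
ACTIONS = {"left": -1, "stay": 0, "right": 1}


def step(state, action):
    """Toy dynamics (assumed): state = [agent_x, clock].

    The clock ticks on every step regardless of the chosen action,
    so it stands in for dynamics the agent does not control.
    """
    agent_x, clock = state
    return np.array([agent_x + ACTIONS[action], clock + 1])


def total_effect(state, action):
    """Observed change in the state after taking `action`."""
    return step(state, action) - state


def controllable_effect(state, action):
    """Total effect minus a counterfactual baseline.

    Assumption: the baseline ("normal" effect) is the mean effect over
    all actions that could have been taken in the same state.
    """
    baseline = np.mean([total_effect(state, a) for a in ACTIONS], axis=0)
    return total_effect(state, action) - baseline


state = np.array([5, 0])
for name in ACTIONS:
    print(name, total_effect(state, name), controllable_effect(state, name))
```

In this toy case the clock component cancels out of the controllable effect for every action, while the agent's displacement remains, which is the kind of separation the abstract describes.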

research
12/22/2022

Towards Causal Credit Assignment

Adequately assigning credit to actions for future outcomes based on thei...
research
06/01/2021

Did I do that? Blame as a means to identify controlled effects in reinforcement learning

Modeling controllable aspects of the environment enable better prioritiz...
research
05/13/2019

Learning and Exploiting Multiple Subgoals for Fast Exploration in Hierarchical Reinforcement Learning

Hierarchical Reinforcement Learning (HRL) exploits temporally extended a...
research
10/06/2020

Safety Aware Reinforcement Learning (SARL)

As reinforcement learning agents become increasingly integrated into com...
research
02/26/2018

Disentangling the independently controllable factors of variation by interacting with the world

It has been postulated that a good representation is one that disentangl...
research
08/03/2017

Independently Controllable Factors

It has been postulated that a good representation is one that disentangl...
research
06/12/2019

Fast Task Inference with Variational Intrinsic Successor Features

It has been established that diverse behaviors spanning the controllable...
