Interpretable Reinforcement Learning with Multilevel Subgoal Discovery

02/15/2022
by Alexander Demin, et al.

We propose a novel reinforcement learning model for discrete environments that is inherently interpretable and supports the discovery of deep subgoal hierarchies. In the model, an agent learns information about the environment in the form of probabilistic rules, while policies for (sub)goals are learned as combinations of those rules. No reward function is required for learning; the agent only needs to be given a primary goal to achieve. Subgoals of a goal G in the hierarchy are computed as descriptions of states which, if achieved beforehand, increase the total efficiency of the available policies for G. These state descriptions are introduced as new sensor predicates into the agent's rule language, which allows it to sense important intermediate states and to update environment rules and policies accordingly.
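The subgoal-discovery criterion described above (score candidate state descriptions by how much previously achieving them raises the agent's success on the primary goal) can be illustrated with a small sketch. Everything below is a hypothetical construction for illustration: the toy key-and-door environment, the predicate names, and the `lift` score are assumptions, not the authors' algorithm.

```python
import random

# Toy discrete environment (hypothetical): the agent's primary goal G is to
# open a door, which only succeeds if a key was picked up first. Each episode
# records which boolean "sensor predicates" held at some point.

def run_episode(rng, steps=10):
    """Random-exploration agent; returns (achieved_goal, predicates_seen)."""
    has_key = False
    seen = set()
    for _ in range(steps):
        action = rng.choice(["pick_key", "open_door", "wander"])
        if action == "pick_key":
            has_key = True
            seen.add("has_key")
        elif action == "open_door" and has_key:
            return True, seen  # goal G achieved
        elif action == "wander":
            seen.add("wandered")
    return False, seen

def discover_subgoal(episodes=5000, seed=0):
    """Score each candidate predicate p by how much previously achieving p
    raises the empirical success rate for goal G; return the best candidate.
    This mirrors the paper's idea of selecting state descriptions that
    increase the efficiency of the available policies for G."""
    rng = random.Random(seed)
    # predicate -> [successes with p, count with p, successes without p, count without p]
    stats = {}
    for _ in range(episodes):
        ok, seen = run_episode(rng)
        for p in ("has_key", "wandered"):
            s = stats.setdefault(p, [0, 0, 0, 0])
            if p in seen:
                s[0] += ok
                s[1] += 1
            else:
                s[2] += ok
                s[3] += 1
    def lift(s):
        rate_with = s[0] / max(s[1], 1)
        rate_without = s[2] / max(s[3], 1)
        return rate_with - rate_without
    return max(stats, key=lambda p: lift(stats[p]))
```

In this sketch, `has_key` is selected as the subgoal because success on G is impossible without it, so its lift is maximal; in the paper's terms, the winning state description would then become a new sensor predicate in the agent's rule language.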

Related research

08/08/2017: Investigating Reinforcement Learning Agents for Continuous State Space Environments
  Given an environment with continuous state spaces and discrete actions, ...

05/01/2022: Learning user-defined sub-goals using memory editing in reinforcement learning
  The aim of reinforcement learning (RL) is to allow the agent to achieve ...

09/12/2023: Goal Space Abstraction in Hierarchical Reinforcement Learning via Reachability Analysis
  Open-ended learning benefits immensely from the use of symbolic methods ...

02/21/2020: Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration
  Autonomous reinforcement learning agents must be intrinsically motivated...

07/24/2019: Unsupervised Discovery of Decision States for Transfer in Reinforcement Learning
  We present a hierarchical reinforcement learning (HRL) or options framew...

10/18/2019: RTFM: Generalising to Novel Environment Dynamics via Reading
  Obtaining policies that can generalise to new environments in reinforcem...

06/22/2019: A neurally plausible model learns successor representations in partially observable environments
  Animals need to devise strategies to maximize returns while interacting ...
