Boolean Decision Rules for Reinforcement Learning Policy Summarisation

07/18/2022
by James McCarthy, et al.

Explainability of Reinforcement Learning (RL) policies remains a challenging research problem, particularly when considering RL in a safety context. Understanding the decisions and intentions of an RL policy offers avenues to incorporate safety into the policy by limiting undesirable actions. We propose the use of a Boolean Decision Rules model to create a post-hoc rule-based summary of an agent's policy. We evaluate our proposed approach using a DQN agent trained on an implementation of a lava gridworld and show that, given a hand-crafted feature representation of this gridworld, simple generalised rules can be created, giving a post-hoc explainable summary of the agent's policy. We discuss possible avenues to introduce safety into an RL agent's policy by using rules generated by this rule-based model as constraints imposed on the agent's policy, and we discuss how creating simple rule summaries of an agent's policy may help in the debugging process of RL agents.
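
As a rough illustration of what a post-hoc rule-based summary can look like, the sketch below evaluates a Boolean rule in disjunctive normal form (an OR of AND-clauses) over binary state features and measures how often it agrees with a policy's action choice. Everything in it is an assumption made for illustration: the feature names (`goal_right`, `lava_right`), the rule itself, and `toy_policy`, which stands in for a trained DQN's greedy action. It does not reproduce the rules or features reported in the paper, and it does not show how such rules are learned.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]   # hand-crafted binary features (hypothetical names)
Clause = List[str]        # AND of literals; a leading "!" negates the feature
DNF = List[Clause]        # OR of clauses


def clause_holds(clause: Clause, state: State) -> bool:
    """True when every literal in the conjunction is satisfied by the state."""
    return all(
        (not state[lit[1:]]) if lit.startswith("!") else state[lit]
        for lit in clause
    )


def rule_fires(dnf: DNF, state: State) -> bool:
    """True when at least one clause of the DNF rule holds."""
    return any(clause_holds(clause, state) for clause in dnf)


def fidelity(dnf: DNF, action: str, states: List[State],
             policy: Callable[[State], str]) -> float:
    """Fraction of states where 'rule fires' matches 'policy picks action'."""
    agree = sum(rule_fires(dnf, s) == (policy(s) == action) for s in states)
    return agree / len(states)


# Hypothetical rule: move right IF goal_right AND NOT lava_right.
RIGHT_RULE: DNF = [["goal_right", "!lava_right"]]


def toy_policy(state: State) -> str:
    """Stand-in for a trained agent's greedy action (assumption)."""
    if state["goal_right"] and not state["lava_right"]:
        return "right"
    return "up"


STATES: List[State] = [
    {"goal_right": True, "lava_right": False},
    {"goal_right": True, "lava_right": True},
    {"goal_right": False, "lava_right": False},
]

print(fidelity(RIGHT_RULE, "right", STATES, toy_policy))
```

A rule in this form could also serve as a constraint of the kind the abstract mentions, for example by checking a clause such as `["lava_right"]` before allowing the `right` action, though how the authors would impose such constraints is not specified here.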

research
12/28/2022

Don't do it: Safer Reinforcement Learning With Rule-based Guidance

During training, reinforcement learning systems interact with the world ...
research
01/21/2022

Reinforcement Learning Your Way: Agent Characterization through Policy Regularization

The increased complexity of state-of-the-art reinforcement learning (RL)...
research
05/17/2023

A Genetic Fuzzy System for Interpretable and Parsimonious Reinforcement Learning Policies

Reinforcement learning (RL) is experiencing a resurgence in research int...
research
06/10/2021

Synthesising Reinforcement Learning Policies through Set-Valued Inductive Rule Learning

Today's advanced Reinforcement Learning algorithms produce black-box pol...
research
05/26/2020

Efficient Use of heuristics for accelerating XCS-based Policy Learning in Markov Games

In Markov games, playing against non-stationary opponents with learning ...
research
05/06/2023

Explaining RL Decisions with Trajectories

Explanation is a key component for the adoption of reinforcement learnin...
research
11/22/2021

Bridging the gap between learning and heuristic based pushing policies

Non-prehensile pushing actions have the potential to singulate a target ...
