Synthesising Reinforcement Learning Policies through Set-Valued Inductive Rule Learning

06/10/2021
by Youri Coppens, et al.

Today's advanced Reinforcement Learning algorithms produce black-box policies that are often difficult for a person to interpret and trust. We introduce a policy distillation algorithm, building on the CN2 rule mining algorithm, that distills the policy into a rule-based decision system. At the core of our approach is the fact that an RL process does not just learn a policy, a mapping from states to actions, but also produces extra meta-information, such as action values indicating the quality of alternative actions. This meta-information can indicate whether more than one action is near-optimal for a certain state. We extend CN2 so that it can leverage knowledge about equally good actions to distill the policy into fewer rules, making it easier for a person to interpret. Then, to ensure that the rules explain a valid, non-degenerate policy, we introduce a refinement algorithm that fine-tunes the rules to obtain good performance when executed in the environment. We demonstrate the applicability of our algorithm on the Mario AI benchmark, a complex task that requires modern reinforcement learning algorithms including neural networks. The explanations we produce capture the learned policy in only a few rules that allow a person to understand what the black-box agent learned. Source code: https://gitlab.ai.vub.ac.be/yocoppen/svcn2
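
To make the idea of "equally good actions" concrete, the following minimal sketch shows one way action values could be turned into set-valued labels, where each state is labelled with every action whose value lies close to the best one. It is an illustration under stated assumptions, not the paper's implementation: the function name `near_optimal_actions`, the absolute `tolerance` threshold, and the toy Q-table are all hypothetical.

```python
import numpy as np


def near_optimal_actions(q_values, tolerance=0.05):
    """Label each state with the set of actions whose value is within
    `tolerance` of the best action value in that state.

    `tolerance` is an assumed absolute threshold chosen for illustration;
    the abstract does not specify how near-optimality is measured.
    """
    q = np.asarray(q_values, dtype=float)      # shape: (n_states, n_actions)
    best = q.max(axis=1, keepdims=True)        # best action value per state
    mask = q >= best - tolerance               # True where an action is near-optimal
    return [set(np.flatnonzero(row).tolist()) for row in mask]


# Toy Q-table (hypothetical values): 3 states, 3 actions.
q_table = [
    [1.00, 0.98, 0.20],  # actions 0 and 1 are nearly tied
    [0.10, 0.90, 0.30],  # action 1 is clearly best
    [0.50, 0.50, 0.49],  # all three actions are nearly tied
]

labels = near_optimal_actions(q_table, tolerance=0.05)
print(labels)
```

A rule learner that accepts such sets as labels may count a rule as correct on a state whenever the rule's action is in that state's set, which is what would allow one rule to cover states that a single-label learner would have to split.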


