Reinforcement Learning from a Mixture of Interpretable Experts

06/10/2020
by Riad Akrour, et al.

Reinforcement learning (RL) has demonstrated its ability to solve high-dimensional tasks by leveraging non-linear function approximators. These successes, however, are mostly achieved by 'black-box' policies in simulated domains. Deploying RL in the real world raises several concerns about the use of a 'black-box' policy. In an effort to make the policies learned by RL more transparent, we propose in this paper a policy iteration scheme that retains a complex function approximator for its internal value predictions but constrains the policy to have a concise, hierarchical, and human-readable structure, based on a mixture of interpretable experts. We show that our proposed algorithm can learn compelling policies on continuous-action deep RL benchmarks, matching the performance of neural-network-based policies, while returning policies that are more amenable to human inspection than neural network or linear-in-feature policies.

