Experiential Explanations for Reinforcement Learning

10/10/2022
by Amal Alabdulkarim, et al.

Reinforcement Learning (RL) approaches are becoming increasingly popular in various key disciplines, including robotics and healthcare. However, many of these systems are complex and non-interpretable, making it challenging for non-AI experts to understand or intervene. One of the challenges of explaining RL agent behavior is that, when learning to predict future expected reward, agents discard contextual information about their experiences in the training environment and rely solely on expected utility. We propose a technique, Experiential Explanations, for generating local counterfactual explanations that can answer users' why-not questions by explaining qualitatively the effects of the various environmental rewards on the agent's behavior. We achieve this by training additional models alongside the policy. These models, called influence predictors, capture how different reward sources influence the agent's policy, thus restoring lost contextual information about how the policy reflects the environment. To generate explanations, we use these models in addition to the policy to contrast the agent's intended trajectory with a counterfactual trajectory suggested by the user.
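
To make the idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tabular setting in which each influence predictor is simply the average discounted return from one reward source, estimated per state from logged episodes. The function names (`fit_influence_predictors`, `trajectory_influence`, `contrast`), the toy states, and the reward sources `goal` and `hazard` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fit_influence_predictors(episodes, sources, gamma=0.9):
    """Estimate, per reward source, the average discounted return from each state.

    episodes: list of episodes; each episode is a list of (state, rewards) steps,
    where rewards maps a source name to the reward received in that state.
    This every-visit Monte Carlo estimate is an assumption made for this sketch.
    """
    totals = {src: defaultdict(float) for src in sources}
    counts = defaultdict(int)
    for episode in episodes:
        returns = {src: 0.0 for src in sources}
        # Walk backwards so each step's return reuses the return of the next step.
        for state, rewards in reversed(episode):
            for src in sources:
                returns[src] = rewards.get(src, 0.0) + gamma * returns[src]
                totals[src][state] += returns[src]
            counts[state] += 1
    return {
        src: {s: totals[src][s] / counts[s] for s in counts} for src in sources
    }


def trajectory_influence(predictors, trajectory):
    """Sum each source's predicted influence over the states of a trajectory."""
    return {
        src: sum(table.get(s, 0.0) for s in trajectory)
        for src, table in predictors.items()
    }


def contrast(predictors, intended, counterfactual):
    """Per-source influence of the counterfactual minus that of the intended path."""
    a = trajectory_influence(predictors, intended)
    b = trajectory_influence(predictors, counterfactual)
    return {src: b[src] - a[src] for src in predictors}


# Toy experience: two routes from A to the goal G, one passing a hazard H.
episodes = [
    [("A", {}), ("B", {}), ("G", {"goal": 1.0})],
    [("A", {}), ("H", {"hazard": -1.0}), ("G", {"goal": 1.0})],
]
predictors = fit_influence_predictors(episodes, ["goal", "hazard"], gamma=0.5)

# The agent intends A -> B -> G; the user asks "why not A -> H -> G?".
diff = contrast(predictors, ["A", "B", "G"], ["A", "H", "G"])
print(diff)
```

In this toy case the two routes have the same `goal` influence, while the user's route has a lower `hazard` influence. A qualitative explanation could be built from that difference, for example that the agent avoids the suggested path because of the hazard reward source.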

research
02/24/2023

GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations

Counterfactual explanations are a common tool to explain artificial inte...
research
01/29/2022

Explaining Reinforcement Learning Policies through Counterfactual Trajectories

In order for humans to confidently decide where to employ RL agents for ...
research
03/08/2023

RACCER: Towards Reachable and Certain Counterfactual Explanations for Reinforcement Learning

While reinforcement learning (RL) algorithms have been successfully appl...
research
10/21/2022

Integrating Policy Summaries with Reward Decomposition for Explaining Reinforcement Learning Agents

Explaining the behavior of reinforcement learning agents operating in se...
research
07/23/2018

Contrastive Explanations for Reinforcement Learning in terms of Expected Consequences

Machine Learning models become increasingly proficient in complex tasks....
research
11/10/2020

What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes

We present a novel form of explanation for Reinforcement Learning, based...
research
11/17/2022

Explainability Via Causal Self-Talk

Explaining the behavior of AI systems is an important problem that, in p...
