Counterfactual State Explanations for Reinforcement Learning Agents via Generative Deep Learning

01/29/2021
by   Matthew L. Olson, et al.

Counterfactual explanations, which deal with "why not?" scenarios, can provide insightful explanations of an AI agent's behavior. In this work, we focus on generating counterfactual explanations for deep reinforcement learning (RL) agents that operate in visual input environments like Atari. We introduce counterfactual state explanations, a novel example-based approach to counterfactual explanations based on generative deep learning. Specifically, a counterfactual state illustrates what minimal change to an Atari game image is needed for the agent to choose a different action. We also evaluate the effectiveness of counterfactual states with human participants who are not machine learning experts. Our first user study investigates whether humans can discern whether counterfactual state explanations are produced by the actual game or by a generative deep learning approach. Our second user study investigates whether counterfactual state explanations can help non-expert participants identify a flawed agent; we compare against a baseline approach based on a nearest neighbor explanation, which uses images from the actual game. Our results indicate that counterfactual state explanations have sufficient fidelity to the actual game images to enable non-experts to identify a flawed RL agent more effectively than with the nearest neighbor baseline or with no explanation at all.
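The core idea, finding a minimal change to the input such that the agent selects a different action, can be illustrated with a toy sketch. The paper itself trains a deep generative model over Atari frames; the snippet below is only a hypothetical stand-in, assuming a linear policy over a 16-dimensional "state" and a simple gradient search toward the decision boundary (all names and weights here are invented for illustration):

```python
import numpy as np

# Toy illustration of a counterfactual state: perturb the input just
# enough that the agent's chosen action changes. A hypothetical linear
# policy over a 16-pixel state stands in for a deep RL agent.
rng = np.random.default_rng(0)
n_pixels, n_actions = 16, 4
W = rng.normal(size=(n_actions, n_pixels))  # hypothetical policy weights

def agent_action(state):
    """Greedy action of the toy policy: argmax over linear logits."""
    return int(np.argmax(W @ state))

def counterfactual_state(state, target, step=0.1, iters=5000):
    """Take small steps that raise the target action's logit relative to
    the currently chosen action, stopping as soon as the agent's choice
    flips, so the accumulated perturbation stays near the boundary."""
    s = state.copy()
    for _ in range(iters):
        current = agent_action(s)
        if current == target:
            break
        direction = W[target] - W[current]
        s = s + step * direction / np.linalg.norm(direction)
    return s

state = rng.uniform(0.0, 1.0, size=n_pixels)  # a random "game image"
original = agent_action(state)
target = (original + 1) % n_actions           # ask for any other action
cf = counterfactual_state(state, target)
```

For real Atari frames, pixel-space search like this tends to produce adversarial noise rather than meaningful images, which is why the paper instead generates counterfactual states with a deep generative model so the result stays on the manifold of plausible game frames.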



Related research

09/27/2019 | Counterfactual States for Atari Agents via Generative Deep Learning
Although deep reinforcement learning agents have produced impressive res...

02/24/2023 | GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations
Counterfactual explanations are a common tool to explain artificial inte...

12/01/2022 | Decisions that Explain Themselves: A User-Centric Deep Reinforcement Learning Explanation System
With deep reinforcement learning (RL) systems like autonomous driving be...

10/21/2022 | Counterfactual Explanations for Reinforcement Learning
While AI algorithms have shown remarkable success in various fields, the...

04/16/2021 | MEG: Generating Molecular Counterfactual Explanations for Deep Graph Networks
Explainable AI (XAI) is a research area whose objective is to increase t...

01/28/2020 | Distal Explanations for Explainable Reinforcement Learning Agents
Causal explanations present an intuitive way to understand the course of...

02/05/2020 | `Why not give this work to them?' Explaining AI-Moderated Task-Allocation Outcomes using Negotiation Trees
The problem of multi-agent task allocation arises in a variety of scenar...
