Counterfactual Explanations for Reinforcement Learning

10/21/2022
by Jasmina Gajcin, et al.

While AI algorithms have shown remarkable success in various fields, their lack of transparency hinders their application to real-life tasks. Although explanations targeted at non-experts are necessary for user trust and human-AI collaboration, the majority of explanation methods for AI are aimed at developers and expert users. Counterfactual explanations are local explanations that advise users on what can be changed in the input to change the output of the black-box model. Counterfactuals are user-friendly and provide actionable advice for achieving the desired output from the AI system. Although counterfactuals have been extensively researched in supervised learning, few methods apply them to reinforcement learning (RL). In this work, we explore the reasons for the underrepresentation of this powerful explanation method in RL. We start by reviewing the current work on counterfactual explanations in supervised learning. We then explore the differences between counterfactual explanations in supervised learning and RL and identify the main challenges that prevent the adoption of supervised learning methods in RL. Finally, we redefine counterfactuals for RL and propose research directions for implementing counterfactuals in RL.
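
To make the definition of a counterfactual explanation concrete, here is a minimal sketch of the supervised-learning setting. It is not the method of the paper: the `black_box` model, the `income` and `debt` features, and the brute-force search are all hypothetical, chosen only to illustrate finding the smallest input change that flips a model's output.

```python
from itertools import product

def black_box(income, debt):
    """Hypothetical stand-in for an opaque model (assumption, not from the paper)."""
    return "approve" if income - 2 * debt >= 50 else "reject"

def nearest_counterfactual(income, debt, target, step=5, max_steps=10):
    """Brute-force search over small feature changes.

    Returns (cost, d_income, d_debt) for the change with the smallest
    L1 distance that makes the model output `target`, or None if no
    change within the search grid achieves it.
    """
    best = None
    deltas = [k * step for k in range(-max_steps, max_steps + 1)]
    for d_income, d_debt in product(deltas, deltas):
        if black_box(income + d_income, debt + d_debt) != target:
            continue
        cost = abs(d_income) + abs(d_debt)
        if best is None or cost < best[0]:
            best = (cost, d_income, d_debt)
    return best

# The original input is rejected; ask what minimal change would yield approval.
result = nearest_counterfactual(income=60, debt=10, target="approve")
print(result)  # (5, 0, -5): lowering debt by 5 would change the output
```

The result reads as actionable advice in the sense described above: "had your debt been 5 lower, the output would have been approve." In RL, where an output depends on a state and a sequence of decisions, this kind of single-input search does not transfer directly, which is the gap the abstract points to.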

research
03/08/2023

RACCER: Towards Reachable and Certain Counterfactual Explanations for Reinforcement Learning

While reinforcement learning (RL) algorithms have been successfully appl...
research
01/29/2021

Counterfactual State Explanations for Reinforcement Learning Agents via Generative Deep Learning

Counterfactual explanations, which deal with "why not?" scenarios, can p...
research
08/03/2021

Accelerating the Convergence of Human-in-the-Loop Reinforcement Learning with Counterfactual Explanations

The capability to interactively learn from human feedback would enable r...
research
12/01/2022

Decisions that Explain Themselves: A User-Centric Deep Reinforcement Learning Explanation System

With deep reinforcement learning (RL) systems like autonomous driving be...
research
07/25/2023

Counterfactual Explanation Policies in RL

As Reinforcement Learning (RL) agents are increasingly employed in diver...
research
01/27/2023

Even if Explanations: Prior Work, Desiderata & Benchmarks for Semi-Factual XAI

Recently, eXplainable AI (XAI) research has focused on counterfactual ex...
research
07/18/2020

Quick Question: Interrupting Users for Microtasks with Reinforcement Learning

Human attention is a scarce resource in modern computing. A multitude of...
