Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach

01/21/2020
by Carlos Fernandez et al.

Lack of understanding of the decisions made by model-based AI systems is an important barrier to their adoption. We examine counterfactual explanations as an alternative for explaining AI decisions. The counterfactual approach defines an explanation as a set of the system's data inputs that causally drives the decision (meaning that removing them changes the decision) and is irreducible (meaning that removing any subset of the inputs in the explanation does not change the decision). We generalize previous work on counterfactual explanations, resulting in a framework that (a) is model-agnostic, (b) can address features with arbitrary data types, (c) is able to explain decisions made by complex AI systems that incorporate multiple models, and (d) is scalable to large numbers of features. We also propose a heuristic procedure to find the most useful explanations depending on the context. We contrast counterfactual explanations with another alternative: methods that explain model predictions by weighting features according to their importance (e.g., SHAP, LIME). This paper presents two fundamental reasons why explaining model predictions is not the same as explaining the decisions made using those predictions, suggesting we should carefully consider whether importance-weight explanations are well-suited to explain decisions made by AI systems. Specifically, we show that (1) features that have a large importance weight for a model prediction may not actually affect the corresponding decision, and (2) importance weights are insufficient to communicate whether and how features influence system decisions. We demonstrate this using several examples, including three detailed studies using real-world data that compare the counterfactual approach with SHAP and illustrate various conditions under which counterfactual explanations explain data-driven decisions better than feature importance weights.
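
To make the definition above concrete, here is a minimal Python sketch of a counterfactual-explanation search. It is illustrative only, not the authors' implementation: the names (`decision`, `find_explanation`, `baseline`) are assumptions, "removing" a feature is simulated by replacing its value with a baseline value, and the brute-force search is exponential in the number of features, which is exactly why the paper proposes a scalable heuristic instead.

```python
from itertools import combinations

def decision(x, score, threshold=0.5):
    """System decision: take the action iff the model score clears the threshold."""
    return score(x) >= threshold

def find_explanation(x, score, baseline, threshold=0.5):
    """Find a smallest set of features whose removal flips the decision on x.

    The returned set causally drives the decision (removing all of its
    features changes the decision) and is irreducible: because sets are
    tried in order of increasing size, every proper subset of the returned
    set was already tried and failed to change the decision.
    """
    original = decision(x, score, threshold)
    features = list(x)
    for size in range(1, len(features) + 1):        # smallest sets first
        for subset in combinations(features, size):
            x_removed = dict(x)
            for f in subset:
                x_removed[f] = baseline[f]          # "remove" feature f
            if decision(x_removed, score, threshold) != original:
                return subset                       # irreducible by construction
    return None  # no set of removals changes the decision

# Hypothetical usage: a loan denial driven by a thresholded risk score.
applicant = {"income": 30_000, "debt": 25_000, "age": 41}
baseline  = {"income": 55_000, "debt": 5_000, "age": 41}   # e.g., population defaults
risk = lambda x: 0.9 if x["debt"] / max(x["income"], 1) > 0.5 else 0.1
print(find_explanation(applicant, risk, baseline))          # -> ('income',)
```

This sketch also illustrates the abstract's first point of contrast with importance weights: a feature can move the model's score substantially and thus earn a large SHAP or LIME weight, yet if the score stays on the same side of the decision threshold with or without it, that feature never appears in a counterfactual explanation of the decision.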


Related research

Explainable Image Classification with Evidence Counterfactual (04/16/2020)
The complexity of state-of-the-art modeling techniques for image classif...

Alterfactual Explanations – The Relevance of Irrelevance for Explaining AI Systems (07/19/2022)
Explanation mechanisms from the field of Counterfactual Thinking are a w...

Explainable Data-Driven Optimization: From Context to Decision and Back Again (01/24/2023)
Data-driven optimization uses contextual information and machine learnin...

Even if Explanations: Prior Work, Desiderata & Benchmarks for Semi-Factual XAI (01/27/2023)
Recently, eXplainable AI (XAI) research has focused on counterfactual ex...

An exact counterfactual-example-based approach to tree-ensemble models interpretability (05/31/2021)
Explaining the decisions of machine learning models is becoming a necess...

Model Agnostic Local Explanations of Reject (05/16/2022)
The application of machine learning based decision making systems in saf...

Beyond Known Reality: Exploiting Counterfactual Explanations for Medical Research (07/05/2023)
This study employs counterfactual explanations to explore "what if?" sce...
