Preserving Causal Constraints in Counterfactual Explanations for Machine Learning Classifiers

12/06/2019
by Divyat Mahajan, et al.

Explaining the output of a complex machine learning (ML) model often requires approximating it with a simpler model. To construct interpretable explanations that are also consistent with the original ML model, counterfactual examples, which show how the model's output changes with small perturbations to the input, have been proposed. This paper extends work on counterfactual explanations by addressing the feasibility of such examples. For explanations of ML models in critical domains such as healthcare and finance, counterfactual examples are useful to an end user only to the extent that the perturbation of input features is feasible in the real world. We formulate feasibility as preserving causal relationships among input features and present a method that uses (partial) structural causal models to generate actionable counterfactuals. When feasibility constraints cannot be easily expressed, we propose an alternative method that optimizes for feasibility as people interact with its output and provide oracle-like feedback. Our experiments on a Bayesian network and the widely used "Adult" dataset show that the proposed methods can generate counterfactual explanations that satisfy feasibility constraints.
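
To make the notion of feasibility concrete, the following is a minimal sketch and not the method of the paper: it swaps in a brute-force search over a tiny grid. The classifier `predict`, the two features (age, education level), the scoring weights, and the rule in `is_feasible` are all hypothetical and chosen only for illustration. The assumed rule encodes one causal intuition: age cannot decrease, and gaining education takes time, so each added education level requires at least one added year of age.

```python
from itertools import product

def predict(age, education):
    """Hypothetical toy classifier: returns 1 when an integer score reaches 400."""
    return int(5 * age + 80 * education >= 400)

def is_feasible(orig, cand):
    """Assumed causal rule (illustrative only): age and education never
    decrease, and age must rise by at least one year per added education level."""
    d_age = cand[0] - orig[0]
    d_edu = cand[1] - orig[1]
    if d_age < 0 or d_edu < 0:
        return False
    return d_age >= d_edu

def nearest_counterfactual(orig, target=1, require_feasible=True):
    """Brute-force search for the closest (L1 distance) input that the
    classifier assigns to `target`, optionally filtered by feasibility."""
    best, best_cost = None, None
    for cand in product(range(18, 71), range(0, 6)):
        if predict(*cand) != target:
            continue
        if require_feasible and not is_feasible(orig, cand):
            continue
        cost = abs(cand[0] - orig[0]) + abs(cand[1] - orig[1])
        if best_cost is None or cost < best_cost:
            best, best_cost = cand, cost
    return best

person = (30, 2)  # age 30, education level 2, predicted class 0
unconstrained = nearest_counterfactual(person, require_feasible=False)
constrained = nearest_counterfactual(person)
print(unconstrained)  # (30, 4): closest, but adds two education levels at the same age
print(constrained)    # (32, 3): slightly farther, but respects the assumed rule
```

In this toy setting the unconstrained search returns the closest counterfactual even though it violates the assumed rule, while the constrained search accepts a larger change in exchange for a feasible one.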

research
05/19/2019

Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations

Post-hoc explanations of machine learning models are crucial for people ...
research
10/12/2022

Feasible and Desirable Counterfactual Generation by Preserving Human Defined Constraints

We present a human-in-the-loop approach to generate counterfactual (CF) ...
research
10/06/2021

Consistent Counterfactuals for Deep Models

Counterfactual examples are one of the most commonly-cited methods for e...
research
09/17/2020

Counterfactual Generation and Fairness Evaluation Using Adversarially Learned Inference

Recent studies have reported biases in machine learning image classifier...
research
06/18/2021

On the Connections between Counterfactual Explanations and Adversarial Examples

Counterfactual explanations and adversarial examples have emerged as cri...
research
01/21/2023

Bayesian Hierarchical Models for Counterfactual Estimation

Counterfactual explanations utilize feature perturbations to analyze the...
research
02/15/2022

Realistic Counterfactual Explanations by Learned Relations

Many existing methods of counterfactual explanations ignore the intrinsi...
