Gradient-based Counterfactual Explanations using Tractable Probabilistic Models

05/16/2022
by   Xiaoting Shao, et al.
12

Counterfactual examples are an appealing class of post-hoc explanations for machine learning models. Given input x of class y_1, its counterfactual is a contrastive example x^' of another class y_0. Current approaches primarily solve this task by a complex optimization: define an objective function based on the loss of the counterfactual outcome y_0 with hard or soft constraints, then optimize this function as a black-box. This "deep learning" approach, however, is rather slow, sometimes tricky, and may result in unrealistic counterfactual examples. In this work, we propose a novel approach to deal with these problems using only two gradient computations based on tractable probabilistic models. First, we compute an unconstrained counterfactual u of x to induce the counterfactual outcome y_0. Then, we adapt u to higher density regions, resulting in x^'. Empirical evidence demonstrates the dominant advantages of our approach.

READ FULL TEXT

page 8

page 9

page 12

page 19

research
02/12/2020

Convex Density Constraints for Computing Plausible Counterfactual Explanations

The increasing deployment of machine learning as well as legal regulatio...
research
04/23/2020

Multi-Objective Counterfactual Explanations

Counterfactual explanations are one of the most popular methods to make ...
research
07/22/2019

The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations

Post-hoc interpretability approaches have been proven to be powerful too...
research
09/27/2022

Learning to Counter: Stochastic Feature-based Learning for Diverse Counterfactual Explanations

Interpretable machine learning seeks to understand the reasoning process...
research
04/17/2021

Optimal Counterfactual Explanations for Scorecard modelling

Counterfactual explanations is one of the post-hoc methods used to provi...
research
03/25/2021

ECINN: Efficient Counterfactuals from Invertible Neural Networks

Counterfactual examples identify how inputs can be altered to change the...
research
01/21/2023

Bayesian Hierarchical Models for Counterfactual Estimation

Counterfactual explanations utilize feature perturbations to analyze the...

Please sign up or login with your details

Forgot password? Click here to reset