Counterfactual Explanation with Missing Values

by Kentaro Kanamori, et al.

Counterfactual Explanation (CE) is a post-hoc explanation method that provides a perturbation for altering the prediction result of a classifier. Users can interpret the perturbation as an "action" to obtain their desired decision results. Existing CE methods require complete information on the features of an input instance. However, instances often contain missing values in practice, and existing methods do not work in such situations. In this paper, we first show empirically and theoretically that missing value imputation methods can affect both the validity of an action and the features that the action suggests changing. Then, we propose a new CE framework, named Counterfactual Explanation by Pairs of Imputation and Action (CEPIA), that enables users to obtain valid actions even with missing values and clarifies how actions are affected by the imputation of the missing values. Specifically, CEPIA provides a representative set of pairs, each consisting of an imputation candidate for a given incomplete instance and its optimal action. We formulate the problem of finding such a set as a submodular maximization problem, which can be solved by a simple greedy algorithm with an approximation guarantee. Experimental results demonstrate the efficacy of CEPIA in comparison with the baselines in the presence of missing values.
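
To illustrate the greedy approach to submodular maximization mentioned in the abstract, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state the CEPIA objective, so the example uses a toy set-coverage objective as a stand-in, and the names `greedy_max`, `coverage`, `covers`, and the `pair_*` labels are all hypothetical. For a monotone submodular objective under a cardinality constraint, this standard greedy rule is known to achieve a (1 - 1/e) approximation; whether the paper's guarantee takes exactly this form is an assumption here.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_max(
    candidates: Sequence[str],
    objective: Callable[[FrozenSet[str]], float],
    k: int,
) -> List[str]:
    """Pick up to k candidates, each time adding the one with the largest marginal gain."""
    selected: List[str] = []
    remaining = set(candidates)
    current = objective(frozenset())
    while remaining and len(selected) < k:
        best, best_gain = None, 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for c in sorted(remaining):
            gain = objective(frozenset(selected + [c])) - current
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no candidate improves the objective
            break
        selected.append(best)
        remaining.remove(best)
        current += best_gain
    return selected


# Toy stand-in objective (an assumption, not the paper's): each hypothetical
# imputation-action pair "covers" some scenarios, and the value of a set of
# pairs is the number of distinct scenarios covered. Coverage functions are
# monotone and submodular.
covers = {
    "pair_a": {1, 2, 3},
    "pair_b": {3, 4},
    "pair_c": {4, 5, 6, 7},
    "pair_d": {1, 7},
}


def coverage(chosen: FrozenSet[str]) -> float:
    return float(len(set().union(*(covers[c] for c in chosen))))


result = greedy_max(list(covers), coverage, k=2)
print(result, coverage(frozenset(result)))
```

In this toy run the greedy rule first takes the pair covering the most scenarios, then the pair that adds the most new ones. In a CEPIA-like setting the objective would instead score how representative a set of imputation-action pairs is, which the abstract leaves unspecified.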




