A Symbolic Approach for Counterfactual Explanations
We propose a novel symbolic approach to providing counterfactual explanations for a classifier's predictions. Whereas most explanation approaches aim to identify which parts of the data contributed to a prediction, and to what extent, counterfactual explanations indicate which features of the data must be changed in order to change the classifier's prediction. Our approach is symbolic in the sense that it encodes the decision function of a classifier as an equivalent CNF formula. In this setting, counterfactual explanations correspond to Minimal Correction Subsets (MCSes), a well-known concept in knowledge base repair. The approach can therefore take advantage of existing, proven solutions for generating MCSes. Our preliminary experimental studies on Bayesian classifiers show the potential of this approach on several datasets.
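To make the link between MCSes and counterfactuals concrete, here is a minimal sketch in Python. It is not the paper's encoding or solver. It assumes a hypothetical toy classifier f(x1, x2, x3) = x1 AND (x2 OR x3) over Boolean features, and it replaces a dedicated MCS tool with brute-force enumeration. The hard clauses encode "the classifier outputs the other class"; the soft unit clauses pin each feature to its value in the instance being explained. Each MCS of the soft clauses is then a subset-minimal set of features to change. All function and variable names are illustrative.

```python
from itertools import combinations, product

def satisfiable(clauses, n_vars):
    """Brute-force SAT check. Literals are signed ints: 2 means x2, -2 means NOT x2."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

def enumerate_mcs(hard, soft, n_vars):
    """Return every subset-minimal set of soft-clause indices whose removal
    makes hard + remaining soft clauses satisfiable."""
    found = []
    # Visiting candidates by increasing size guarantees that any candidate
    # containing a smaller correction set is skipped, so results are minimal.
    for size in range(len(soft) + 1):
        for idx in combinations(range(len(soft)), size):
            candidate = set(idx)
            if any(mcs <= candidate for mcs in found):
                continue
            kept = [soft[i] for i in range(len(soft)) if i not in candidate]
            if satisfiable(hard + kept, n_vars):
                found.append(candidate)
    return found

# Assumed toy classifier: f = x1 AND (x2 OR x3), which predicts the positive class.
# Hard clauses: CNF of NOT f, i.e. (NOT x1 OR NOT x2) AND (NOT x1 OR NOT x3).
hard = [[-1, -2], [-1, -3]]

# Instance to explain: x1 = x2 = x3 = True (classified positive).
# Soft clauses: one unit clause per feature, fixing it to its observed value.
soft = [[1], [2], [3]]

mcses = enumerate_mcs(hard, soft, n_vars=3)

# Map each MCS back to the features that must change.
counterfactuals = [sorted(abs(soft[i][0]) for i in mcs) for mcs in mcses]
print(counterfactuals)  # [[1], [2, 3]]
```

In this toy case the prediction can be changed either by flipping x1 alone or by flipping x2 and x3 together. The brute-force search is exponential in the number of features and serves only as an illustration; at realistic sizes, an off-the-shelf MCS enumerator would take its place.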