
The privacy issue of counterfactual explanations: explanation linkage attacks

by Sofie Goethals, et al.

Black-box machine learning models are being used in more and more high-stakes domains, which creates a growing need for Explainable AI (XAI). Unfortunately, the use of XAI in machine learning introduces new privacy risks, which currently remain largely unnoticed. We introduce the explanation linkage attack, which can occur when deploying instance-based strategies to find counterfactual explanations. To counter such an attack, we propose k-anonymous counterfactual explanations and introduce pureness as a new metric to evaluate the validity of these k-anonymous counterfactual explanations. Our results show that making the explanations, rather than the whole dataset, k-anonymous is beneficial for the quality of the explanations.
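As an illustrative aid only (not the authors' implementation), the sketch below shows the intuition behind the abstract: an instance-based counterfactual is a real training record, so its quasi-identifiers can be linked to external data (the explanation linkage attack), and a defence is to generalise the explanation until at least k training records match it, with a pureness-style check of how many of those records actually receive the desired outcome. All function names, the binning scheme, and the toy data are assumptions made for illustration.

```python
# Hypothetical sketch of (1) an instance-based counterfactual that leaks a real
# training record, (2) a toy k-anonymity generalisation of that explanation, and
# (3) a pureness-style validity check. Not the paper's code.
import numpy as np

def nearest_unlike_neighbour(x, X_train, y_pred_train, desired_class):
    """Instance-based counterfactual: the closest training record predicted into
    the desired class. Because it is an actual record, its quasi-identifiers can
    be linked to external data, which is what the explanation linkage attack exploits."""
    candidates = X_train[y_pred_train == desired_class]
    distances = np.linalg.norm(candidates - x, axis=1)
    return candidates[np.argmin(distances)]

def generalise(record, X_train, k):
    """Toy generalisation: widen an interval around each attribute until at least
    k training records fall inside it, so the explanation no longer singles out one record."""
    for width in (1, 5, 10, 25, 50):
        low, high = record - width, record + width
        mask = np.all((X_train >= low) & (X_train <= high), axis=1)
        if mask.sum() >= k:
            return (low, high), mask
    # Fall back to the full attribute ranges if no narrower interval reaches k records.
    return (X_train.min(axis=0), X_train.max(axis=0)), np.ones(len(X_train), dtype=bool)

def pureness(mask, y_pred_train, desired_class):
    """Pureness-style validity check (illustrative): fraction of the covered
    records that actually receive the desired class."""
    return (y_pred_train[mask] == desired_class).mean()

# Toy usage: a rejected applicant and a "model" that approves high incomes.
rng = np.random.default_rng(0)
X_train = rng.integers(20, 80, size=(200, 2)).astype(float)   # columns: [age, income_k]
y_pred_train = (X_train[:, 1] > 50).astype(int)
x = np.array([35.0, 30.0])                                     # predicted class 0
cf = nearest_unlike_neighbour(x, X_train, y_pred_train, desired_class=1)
bounds, mask = generalise(cf, X_train, k=10)
print(cf, pureness(mask, y_pred_train, desired_class=1))
```

The trade-off the abstract points to is visible here: generalising only the returned explanation (rather than anonymising the whole training set) keeps the counterfactual close to the original instance, while the pureness score indicates how often the generalised region still yields the desired prediction.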


