Fooling SHAP with Stealthily Biased Sampling

05/30/2022
by Gabriel Laberge, et al.

SHAP explanations aim to identify which features contribute the most to the difference between a model's prediction at a specific input and its prediction over a background distribution. Recent studies have shown that malicious adversaries can manipulate them to produce arbitrary desired explanations. However, existing attacks focus solely on altering the black-box model itself. In this paper, we propose a complementary family of attacks that leave the model intact and manipulate SHAP explanations through stealthily biased sampling of the data points used to approximate expectations with respect to the background distribution. In the context of a fairness audit, we show that our attack can reduce the importance of a sensitive feature when explaining the difference in outcomes between groups, while remaining undetected. These results highlight the manipulability of SHAP explanations and encourage auditors to treat post-hoc explanations with skepticism.
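To make the mechanism concrete, here is a minimal sketch of how the choice of background sample alone can shift an attribution while the model stays fixed. It is not the paper's attack: it assumes a linear model, for which interventional SHAP values have the closed form w_i * (x_i - E[z_i]), and it uses a simple exponential reweighting of the background as a stand-in for the paper's stealthy sampling procedure, with no detection test. The function name `linear_shap`, the weights, the data, and the tilt are all hypothetical.

```python
import numpy as np

def linear_shap(w, x, background, weights=None):
    """Interventional SHAP values of the linear model f(z) = w @ z at x.

    For a linear model these equal w_i * (x_i - E_B[z_i]), where the
    expectation is taken over the (possibly reweighted) background sample.
    """
    mean = np.average(background, axis=0, weights=weights)
    return w * (x - mean)

rng = np.random.default_rng(0)
w = np.array([2.0, 1.0])                   # feature 0 plays the "sensitive" feature
background = rng.normal(size=(1000, 2))    # synthetic background data (assumption)
x = np.array([1.5, 0.5])                   # input being explained

# Honest explanation: uniform weights over the background sample.
honest = linear_shap(w, x, background)

# Biased explanation: up-weight background points whose sensitive feature
# is close to x[0]. This pulls the background mean of feature 0 toward x[0]
# and so shrinks that feature's attribution. The model itself is untouched.
tilt = np.exp(-np.abs(background[:, 0] - x[0]))
tilt /= tilt.sum()
biased = linear_shap(w, x, background, weights=tilt)

print("honest attribution of feature 0:", honest[0])
print("biased attribution of feature 0:", biased[0])
```

An aggressive tilt like this one would be easy to spot by comparing the reweighted sample against the full data. The point of the paper's title is that the bias is chosen to stay close enough to the true distribution to pass such checks.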

research
05/18/2023

BELLA: Black box model Explanations by Local Linear Approximations

In recent years, understanding the decision-making process of black-box ...
research
12/20/2020

Biased Models Have Biased Explanations

We study fairness in Machine Learning (FairML) through the lens of attri...
research
06/20/2022

Eliminating The Impossible, Whatever Remains Must Be True

The rise of AI methods to make predictions and decisions has led to a pr...
research
10/19/2022

Black Box Model Explanations and the Human Interpretability Expectations – An Analysis in the Context of Homicide Prediction

Strategies based on Explainable Artificial Intelligence - XAI have promo...
research
05/26/2021

Fooling Partial Dependence via Data Poisoning

Many methods have been developed to understand complex predictive models...
research
06/29/2021

Semantic Reasoning from Model-Agnostic Explanations

With the wide adoption of black-box models, instance-based post hoc expl...
