Counterfactual Explanations Can Be Manipulated

06/04/2021
by Dylan Slack, et al.

Counterfactual explanations are emerging as an attractive option for providing recourse to individuals adversely impacted by algorithmic decisions. As they are deployed in critical applications (e.g., law enforcement, financial lending), it becomes important to clearly understand the vulnerabilities of these methods and find ways to address them. However, there is little understanding of the vulnerabilities and shortcomings of counterfactual explanations. In this work, we introduce the first framework that describes the vulnerabilities of counterfactual explanations and shows how they can be manipulated. More specifically, we show that counterfactual explanations may converge to drastically different counterfactuals under a small perturbation, indicating that they are not robust. Leveraging this insight, we introduce a novel objective to train seemingly fair models where counterfactual explanations find much lower-cost recourse under a slight perturbation. We describe how these models can unfairly provide low-cost recourse for specific subgroups in the data while appearing fair to auditors. We perform experiments on loan and violent crime prediction data sets where certain subgroups achieve up to 20x lower-cost recourse under the perturbation. These results raise concerns regarding the dependability of current counterfactual explanation techniques, which we hope will inspire investigations into robust counterfactual explanations.
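The non-robustness described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the paper's models, data, or training objective: a hypothetical one-dimensional score with a kink at zero, a simple hill-climbing counterfactual search, and absolute distance as the recourse cost. Two nearly identical inputs on either side of the kink are pushed in opposite directions by the search, so their recourse costs can differ by roughly an order of magnitude.

```python
def score(x, kink=0.0):
    """Hypothetical model score: negative (denied) near the kink.

    It rises slowly to the left of the kink and quickly to the right,
    so the approved region (score >= 0) is far away on one side and
    close on the other.
    """
    if x < kink:
        return -1.0 + 0.1 * (kink - x)
    return -1.0 + 1.0 * (x - kink)


def score_gradient(x, kink=0.0):
    """Derivative of the toy score with respect to x."""
    return -0.1 if x < kink else 1.0


def find_counterfactual(x, step=0.05, max_iters=10_000):
    """Hill-climb the score from x until the decision flips."""
    current = x
    for _ in range(max_iters):
        if score(current) >= 0.0:
            return current
        current += step * score_gradient(current)
    raise RuntimeError("no counterfactual found within max_iters")


def recourse_cost(x):
    """Distance between the input and the counterfactual the search finds."""
    return abs(find_counterfactual(x) - x)


original = -0.01             # just left of the kink
perturbed = original + 0.02  # a slight perturbation moves it just right of the kink

cost_original = recourse_cost(original)
cost_perturbed = recourse_cost(perturbed)

print(f"cost without perturbation: {cost_original:.2f}")
print(f"cost with perturbation:    {cost_perturbed:.2f}")
```

In this sketch the asymmetry is built into the score by hand. The abstract's claim is stronger: a model can be trained so that such a gap appears for specific subgroups while the model still looks fair to an auditor who only inspects unperturbed inputs.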

