On the Robustness of Counterfactual Explanations to Adverse Perturbations

01/22/2022
by Marco Virgolin, et al.

Counterfactual explanations (CEs) are a powerful means for understanding how decisions made by algorithms can be changed. Researchers have proposed a number of desiderata that CEs should meet to be practically useful, such as requiring minimal effort to enact or complying with causal models. We consider a further aspect that improves the usability of CEs: robustness to adverse perturbations, which may arise naturally from unfortunate circumstances. Since CEs typically prescribe a sparse form of intervention (i.e., only a subset of the features should be changed), we provide two definitions of robustness, which concern, respectively, the features to change and the features to keep as they are. These definitions are workable in that they can be incorporated as penalty terms in the loss functions used for discovering CEs. To experiment with the proposed definitions of robustness, we create and release code in which five data sets (commonly used in the field of fair and explainable machine learning) have been enriched with feature-specific annotations that can be used to sample meaningful perturbations. Our experiments show that CEs are often not robust and that, if adverse perturbations take place, the intervention they prescribe may cost much more than anticipated, or even become impossible. However, accounting for robustness in the search process, which can be done rather easily, allows robust CEs to be discovered systematically. Robust CEs are resilient to adverse perturbations: the additional intervention needed to counteract perturbations is much less costly than for non-robust CEs. Our code is available at: https://github.com/marcovirgolin/robust-counterfactuals
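
To make the idea of robustness as a penalty term concrete, the following is a minimal sketch and not the authors' implementation. It assumes an L1 cost of intervention, and it assumes that a perturbation is a vector added to the starting point before the intervention is carried out. The names l1_cost, robustness_penalty, penalised_loss, and weight are hypothetical and chosen for illustration only.

```python
import numpy as np


def l1_cost(x, z):
    """Effort to move from x to z, taken here as plain L1 distance (an assumption)."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    return float(np.abs(z - x).sum())


def robustness_penalty(x, z, perturbations):
    """Worst-case extra cost of reaching z over a set of sampled perturbations.

    Each perturbation is a vector added to x (hypothetical model of an
    unfortunate circumstance). The penalty is floored at zero, so helpful
    perturbations do not reduce the loss.
    """
    x = np.asarray(x, dtype=float)
    base = l1_cost(x, z)
    extras = [l1_cost(x + np.asarray(p, dtype=float), z) - base
              for p in perturbations]
    return max(0.0, max(extras, default=0.0))


def penalised_loss(x, z, perturbations, weight=1.0):
    """Cost of the counterfactual plus a weighted robustness penalty."""
    return l1_cost(x, z) + weight * robustness_penalty(x, z, perturbations)


if __name__ == "__main__":
    x = [0.0, 0.0]                          # starting point
    z = [1.0, 0.0]                          # candidate counterfactual (sparse change)
    sampled = [[-0.5, 0.0], [0.2, 0.0]]     # one adverse, one helpful perturbation
    print(penalised_loss(x, z, sampled, weight=2.0))
```

A search procedure could minimise such a loss over candidates z, so that candidates whose cost grows sharply under sampled perturbations are disfavoured. How perturbations are sampled, and how the features to change are treated differently from the features to keep, is left to the paper and its released code.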


