A Survey on the Robustness of Feature Importance and Counterfactual Explanations

10/30/2021
by Saumitra Mishra, et al.

Several methods aim to address the crucial task of understanding the behaviour of AI/ML models. Arguably, the most popular among them are local explanations, which investigate model behaviour for individual instances. Many methods have been proposed for local analysis, but relatively little effort has gone into understanding whether the explanations are robust and accurately reflect the behaviour of the underlying models. In this work, we survey the works that analyse the robustness of two classes of local explanations (feature importance and counterfactual explanations) that are widely used to analyse AI/ML models in finance. The survey aims to unify existing definitions of robustness, introduces a taxonomy to classify different robustness approaches, and discusses some interesting results. Finally, the survey offers pointers on extending current robustness analysis approaches so as to identify reliable explainability methods.

research
05/27/2022

Robust Counterfactual Explanations for Random Forests

Counterfactual explanations describe how to modify a feature vector in o...
research
04/14/2022

Global Counterfactual Explanations: Investigations, Implementations and Improvements

Counterfactual explanations have been widely studied in explainability, ...
research
11/09/2022

On the Robustness of Explanations of Deep Neural Network Models: A Survey

Explainability has been widely stated as a cornerstone of the responsibl...
research
05/26/2023

GLOBE-CE: A Translation-Based Approach for Global Counterfactual Explanations

Counterfactual explanations have been widely studied in explainability, ...
research
04/30/2022

ExSum: From Local Explanations to Model Understanding

Interpretability methods are developed to understand the working mechani...
research
01/22/2022

On the Robustness of Counterfactual Explanations to Adverse Perturbations

Counterfactual explanations (CEs) are a powerful means for understanding...
research
06/24/2021

On Locality of Local Explanation Models

Shapley values provide model agnostic feature attributions for model out...