On the Robustness of Interpretability Methods

06/21/2018
by David Alvarez-Melis, et al.

We argue that robustness of explanations (i.e., that similar inputs should give rise to similar explanations) is a key desideratum for interpretability. We introduce metrics to quantify robustness and demonstrate that current methods do not perform well according to these metrics. Finally, we propose ways that robustness can be enforced on existing interpretability approaches.
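
The robustness notion described in the abstract can be made concrete as a local Lipschitz estimate of the explanation map: how much can the explanation change, relative to how much the input changes, within a small neighborhood? The sketch below approximates such an estimate by random sampling in an epsilon-ball. The `explain` callable and the L-infinity perturbation set are illustrative assumptions; the sampling loop is a simplification of a worst-case search over the neighborhood, not the authors' exact procedure.

```python
import numpy as np

def local_lipschitz_estimate(explain, x, eps=0.1, n_samples=100, seed=0):
    """Sampling-based estimate of the local Lipschitz constant of an
    explanation function around the point x.

    explain : callable mapping an input array to an attribution array
              of the same shape (hypothetical; stands in for any
              attribution method, e.g. a saliency map or LIME weights).
    """
    rng = np.random.default_rng(seed)
    base = explain(x)
    worst = 0.0
    for _ in range(n_samples):
        # Draw a perturbation inside an L-infinity ball of radius eps.
        x_pert = x + rng.uniform(-eps, eps, size=x.shape)
        num = np.linalg.norm(explain(x_pert) - base)
        den = np.linalg.norm(x_pert - x)
        if den > 0:
            worst = max(worst, num / den)
    return worst  # larger values indicate less robust explanations

if __name__ == "__main__":
    # Toy check: explanations given by the gradient of f(x) = ||x||^2,
    # i.e. explain(x) = 2x, which is globally 2-Lipschitz.
    estimate = local_lipschitz_estimate(lambda x: 2 * x,
                                        x=np.array([1.0, -0.5, 3.0]))
    print(estimate)  # prints 2.0, the true Lipschitz constant
```

On this reading, a robust explanation method keeps the estimate small for all inputs of interest, whereas a fragile one produces large values even for imperceptible perturbations.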

Related research

05/27/2019 · Analyzing the Interpretability Robustness of Self-Explaining Models
Recently, interpretable models called self-explaining models (SEMs) have...

04/13/2023 · Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance
Interpretability methods are valuable only if their explanations faithfu...

09/12/2022 · Model interpretation using improved local regression with variable importance
A fundamental question on the use of ML models concerns the explanation ...

01/25/2023 · Towards Robust Metrics for Concept Representation Evaluation
Recent work on interpretability has focused on concept-based explanation...

08/16/2023 · Interpretability Benchmark for Evaluating Spatial Misalignment of Prototypical Parts Explanations
Prototypical parts-based networks are becoming increasingly popular due ...

07/04/2022 · Comparing Feature Importance and Rule Extraction for Interpretability on Text Data
Complex machine learning algorithms are used more and more often in crit...

09/22/2020 · What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors
EXplainable AI (XAI) methods have been proposed to interpret how a deep ...
