How to explain a complex machine learning model that predicts a response from an input feature vector, given only black-box access to the model, is an increasingly salient problem. An increasingly popular approach is to attribute any given prediction to the input features, which could range from providing a vector of importance weights, one per input feature, to simply providing a set of important features. For instance, given a deep neural network for image classification, we may explain a specific prediction by showing the set of salient pixels, or a heatmap image showing the importance weights of all the pixels.
A large class of such attribution mechanisms is loosely based on the sensitivity of the learned predictor function to its input. The most prominent of these are based on the gradient of the predictor function with respect to its input [6, 21], including gradient variants that address some of the caveats of raw gradients, such as saturation, where a change in the prediction value is not reflected in the gradient [28, 24, 19]. There are also approaches based on counterfactuals that quantify the effect of substituting the value of a feature with a default value, or with samples from some noise distribution [17]. Other approaches vary the active subsets of the set of input features (e.g. over the power set of all features) and average the resulting per-feature counterfactual contributions, which has roots in cooperative game theory and revenue division.
But how good is any such explanation mechanism? We can distinguish between two classes of explanation evaluation measures [11, 14]: objective measures and subjective measures. The notion of explanation is very human-centric, and consequently the predominant evaluations of explanations have been subjective measures, ranging from qualitative displays of explanation examples, to crowd-sourced evaluations of human satisfaction with the explanations, as well as of whether humans are able to understand the model. Nonetheless, it is also important to consider objective measures of explanation effectiveness, not only because these place explanations on a sounder theoretical foundation, but also because they allow us to improve our explanations by improving their objective measures. The predominant class of objective measures is based on the fidelity of the explanation to the predictor function. When we have a priori information that only a particular subset of features is relevant, we can test whether the explanation features belong to this relevant subset.
In this work, we are interested in a simpler objective evaluation measure: sensitivity. In other words, we are interested in a counterfactual at the level of the explanation: what happens to the explanation when we perturb the test input? Depending on how we define the perturbation, and how we measure the change in the explanation, we arrive at different notions of sensitivity. Intuitively, we wish for our explanation not to be too sensitive, since that would entail differing explanations for minor variations in the input (and prediction values), which would lead us not to trust the explanations. In large part, we expect explanations to be simple (hence the approximation of complex models via simple "interpretable" models), and lower sensitivity can be viewed as one such notion of simplicity. Given that most attribution mechanisms are loosely based on the sensitivity of a function to its input, what we ask is how sensitive sensitivity-based explanation mechanisms themselves are.
The other key contribution of the paper is that we also provide a calculus for estimating the sensitivity for general explanation mechanisms, which we instantiate to derive corollaries for a wide range of recently proposed explanation mechanisms. We also note that our development holds more abstractly for investigating the sensitivity of any functional of a given function at a specific point. In our case, the function is the learnt predictor, and the functional is the explanation, but our development holds more generally. As one candidate broader application, this could be useful for analyzing so-called plugin-estimators that are functionals (e.g. entropy) of the model parameters; though we defer such broader investigations for future work.
Lastly, we also investigate how to modify a given explanation mechanism to make it less sensitive with respect to our measure. To this end, we provide a meta-explanation technique that encompasses Smooth-Grad, and that modifies any existing explanation mechanism to improve its sensitivity with only a modest deviation from the original explanation. In addition, we propose a solution that improves explanation sensitivity when we are given the freedom to retrain the model, via adversarial training. As we show, our modifications provide qualitatively much better explanations (with higher faithfulness evaluations), in addition to being better, by construction, with respect to the objective measure of sensitivity.
2 Explanation Sensitivity
Consider the following general supervised learning setting: an input space $\mathcal{X}$, an output space $\mathcal{Y}$, and a (machine-learnt) black-box predictor $f: \mathcal{X} \to \mathcal{Y}$, which at some test input $x \in \mathcal{X}$ predicts the output $f(x)$. A feature attribution explanation is then some functional $\Phi(f, x)$ that, given a black-box predictor $f$ and a test point $x$, provides importance scores for the set of input features. Our main goal in this paper is to formalize a quantitative measure of the sensitivity of these resulting attribution-based explanations, and to discuss approaches to optimize this sensitivity measure so as to obtain explanations with the right amount of sensitivity, while still retaining their explanatory power. While we begin our discussion with explanations that output a vector of importance weights, one for each feature, our analysis below is fairly general, and in Appendix B we extend it to settings where the explanation consists of just a set of features.
We need two additional ingredients: a distance metric $D$ over explanations, and a distance metric $\rho$ over the inputs. We can then define the following sensitivity measure, which we term max-sensitivity, measuring the maximum change in the explanation under a small perturbation of the input $x$.
Given a black-box function $f$, an explanation functional $\Phi$, distance metrics $D$ and $\rho$ over explanations and inputs respectively, and a given input neighborhood radius $r$, we define the max-sensitivity as:
$$\mathrm{SENS}_{\max}(\Phi, f, x, r) = \max_{\rho(x', x) \le r} D\big(\Phi(f, x'), \Phi(f, x)\big).$$
The key caveat with the above notion of sensitivity is that it might be too critical: a single adversarial point in the neighborhood of $x$ with a large change in the explanation will cause the sensitivity measure to take a large value. While this may be desired in some settings, in others we might be more concerned when many of the points in the neighborhood have vastly differing explanations. Accordingly, we define the following average-sensitivity measure, which averages the change in the explanation as we range over small perturbations of the input $x$.
Given a black-box function $f$, an explanation functional $\Phi$, distance metrics $D$ and $\rho$ over explanations and inputs respectively, and a given input neighborhood radius $r$, we define the average-sensitivity as:
$$\mathrm{SENS}_{\mathrm{avg}}(\Phi, f, x, r) = \int D\big(\Phi(f, x'), \Phi(f, x)\big) \, d\mu_x(x'),$$
for some distribution $\mu_x$ over inputs, centered around $x$.
In our experiments, we chose $\mu_x$ to be the uniform distribution over the neighborhood of radius $r$ around $x$. While $\mathrm{SENS}_{\max}$ measures the maximum change in the explanation as the data point is perturbed within a small neighborhood, $\mathrm{SENS}_{\mathrm{avg}}$ measures the average change in the explanation over such perturbations. When clear from the context, we simply write $\mathrm{SENS}_{\max}(x)$ and $\mathrm{SENS}_{\mathrm{avg}}(x)$, also noting that we suppress the dependence on the distance metrics $D$ and $\rho$. In the sequel, when results hold for both the maximum and average sensitivity measures, we simply use $\mathrm{SENS}$ to denote the sensitivity measure.
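As a concrete illustration, both measures can be estimated by Monte-Carlo sampling over the neighborhood. The following sketch (with hypothetical helper names, assuming numpy, a uniform perturbation within an $\ell_\infty$ box of radius $r$, and an $\ell_2$ distance over explanations) estimates both for the gradient explanation of a simple quadratic predictor:

```python
import numpy as np

def max_sensitivity(explainer, f, x, r, n_samples=50, rng=None):
    """Monte-Carlo estimate of SENS_max: largest l2 change in the explanation
    over uniform perturbations of x within a box of radius r."""
    rng = np.random.default_rng(rng)
    base = explainer(f, x)
    worst = 0.0
    for _ in range(n_samples):
        z = x + rng.uniform(-r, r, size=x.shape)
        worst = max(worst, np.linalg.norm(explainer(f, z) - base))
    return worst

def avg_sensitivity(explainer, f, x, r, n_samples=50, rng=None):
    """Monte-Carlo estimate of SENS_avg under the same uniform distribution."""
    rng = np.random.default_rng(rng)
    base = explainer(f, x)
    total = 0.0
    for _ in range(n_samples):
        z = x + rng.uniform(-r, r, size=x.shape)
        total += np.linalg.norm(explainer(f, z) - base)
    return total / n_samples

# Gradient explanation of a quadratic predictor f(x) = 0.5*||x||^2,
# whose gradient is x itself, so explanations move exactly as fast as inputs.
grad_explainer = lambda f, x: x.copy()
f = lambda x: 0.5 * np.dot(x, x)
x0 = np.ones(3)
s_max = max_sensitivity(grad_explainer, f, x0, r=0.1, n_samples=200, rng=0)
s_avg = avg_sensitivity(grad_explainer, f, x0, r=0.1, n_samples=200, rng=0)
```

By construction the average estimate can never exceed the max estimate, matching the two definitions above; the sampled maximum is only a lower estimate of the true supremum.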
Before we proceed to the sensitivity calculus, we argue why it is desirable for explanations to have low sensitivity scores. To this end, we first define the following measures to quantify the sensitivity of the predictions of a model.
Given a black-box function $f$, a distance metric $D_{\mathcal{Y}}$ over outputs and a distance metric $\rho$ over inputs, and a given input neighborhood radius $r$, we define the max-prediction-sensitivity and average-prediction-sensitivity as:
$$\mathrm{SENS}^{f}_{\max}(f, x, r) = \max_{\rho(x', x) \le r} D_{\mathcal{Y}}\big(f(x'), f(x)\big), \qquad \mathrm{SENS}^{f}_{\mathrm{avg}}(f, x, r) = \int D_{\mathcal{Y}}\big(f(x'), f(x)\big)\, d\mu_x(x').$$
Ghorbani et al. empirically observe that there exist inputs which are indistinguishable to humans, and for which the model outputs similar predictions, yet which have very different gradient explanations (for example, see the first two columns of the examples in Figure 2). Such explanations are undesirable, as they do not faithfully explain the predictions of the model and cannot be understood by a human.
We now formally show that if the explanation sensitivity is much larger than the prediction sensitivity, there must exist such a pair of inputs that have similar predictions and are indistinguishable to a human, and yet have very different explanations.
Suppose the explanation sensitivity and the prediction sensitivity are such that the former is much larger than the latter. Then there exists an input $x'$ close to $x$ such that the predictions $f(x')$ and $f(x)$ are similar, and yet the explanations $\Phi(f, x')$ and $\Phi(f, x)$ differ substantially.
In Figure 1, we show the sensitivities of the gradient explanation mechanism and of the model prediction for a two-layer convolutional neural network trained on MNIST. To ensure that the scales of the two sensitivity measures are comparable, we use relative changes in the prediction and the explanation to compute the sensitivity scores, with norm-induced distances for both. We find that the ratio between the gradient sensitivity and the prediction sensitivity ranges from 3 to 14 across different values of $r$. By Proposition 2.1, this suggests the existence of "adversarial explanation" points, as shown in Figure 2.
We now present a sensitivity calculus for estimating the sensitivity of general explanation mechanisms. We start with the following definition, which allows us to bound the sensitivity of explanations.
We say a function $g$ is $\lambda$-locally Lipschitz continuous with respect to a metric $\rho$ around $x$ if, for all $x'$ such that $\rho(x', x) \le r$, $g$ satisfies $D\big(g(x'), g(x)\big) \le \lambda \, \rho(x', x)$.
When the explanation satisfies the above assumption of local Lipschitzness, we can derive the following upper bound on the sensitivity of explanations.
Suppose the explanation $\Phi(f, \cdot)$ is $\lambda$-locally Lipschitz continuous around $x$ with respect to $\rho$. Then $\mathrm{SENS}(\Phi, f, x, r) \le \lambda r$.
The above proposition provides a key tool for bounding the sensitivity of any given explanation mechanism. In particular, we discuss its applicability to gradient explanations. The predominant class of explanations is based on gradients of the machine-learnt predictor $f$. Proposition 2.2 provides a simple upper bound on the sensitivity of these gradient explanations, so long as we have a bound on the Lipschitz constant of the predictor gradient. As a concrete example, we instantiate Proposition 2.2 for deep neural networks with Softplus activations, which are a close differentiable approximation of the commonly used ReLU activation.
Suppose the predictor $f$ is an $L$-layer Softplus neural network with weights $W_l$ at layer $l$ and zero bias at each layer. Let $\Phi(f, x) = \nabla_x f(x)$ denote the gradient explanation at a point $x$. Then the sensitivity of $\Phi$ is upper bounded by $\lambda r$, where the Lipschitz constant $\lambda$ of $\nabla f$ depends on the layer weight norms $\|W_l\|$, under a common norm-induced distance for the metrics $D$ and $\rho$.
The corollary follows by observing that the Lipschitz constant of $\nabla f$ is upper bounded in terms of the layer weight norms with respect to the chosen distance. Consequently, the local Lipschitz condition (1) holds for the gradient explanation, and Proposition 2.2 applies to the network predictor with the gradient as its explanation.
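To see the Lipschitz bound of Proposition 2.2 in action, consider a toy quadratic predictor whose gradient explanation is linear, so its Lipschitz constant is simply a spectral norm. The following sketch (assuming numpy, $\ell_2$ metrics for both $D$ and $\rho$, and hypothetical variable names) checks that the max-sensitivity attains the bound $\lambda r$ in this case:

```python
import numpy as np

# For f(x) = 0.5 * x^T A x the gradient explanation is Phi(x) = A x, which is
# globally Lipschitz in l2 with constant equal to the spectral norm of A.
A = np.diag([3.0, 1.0, 0.5])
phi = lambda x: A @ x
lam = np.linalg.norm(A, 2)          # Lipschitz constant lambda = 3.0
x0 = np.array([1.0, -2.0, 0.5])
r = 0.2

# SENS_max over the l2 ball of radius r: sup_{||d|| <= r} ||A d|| = lam * r,
# attained by perturbing along the top eigenvector of A.
d_worst = r * np.array([1.0, 0.0, 0.0])
sens = np.linalg.norm(phi(x0 + d_worst) - phi(x0))
```

For this linear explanation the bound is tight; for a nonlinear explanation the proposition only gives an upper bound.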
The caveat of these propositions, however, is that they require characterizing the local constancy or local Lipschitzness of the explanation functional, which might be non-trivial. Accordingly, we provide a calculus for deriving sensitivities of explanation functionals given the sensitivities of simpler explanations. As corroboration of the utility of our calculus, we derive bounds on the sensitivities of prominent examples as corollaries. A wide class of explanation techniques proposed in the literature can be viewed as modifying gradients through simple operators such as: (a) the element-wise product of the gradient with the given data point, and (b) averaging gradients over the neighborhood of the given point. We note that many common explanation techniques, such as Gradient*Image (which is equivalent to ε-LRP for neural networks with zero bias and ReLU activations), Integrated Gradients, and SmoothGrad, can be obtained by applying compositions of these operators to gradients. Therefore, by providing a calculus for the effect of these operators on explanation sensitivity, we can better understand the sensitivities of explanation techniques in common use, as well as of those yet to be proposed. We start by analyzing the effect of the element-wise product operation on sensitivity.
Suppose the distance metric $D$ admits a coordinate-wise decomposition for some scalar function, and moreover suppose the associated boundedness condition holds. Let $\odot$ denote the Hadamard product operator, which performs an element-wise product of two vectors. Then the sensitivity of the explanation obtained via the Hadamard product of the given explanation with the test point, $x \odot \Phi(f, x)$, can be bounded as:
Note that the assumption on the distance metric in the proposition is satisfied by all Minkowski distances, which include the commonly used $\ell_p$ metrics. When comparing the sensitivities of $\Phi(f, x)$ and $x \odot \Phi(f, x)$, the additional factor of $x$ in the modified explanation acts as a scaling factor. When this scaling is small enough, the upper bound on the modified explanation sensitivity is close to the original explanation sensitivity. We corroborate this proposition in the experiments section, where we show that the sensitivities of the Gradient*Image and gradient explanations are very similar (since the scaling factor is small).
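A minimal numerical illustration of the Hadamard-product operator (hypothetical helper names, assuming numpy and $\ell_2$ distances): for a linear predictor, the gradient explanation is constant and has zero sensitivity, while Gradient*Input picks up sensitivity only through the extra factor of the input, consistent with the bound above:

```python
import numpy as np

def grad_times_input(explainer):
    """Hadamard-product operator: turn an explanation Phi into x * Phi(f, x)."""
    return lambda f, x: x * explainer(f, x)

# A linear predictor f(x) = w . x has a constant gradient explanation, so the
# gradient itself has zero sensitivity; gradient*input inherits sensitivity
# only through the extra factor of x.
w = np.array([1.0, -2.0, 3.0])
grad_exp = lambda f, x: w.copy()
gxi_exp = grad_times_input(grad_exp)

x0 = np.array([0.5, 0.5, 0.5])
rng = np.random.default_rng(0)
r = 0.05
deltas = rng.uniform(-r, r, size=(200, 3))
sens_grad = max(np.linalg.norm(grad_exp(None, x0 + d) - grad_exp(None, x0))
                for d in deltas)
sens_gxi = max(np.linalg.norm(gxi_exp(None, x0 + d) - gxi_exp(None, x0))
               for d in deltas)
```

Here the Gradient*Input sensitivity is bounded by the perturbation radius times the gradient magnitude, so it remains small when the scaling factor is small.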
A more complex operator is that of averaging a given explanation using a local kernel $k$. Suppose $k$ satisfies a shift-invariance-like property, and suppose further that $D$ is a distance metric that is compatible with averaging, in the sense that the distance of an average is bounded by the average of distances. Note that this holds for all Minkowski distances.
Suppose the distance metric $D$ and the kernel function $k$ satisfy the conditions above. Let $\Phi_k$ denote the smoothed explanation obtained by averaging $\Phi$ under the kernel $k$. Then its sensitivity can be bounded in terms of that of the unsmoothed explanation as:
The sensitivity upper bound in the proposition for the smoothed explanation is simply a smoothing, with the same kernel, of the sensitivity of the original explanation. Note that when the sensitivity has large variance in the neighborhood specified by the kernel, the inequality is not necessarily tight, and the post-smoothing sensitivity could be even lower than the stated upper bound. We apply this lemma to Integrated Gradients and SmoothGrad to provide insight into why they may achieve lower sensitivity.
Suppose we apply the SmoothGrad modification to an explanation for the model $f$, which we denote by SG, and suppose the distance metric $D$ is a Minkowski distance. Then its sensitivity can be bounded as:
where $k$ is the Gaussian kernel with isotropic covariance $\sigma^2 I$.
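A minimal SmoothGrad sketch (assuming numpy, Gaussian noise with isotropic covariance, and hypothetical function names) illustrating the lowered sensitivity on a ReLU-like predictor whose raw gradient jumps at the kink:

```python
import numpy as np

def smooth_grad(explainer, sigma, n_samples=500, rng=0):
    """SmoothGrad-style smoothing: average the explanation over Gaussian
    perturbations of the input (isotropic covariance sigma^2 I). A fixed
    seed makes repeated calls use the same noise, for a fair comparison."""
    def smoothed(f, x):
        g = np.random.default_rng(rng)
        noise = g.normal(0.0, sigma, size=(n_samples,) + x.shape)
        return np.mean([explainer(f, x + n) for n in noise], axis=0)
    return smoothed

# A ReLU-like scalar predictor: its raw gradient explanation jumps from 0 to 1
# at the kink, while the smoothed explanation varies gradually.
f = lambda x: np.maximum(x, 0.0)
raw_grad = lambda f, x: (x > 0).astype(float)
sg = smooth_grad(raw_grad, sigma=0.5)

x_neg, x_pos = np.array([-0.01]), np.array([0.01])
jump_raw = abs(raw_grad(f, x_pos)[0] - raw_grad(f, x_neg)[0])
jump_sg = abs(sg(f, x_pos)[0] - sg(f, x_neg)[0])
```

The smoothed explanation approximates the Gaussian CDF of the input, so two nearby points on either side of the kink receive nearly identical explanations, whereas the raw gradient changes by its full range.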
3 Obtaining Less Sensitive Explanations
Given the objective evaluation measure of explanation sensitivity, a natural question is whether we can leverage this analysis to obtain better explanations. Since the sensitivity score depends on two components, the model and the explanation mechanism, one could consider two natural techniques to improve it: (a) modify the explanation mechanism to improve its sensitivity, and (b) retrain the model so that the explanations produced by the explanation mechanism become stable. The SmoothGrad technique proposed by Smilkov et al. for improved explanations falls in the first category. The retraining techniques studied by Alvarez-Melis and Jaakkola, and by Lee et al., for obtaining more explainable models fall in the second category.
3.1 Modifying Explanations to Lower Sensitivity
We first propose an approach to smooth a given explanation functional while largely retaining explanation faithfulness. More formally, we would like to find a modified explanation that is still close to the original explanation $\Phi$, but has lower sensitivity. Our objective can be formalized as:
where the constraint bounds the allowed difference between the original explanation and the smoothed explanation at a data point, and the exponent is some constant, typically set to one or two. While direct minimization of the above objective seems computationally expensive, we propose to solve for our modified explanations by optimizing the following surrogate objectives:
The surrogate minimization is similar in spirit to a single step of the Jacobi iterative method, where we minimize the explanation pointwise for each data point, while fixing the explanation values (to the unmodified explanation) at all other points.
The hyperparameter in (3) controls the balance between the sensitivity of the modified explanation and its distance from the original explanation. When it is zero, the modified explanation coincides with the original one and retains the original sensitivity; as it tends to infinity, the modified explanation tends to a constant with zero sensitivity. We now show that the surrogate objectives in Eqs. (3) are scaled upper bounds of the intractable objective in (2) with the average-sensitivity, and thus have a well-founded variational optimization justification.
While even these surrogate objectives might not in general seem straight-forward to optimize, we show that for certain distance metrics we could choose such that we obtain efficient closed form solutions.
We thus derive an objective similar to Smooth-Grad, with a different smoothing distribution over neighboring points. Our formulation in Eq. (3) is moreover a generalization of Smooth-Grad that can work with general distance metrics. Recall that in Proposition 2.4 we showed that averaging explanations results in sensitivity equal to or lower than that of the original explanation. This provides additional justification for why Smooth-Grad generates explanations with lower sensitivity, especially when the model is highly nonlinear. We also provide empirical corroboration of the lowered sensitivity of Smooth-Grad in the experiments section.
3.2 Retraining Model to Lower Sensitivity
In this section, we explore a different approach to lowering the sensitivity of explanations: we consider alternative training (inference) procedures for obtaining robust explanations. Since many popular explanation techniques rely on gradients, we focus specifically on gradient explanations. One naive technique to lower the sensitivity of gradient explanations is to regularize the weights of a neural network by adding a norm penalty on the weights; then, by Corollary 2.1, the upper bound on the sensitivity of gradient explanations is lowered.
An alternative way to robustify gradient-based explanations is to learn a model with smooth gradients. We show that models learned through "adversarial training" have smooth gradients, and as a result the gradient-based explanations of these models are naturally robust to perturbations. An adversarial perturbation at a point $x$ with label $y$, for any classifier $f$, is defined as any small perturbation $\delta$ such that $f(x + \delta) \neq y$. The adversarial loss at a point is defined as the worst-case classification loss over such perturbations, where the classification loss is, for example, the logistic loss. The expected adversarial risk of a classifier is then the expectation of this adversarial loss over the data distribution. The goal in adversarial training is to minimize the expected adversarial risk. We now show that minimizing the expected adversarial risk results in models with smooth gradients.
Consider the binary classification setting with the logistic loss, where the classifier is twice differentiable w.r.t. its input. For any perturbation radius, the adversarial training objective can be upper bounded as
where is the dual norm of , which is defined as .
Notice the two terms in the upper bound, which penalize the norms of the gradient and the Hessian. By optimizing the adversarial risk, we are effectively optimizing a gradient- and Hessian-norm-penalized risk. This suggests that optimizing the adversarial risk can lead to classifiers with small and "smooth" gradients, which are naturally more robust to perturbations. More formally, a smaller Hessian norm lowers the Lipschitz constant of the gradients, which by Proposition 2.2 leads to a smaller sensitivity bound for gradient explanations.
A number of techniques have been proposed for minimizing the expected adversarial risk [13, 16, 22, 27]. In our experiments, we use the Projected Gradient Descent (PGD) technique to train an adversarially robust network. We conclude the section with a discussion of other potential approaches to obtaining models with robust gradients. One natural technique is to add a regularizer that penalizes gradients with large norms; that is, to add a gradient norm penalty to the training objective. Ross and Doshi-Velez study this approach and empirically show that it results in more explainable models. However, a drawback of this approach is its long training time, since it must deal with Hessians during training. The main advantage of adversarial training over gradient regularization is that adversarial robustness is an active research area, and a number of efficient techniques are being designed for faster training. Another technique for obtaining models with smooth gradients is Gaussian convolution, a well-known smoothing operation: one takes a pre-trained model and convolves it with a Gaussian. The SmoothGrad approach of Smilkov et al. can be interpreted as doing exactly this: SmoothGrad is equivalent to the gradient explanation of the convolved model.
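As an illustrative sketch of adversarial training (not the experimental setup of this paper), the following numpy code runs $\ell_\infty$ PGD inside the training loop of a linear logistic classifier on toy data; all names and hyperparameters are hypothetical:

```python
import numpy as np

def pgd_attack(w, x, y, eps, steps=10, lr=0.05):
    """Linf PGD on the logistic loss of a linear classifier f(x) = w . x:
    repeatedly step along the sign of the input-gradient of the loss and
    project back into the eps-ball."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        margin = y * np.dot(w, x + delta)
        grad_x = -y * w / (1.0 + np.exp(margin))  # d/dx log(1 + exp(-y w.x))
        delta = np.clip(delta + lr * np.sign(grad_x), -eps, eps)
    return delta

def adversarial_train(X, Y, eps, epochs=200, lr=0.1):
    """Minimize the empirical adversarial risk: attack each point with PGD,
    then descend the loss gradient evaluated at the perturbed points."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=X.shape[1]) * 0.01
    for _ in range(epochs):
        grad = np.zeros_like(w)
        for x, y in zip(X, Y):
            x_adv = x + pgd_attack(w, x, y, eps)
            margin = y * np.dot(w, x_adv)
            grad += -y * x_adv / (1.0 + np.exp(margin))
        w -= lr * grad / len(X)
    return w

# Toy linearly separable data: adversarial training still fits it.
X = np.array([[1.0, 1.0], [2.0, 1.5], [-1.0, -1.0], [-2.0, -0.5]])
Y = np.array([1.0, 1.0, -1.0, -1.0])
w = adversarial_train(X, Y, eps=0.1)
train_acc = np.mean(np.sign(X @ w) == Y)
```

For a linear model the gradient explanation is $w$ itself and is already constant; the sketch is only meant to show the inner-maximization / outer-minimization structure that, for deep models, yields the smoother gradients discussed above.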
4 Experiments
Setup. We perform our experiments on 100 random images from MNIST and CIFAR-10. For MNIST, we train our own CNN baseline model and robust model, both with accuracy above 99 percent. For CIFAR-10, we use a baseline wide ResNet model with 94 percent accuracy and a pretrained robust model with 87 percent accuracy. In our experiments, we compare simple gradients (Grad), Integrated Gradients (IG), ε-LRP (LRP), Guided Backpropagation (GBP), and Grad-CAM imposed on Guided Backpropagation (GradCam), with our generalized Smooth-Grad (SG) technique derived in (3). To compute the sensitivity scores defined in Definitions 2.1 and 2.2, we randomly draw 50 points by Monte-Carlo sampling. We choose the distance metrics $D$ and $\rho$ to be norm-induced metrics in all the experiments. We set the perturbation radius in Definition 2.1 to the same value for both MNIST and CIFAR-10. To allow fair comparisons among different explanation methods, we normalize each explanation to have unit norm before calculating the sensitivity. For adversarial training, we use norm-bounded perturbations for both MNIST and CIFAR-10.
Metrics. In all our experiments, we compare explanation mechanisms on two metrics: faithfulness and sensitivity. To evaluate the faithfulness of an explanation to the actual prediction, we modify an evaluation method proposed and later adopted in prior work, which measures the correlation between the sum of the attributions for a given set of features and the change in the target output after removing those features. However, since setting feature values to zero would introduce bias and favor attributions on bright pixels with higher values, we instead measure the output change when the selected feature values are decreased by amounts sampled from a uniform distribution, truncated so that the perturbed values remain in the valid range. We choose $N$ to be 300 for MNIST and 500 for CIFAR-10, and estimate the correlation by randomly sampling 500 subsets of features for each data point.
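The faithfulness estimator described above can be sketched as follows (hypothetical function and parameter names, assuming numpy; the baseline perturbation is uniform noise truncated to the valid input range, as in the text):

```python
import numpy as np

def faithfulness(f, x, attribution, subset_size, n_subsets=500, rng=0):
    """Correlation between the summed attributions of a random feature subset
    and the drop in f when that subset is perturbed toward a random baseline
    (uniform noise, truncated so inputs stay in [0, 1])."""
    g = np.random.default_rng(rng)
    attr_sums, deltas = [], []
    fx = f(x)
    for _ in range(n_subsets):
        idx = g.choice(len(x), size=subset_size, replace=False)
        x_pert = x.copy()
        noise = g.uniform(0.0, 1.0, size=subset_size)
        x_pert[idx] = np.clip(x[idx] - noise, 0.0, 1.0)  # truncated baseline
        attr_sums.append(attribution[idx].sum())
        deltas.append(fx - f(x_pert))
    return np.corrcoef(attr_sums, deltas)[0, 1]

# For a linear model f(x) = w . x, the Gradient*Input attribution w * x
# should correlate positively with the output change under removal.
w = np.array([2.0, -1.0, 0.5, 3.0, 0.0])
f = lambda x: float(np.dot(w, x))
x0 = np.full(5, 0.8)
rho = faithfulness(f, x0, attribution=w * x0, subset_size=2)
```

The random baseline makes each subset's output drop noisy, so even for this exactly linear model the correlation is high but below one; less faithful attributions would drive it toward zero.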
4.1 Explanation Sensitivity and Faithfulness
In addition to comparing various explanation techniques described above, we also compare the sensitivity and faithfulness of explanations obtained using the explanation modification approaches in Section 3. Given these modification approaches, a key question we ask is: would lowering the sensitivity also lower the faithfulness of the explanation?
Smooth-Grad. We first investigate the sensitivity and faithfulness of various explanation methods and of their Smooth-Grad versions derived in Eq. (3). For Smooth-Grad, we set the radius R in (3) to 0.3 for both MNIST and CIFAR-10. We summarize the results on the MNIST and CIFAR-10 datasets on the left-hand side of Table 1. We observe that applying Smooth-Grad decreases the sensitivity and increases the faithfulness for almost all base explanations on both datasets (the only exception is IG-SG, which has a sensitivity similar to IG for the baseline CIFAR-10 model; the reason may be that IG and SG both contain kernel-averaging operations). This shows that by applying Smooth-Grad, we can achieve less sensitive explanations with much improved faithfulness.
Adversarial Training. Inspired by our findings in Section 3.2, we now ask: does an adversarially robust network provide less sensitive and more faithful explanations? To answer this question, we report sensitivity and faithfulness measurements for all explanations with respect to a baseline model and to robust models obtained through adversarial training. We further report results for other adversarially robust models in the appendix. We summarize the results on the right-hand side of Table 1. We observe that adversarially robust networks generally have lower sensitivity scores for Grad and Grad-SG. This corroborates Theorem 3.1 and Proposition 2.2, which support the claim that an adversarially trained model has lower gradient explanation sensitivity. We also observe that adversarially robust networks lead to lower sensitivity for the other explanations, which corroborates Proposition 2.3 and Corollary A.1, as they show that the sensitivities of Gradient*Image and IG are upper bounded by increasing functions of the gradient sensitivity. Although we do not show upper bounds on the sensitivity of GBP and GradCam, we observe that their sensitivities are closely related to the gradient sensitivity. This explains the decreased sensitivity of all explanations on adversarially robust networks. Furthermore, the faithfulness measures of all explanations are greatly improved for a robust model. We additionally report the sensitivity and faithfulness of Grad and Grad-SG for models with different robustness levels (corresponding to accuracy against PGD-attacked inputs) on MNIST in Table 2. We find that as the robustness level increases, the sensitivity of gradients decreases and the faithfulness increases. We thus validate that adversarially robust networks lead to improved explanations, with lower sensitivity and higher faithfulness to the model.
In the first row of each example in Figure 2, we visualize gradient explanations of images from MNIST and CIFAR-10, along with the corresponding explanations from Smooth-Grad and from an adversarially trained model. In the second row of each example, we additionally show the explanation that varies the most after perturbing the image using the random attack of Ghorbani et al.; the corresponding attacked image is in the third row. We observe that the explanations from Smooth-Grad and from adversarially trained models are less sensitive than the vanilla gradient explanation and less vulnerable to random attacks. This qualitatively shows that our modifications provide more robust and faithful explanations.
4.3 A toy example
We now consider a simple toy example to illustrate why SmoothGrad might result in more faithful explanations. Consider a piecewise-defined function in Euclidean space that can easily be shown to be continuous, but whose gradient switches abruptly between the two pieces. We visualize the gradient in Figure 3: it is very sensitive with respect to the input. A point and a nearby perturbed point have nearly identical function values but very different gradients. Apart from being sensitive, the gradient is also unfaithful to the function output: the gradient at the original point implies that only one feature is relevant to the function value, yet increasing the other feature changes the function value substantially, which the gradient explanation clearly fails to reflect. Here, gradient-SG (computed with R in (3) set to a large enough value) averages over both regimes, which is more faithful to the function output and less sensitive, since it is nearly constant across inputs. This toy example provides insight into how SmoothGrad may achieve more faithful explanations.
5 Related Work
We provide a brief and necessarily incomplete review of the burgeoning recent work on attribution-based explanation mechanisms. One family comprises perturbation-based methods, which measure the prediction difference after perturbing a set of features; this has been applied to CNNs using grey-patch occlusion, and further improved by Zintgraf et al. and Chang et al. Another prominent class of attribution-based explanations comprises backpropagation-based methods, which compute attributions from gradients [6, 21] or gradient variants [28, 24, 19]. It has been shown that ε-LRP, DeepLIFT, and Integrated Gradients can also be seen as variants of gradient explanations.
To remove noise from the gradient saliency map, Kindermans et al. propose to compute the signal of the image by removing distractors. SmoothGrad can be layered on top of existing methods by generating noisy images via additive Gaussian noise and averaging the gradients of the sampled images. Another form of sensitivity analysis approximates the behavior of a complex model by a locally linear interpretable model. The reliability of these attribution explanations is another problem of interest. Adebayo et al. show that several saliency methods are insensitive to random perturbations in parameter space, generating the same saliency maps even when the parameters are randomized. Montavon et al. propose to use continuity as a measure of explanation quality, observe that discontinuities may occur for gradient-based explanations, and show that deep Taylor LRP achieves continuous explanations compared to simple gradient explanations. However, they do not measure the amount of "sensitivity" of continuous explanations, and therefore cannot compare or improve explanations that are already continuous.
In recent work, Ghorbani et al. empirically demonstrate that adversarial attacks on some gradient-based explanations are possible. In parallel work, Alvarez-Melis and Jaakkola propose to measure the robustness of explanations using local Lipschitz constants. However, they focus only on evaluating the sensitivity of explanations, while we also provide a calculus for deriving the sensitivity of complex explanations and show how to optimize explanations with respect to the measure. Alvarez-Melis and Jaakkola and Lee et al. focus on training neural networks with less sensitive explanations. Ross and Doshi-Velez argue that adding a gradient norm penalty to the training objective makes the predictions and gradient explanations of the resulting network more robust. Similar conclusions can be found in Tsipras et al. This empirical finding can be explained by Theorem 3.1, which shows that adversarially robust networks have lower gradient sensitivity (and empirically more faithful explanations).
6 Conclusion
We propose an objective evaluation metric, naturally termed sensitivity, for machine learning explanations. One of our key contributions is a calculus for bounding the sensitivities of general explanation methods, which we instantiate on a broad array of existing explanation methods; our bounds for the many recently proposed gradient-based explanations underscore their sensitivity theoretically, corroborating empirical observations in recent papers. We then propose two approaches to improve the sensitivity of explanations, by modifying the explanation mechanism and by retraining the model. Finally, we validate in our experiments that by lowering the sensitivity of explanations, we achieve more faithful explanations.
- Adebayo et al.  Julius Adebayo, Justin Gilmer, Michael Muelly, Ian Goodfellow, Moritz Hardt, and Been Kim. Sanity checks for saliency maps. In Advances in Neural Information Processing Systems, pages 9525–9536, 2018.
- Alvarez-Melis and Jaakkola [2018a] David Alvarez-Melis and Tommi S. Jaakkola. On the robustness of interpretability methods. arXiv preprint arXiv:1806.08049, 2018a.
- Alvarez-Melis and Jaakkola [2018b] David Alvarez-Melis and Tommi S. Jaakkola. Towards robust interpretability with self-explaining neural networks. In Advances in Neural Information Processing Systems, pages 7786–7795, 2018b.
- Ancona et al.  Marco Ancona, Enea Ceolini, Cengiz Öztireli, and Markus Gross. A unified view of gradient-based attribution methods for deep neural networks. International Conference on Learning Representations, 2018.
- Bach et al.  Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one, 10(7):e0130140, 2015.
- Baehrens et al.  David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, and Klaus-Robert Müller. How to explain individual classification decisions. Journal of Machine Learning Research, 11(Jun):1803–1831, 2010.
- Chang et al.  Chun-Hao Chang, Elliot Creager, Anna Goldenberg, and David Duvenaud. Explaining image classifiers by counterfactual generation. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=B1MXz20cYQ.
- Datta et al.  Anupam Datta, Shayak Sen, and Yair Zick. Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. In Security and Privacy (SP), 2016 IEEE Symposium on, pages 598–617. IEEE, 2016.
- Ghorbani et al.  Amirata Ghorbani, Abubakar Abid, and James Zou. Interpretation of neural networks is fragile. AAAI, 2019.
- Kindermans et al.  Pieter-Jan Kindermans, Kristof T Schütt, Maximilian Alber, Klaus-Robert Müller, and Sven Dähne. Patternnet and patternlrp–improving the interpretability of neural networks. International Conference on Learning Representations, 2018.
- Kulesza et al.  Todd Kulesza, Margaret Burnett, Weng-Keen Wong, and Simone Stumpf. Principles of explanatory debugging to personalize interactive machine learning. In Proceedings of the 20th International Conference on Intelligent User Interfaces, pages 126–137. ACM, 2015.
- Lee et al.  Guang-He Lee, David Alvarez-Melis, and Tommi S. Jaakkola. Towards robust, locally linear deep networks. In International Conference on Learning Representations, 2019.
- Madry et al.  Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
- Miller  Tim Miller. Explanation in artificial intelligence: Insights from the social sciences. arXiv preprint arXiv:1706.07269, 2017.
- Montavon et al.  Grégoire Montavon, Wojciech Samek, and Klaus-Robert Müller. Methods for interpreting and understanding deep neural networks. Digital Signal Processing, 2017.
- Raghunathan et al.  Aditi Raghunathan, Jacob Steinhardt, and Percy Liang. Certified defenses against adversarial examples. arXiv preprint arXiv:1801.09344, 2018.
- Ribeiro et al.  Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Why should i trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144. ACM, 2016.
- Ross and Doshi-Velez  Andrew Slavin Ross and Finale Doshi-Velez. Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. arXiv preprint arXiv:1711.09404, 2017.
- Selvaraju et al.  Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In International conference on computer vision, 2017.
- Shrikumar et al.  Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. Learning important features through propagating activation differences. International Conference on Machine Learning, 2017.
- Simonyan et al.  Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
- Sinha et al.  Aman Sinha, Hongseok Namkoong, and John Duchi. Certifiable distributional robustness with principled adversarial training. arXiv preprint arXiv:1710.10571, 2017.
- Smilkov et al.  Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda Viégas, and Martin Wattenberg. Smoothgrad: removing noise by adding noise. arXiv preprint arXiv:1706.03825, 2017.
- Springenberg et al.  Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, and Martin Riedmiller. Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806, 2014.
- Sundararajan et al.  Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In International Conference on Machine Learning, 2017.
- Tsipras et al.  Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, and Aleksander Madry. Robustness may be at odds with accuracy. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=SyxAb30cY7.
- Wong and Kolter  Eric Wong and Zico Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In International Conference on Machine Learning, pages 5283–5292, 2018.
- Zeiler and Fergus  Matthew D Zeiler and Rob Fergus. Visualizing and understanding convolutional networks. In European conference on computer vision, pages 818–833. Springer, 2014.
- Zintgraf et al.  Luisa M Zintgraf, Taco S Cohen, Tameem Adel, and Max Welling. Visualizing deep neural network decisions: Prediction difference analysis. arXiv preprint arXiv:1702.04595, 2017.
Appendix A Appendix
A.1 Additional Calculus
We say a function $\Phi$ is $\epsilon$-locally constant around $x$ with respect to a metric $D$ if, for all $y$ such that $\|y - x\| \le r$, $\Phi$ satisfies $D(\Phi(y), \Phi(x)) \le \epsilon$.
This notion of local constancy naturally leads to the following bound on the sensitivity of explanations:
Suppose the explanation $\Phi(f, \cdot)$ is $\epsilon$-locally constant around $x$ with respect to metric $D$, and $f$ is a continuous function in $x$. Then $\mathrm{SENS}_{\max}(\Phi, f, x, r) \le \epsilon$.
Proof of Proposition A.1.
For any constant $c$, distance $D$ satisfying $D(cu, cv) = |c|\,D(u, v)$ for all $u, v$, and an explanation $\Phi$ of a predictor $f$, we have that: $\mathrm{SENS}_{\max}(c\,\Phi, f, x, r) = |c|\,\mathrm{SENS}_{\max}(\Phi, f, x, r)$.
Proof of Proposition A.2.
Suppose we apply the Integrated Gradients modification of an explanation $\Phi$ for the model $f$, with the baseline set to $0$, which we denote by $\mathrm{IG}(\Phi)$, and suppose the distance metric is a Minkowski distance. Then its sensitivity can be bounded as:
$\mathrm{SENS}_{\max}(\mathrm{IG}(\Phi), f, x, r) \le \mathbb{E}_{z \sim \rho}\left[\mathrm{SENS}_{\max}(\Phi, f, z, r)\right],$
where $\rho$ is the density of the uniform distribution over points on the line from the baseline point $0$ to $x$.
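For concreteness, the Integrated Gradients modification with baseline $0$ can be sketched via a midpoint Riemann-sum approximation of the path integral (illustrative code with an analytic toy gradient; the names and step count are assumptions, not a reference implementation):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline=None, steps=1000):
    """Midpoint Riemann-sum approximation of
    IG_i(x) = (x_i - b_i) * integral_0^1 [grad f(b + a (x - b))]_i da,
    with grad_fn returning the gradient of the predictor f."""
    b = np.zeros_like(x) if baseline is None else baseline
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.mean([grad_fn(b + a * (x - b)) for a in alphas], axis=0)
    return (x - b) * avg_grad

# For the toy predictor f(x) = sum(x**2) with gradient 2x and baseline 0,
# IG(x) = x * mean_a(2 a x) = x**2, and the attributions sum to f(x) - f(0).
grad_f = lambda z: 2.0 * z
x = np.array([1.0, -2.0])
ig = integrated_gradients(grad_f, x)
```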
A.2 Proof of Proposition 2.1
Proof of Proposition 2.1.
If , then there exists $y$ such that , and . Therefore,
If , and suppose for all $y$ satisfying
This contradicts the premise that ; therefore, by contradiction, there exists $y$ such that , and
A.3 Proof of Proposition 2.2
Proof of Proposition 2.2.
A.4 Proof of Proposition 2.3
Proof of Proposition 2.3.
A.5 Proof of Proposition 2.4
Proof of Proposition 2.4.
A.6 Proof of Theorem 3.1
We consider the logistic loss, a convex surrogate of the 0-1 loss, which is defined as
We now show that minimizing the adversarial risk results in classifiers with smooth gradients. First note that the adversarial risk can be written as
We also have
Substituting this in the previous expression gives us
This can be upper bounded as follows
where $\|\cdot\|_*$ is the dual norm of $\|\cdot\|$.
Let be defined as
Some algebra shows that it can be upper bounded by
So we have the following upper bound for our objective
Appendix B Set-Based Explanations
While the main paper focused on quantitative explanations that provide real-valued weights for each input feature, another class of explanations simply outputs a set of relevant features. Given a quantitative explanation, we can convert it to a set-based explanation by simply providing the set of the $k$ most salient features, for some small $k$. Thus, for a quantitative explanation $\Phi$, we can provide the set-based modification $\Phi_s$: we set $[\Phi_s(x)]_j = C$ if feature $j$ is among the top $k$ features (either with respect to signed magnitude or magnitude, depending on the type of explanation), and otherwise set $[\Phi_s(x)]_j = 0$, where $C$ is some normalizing constant. The benefit of using such set-based explanations is that, by reducing the amount of possibly less salient information, the explanation may be easier for a human to interpret. While some of the calculus we developed above may not seem directly applicable to set-based explanations, we provide a simple proposition that upper-bounds the sensitivity of a set-based explanation given the original explanation.
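A sketch of this set-based modification (magnitude-based selection here, one of the two options mentioned above; the constant `c` stands in for the normalizing constant and is an illustrative choice):

```python
import numpy as np

def set_based(expl, k, c=1.0):
    """Set-based modification: keep the k largest-magnitude features of
    the explanation vector, setting each to the constant c (standing in
    for the normalizing constant) and all other features to 0."""
    out = np.zeros_like(expl)
    top_k = np.argsort(np.abs(expl))[-k:]   # indices of the k largest |values|
    out[top_k] = c
    return out

expl = np.array([0.1, -3.0, 0.5, 2.0])
s = set_based(expl, k=2)
# Keeps the two largest-magnitude features (indices 1 and 3).
```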
Given an explanation functional $\Phi$ and its set-based modification $\Phi_s$, where the top $k$ values in $\Phi$ are set to $C$ and the rest are set to $0$, and suppose the distance metrics used in specifying the sensitivity are set to the $\ell_1$ distance. Then the sensitivity of $\Phi_s$ can be upper-bounded in terms of that of $\Phi$.
Proof of Proposition B.1.
Consider any two explanations $\Phi(x)$ and $\Phi(y)$, and let $\Phi_s(x)$ and $\Phi_s(y)$ be their set-based explanations respectively, with the top-$k$ features set to $C$ and the others set to $0$. Let $S_x$ and $S_y$ denote the top-$k$ sets for $x$ and $y$ respectively. Here, $D$ is defined as the $\ell_1$ distance.
Let $m$ denote the number of features on which the top-$k$ sets for $\Phi(x)$ and $\Phi(y)$ differ, so that the two top-$k$ sets have exactly $m$ differences. Define sets $U$ and $V$ as:
We know that $|U| = |V| = m$, and that $U$ and $V$ are disjoint by definition. We fix an arbitrary order for $U$ and $V$, so that $u_i$ is the $i$th element of $U$ and $v_j$ is the $j$th element of $V$. By definition of the top-$k$ set we have:
Moreover, we have
Combining these results, we have:
This holds for all $y$; therefore,
While this bound is not necessarily tight, it provides insight into why the sensitivity of a set-based explanation may be much lower than that of the underlying quantitative explanation. In particular, the sensitivity of set-based explanations does not account for changes in the values of features that remain within the top-$k$ set or within its complement. In Figure 4, we show some examples of set-based gradient saliency before and after applying set-based SmoothGrad in (3), and we observe that the smoothed saliency maps are less noisy and more focused on the object of interest.