Evaluation of Interpretability Methods and Perturbation Artifacts in Deep Neural Networks

03/06/2022
by Lennart Brocki, et al.

The challenge of interpreting predictions from deep neural networks has prompted the development of numerous interpretability methods. Many interpretability methods attempt to quantify the importance of input features with respect to the class probabilities; these are called importance estimators or saliency maps. A popular approach to evaluating such methods is to perturb the input features deemed important for a prediction and observe the resulting decrease in accuracy. However, perturbation-based evaluation may confound the sources of accuracy degradation. We conduct computational experiments that allow us to empirically estimate the fidelity of interpretability methods and the contribution of perturbation artifacts. All considered importance estimators clearly outperform a random baseline, which contradicts the findings of ROAR [arXiv:1806.10758]. We further compare our results to the crop-and-resize evaluation framework [arXiv:1705.07857] and find them largely in agreement. Our study suggests that the impact of artifacts can be estimated, and that interpretability methods can therefore be evaluated empirically without retraining.
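As a concrete illustration of the perturbation-based evaluation protocol described in the abstract, the following is a minimal sketch in PyTorch. It is not the paper's exact procedure: it assumes a pretrained classifier, uses a plain gradient saliency map as the importance estimator, and replaces the top-ranked pixels with a constant fill value; the fractions perturbed, the perturbation values, and the estimators studied in the paper may differ.

import torch
import torch.nn.functional as F

def gradient_saliency(model, images, labels):
    # Plain gradient importance estimator: |d loss / d input|, summed over channels.
    images = images.clone().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    (grad,) = torch.autograd.grad(loss, images)
    return grad.abs().sum(dim=1)                      # (N, H, W) importance map

def perturb_top_fraction(images, saliency, fraction, fill_value=0.0):
    # Replace the `fraction` most important pixels with a constant fill value.
    n, _, h, w = images.shape
    k = max(1, int(fraction * h * w))
    flat = saliency.reshape(n, -1)
    top = flat.topk(k, dim=1).indices
    keep = torch.ones_like(flat)
    keep.scatter_(1, top, 0.0)                        # 0 where a pixel is perturbed
    keep = keep.reshape(n, 1, h, w)
    return images * keep + fill_value * (1.0 - keep)

@torch.no_grad()
def accuracy(model, images, labels):
    return (model(images).argmax(dim=1) == labels).float().mean().item()

def perturbation_curve(model, images, labels, fractions=(0.1, 0.3, 0.5)):
    # Accuracy after perturbing increasing fractions of pixels ranked by the
    # estimator vs. a random ranking; a faithful estimator should degrade
    # accuracy faster than the random baseline.
    saliency = gradient_saliency(model, images, labels)
    random_ranking = torch.rand_like(saliency)
    return {
        frac: {
            "estimator": accuracy(model, perturb_top_fraction(images, saliency, frac), labels),
            "random": accuracy(model, perturb_top_fraction(images, random_ranking, frac), labels),
        }
        for frac in fractions
    }

Accuracy under the estimator-ranked perturbation falling faster than under the random ranking is the usual sign of a faithful estimator; disentangling how much of that drop comes from perturbation artifacts rather than lost information is the question the paper addresses.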

