Gradient strikes back: How filtering out high frequencies improves explanations

07/18/2023
by Sabine Muzellec, et al.

Recent years have witnessed an explosion in the development of novel prediction-based attribution methods, which have slowly been supplanting older gradient-based methods for explaining the decisions of deep neural networks. However, it remains unclear why prediction-based methods outperform gradient-based ones. Here, we start with an empirical observation: the two approaches yield attribution maps with very different power spectra, with gradient-based methods revealing more high-frequency content than prediction-based methods. This observation raises multiple questions: What is the source of this high-frequency information, and does it truly reflect decisions made by the system? And why would the absence of high-frequency information in prediction-based methods yield better explainability scores across multiple metrics? We analyze the gradients of three representative visual classification models and observe that they contain noisy information concentrated in the high frequencies. Furthermore, our analysis reveals that the downsampling operations used in Convolutional Neural Networks (CNNs) appear to be a significant source of this high-frequency content, suggesting aliasing as a possible underlying cause. We then apply an optimal low-pass filter to attribution maps and demonstrate that it improves gradient-based attribution methods. We show that (i) removing high-frequency noise yields significant improvements in the explainability scores obtained with gradient-based methods across multiple models, leading to (ii) a novel ranking of state-of-the-art methods with gradient-based methods at the top. We believe that our results will spur renewed interest in simpler and computationally more efficient gradient-based methods for explainability.
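As a rough sketch of the idea (not the authors' implementation), the snippet below computes a vanilla-gradient attribution map for an off-the-shelf torchvision classifier and then suppresses its high-frequency content with a Gaussian low-pass filter in the Fourier domain. The model choice, the channel reduction, and the cutoff `sigma` are illustrative assumptions; the paper selects an optimal cutoff, whose derivation the abstract does not spell out.

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

def gradient_saliency(model, x, target):
    """Gradient of the target logit w.r.t. the input, reduced over channels."""
    x = x.clone().requires_grad_(True)
    model(x)[0, target].backward()
    return x.grad[0].abs().max(dim=0).values  # (H, W) attribution map

def lowpass(attribution, sigma):
    """Suppress high spatial frequencies with a Gaussian mask in Fourier space."""
    h, w = attribution.shape
    fy = torch.fft.fftfreq(h).reshape(-1, 1)
    fx = torch.fft.fftfreq(w).reshape(1, -1)
    mask = torch.exp(-(fx ** 2 + fy ** 2) / (2 * sigma ** 2))
    return torch.fft.ifft2(torch.fft.fft2(attribution) * mask).real

x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed ImageNet image
with torch.no_grad():
    target = model(x).argmax(dim=1).item()

raw = gradient_saliency(model, x, target)
smoothed = lowpass(raw, sigma=0.1)  # hypothetical cutoff, in cycles/pixel
```

Filtering in the Fourier domain keeps the frequency cutoff explicit; a spatial-domain Gaussian blur of matching width would have a similar smoothing effect on the attribution map.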


