Improving Deep Learning Interpretability by Saliency Guided Training

11/29/2021
by   Aya Abdelsalam Ismail, et al.

Saliency methods are widely used to highlight important input features in model predictions. Most existing methods use backpropagation on a modified gradient function to generate saliency maps. Thus, noisy gradients can result in unfaithful feature attributions. In this paper, we tackle this issue and introduce a saliency guided training procedure for neural networks that reduces noisy gradients used in predictions while retaining the predictive performance of the model. Our saliency guided training procedure iteratively masks features with small and potentially noisy gradients while maximizing the similarity of model outputs for both masked and unmasked inputs. We apply the saliency guided training procedure to various synthetic and real data sets from computer vision, natural language processing, and time series across diverse neural architectures, including Recurrent Neural Networks, Convolutional Networks, and Transformers. Through qualitative and quantitative evaluations, we show that the saliency guided training procedure significantly improves model interpretability across various domains while preserving predictive performance.
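
As an illustration only, the masking-and-matching step described in the abstract can be sketched for a toy linear softmax classifier in plain Python. Everything concrete here is an assumption made for the sketch rather than something stated above: the linear model, the use of the loss gradient with respect to the input as the saliency score, the zero fill value for masked features, KL divergence as the similarity measure, and the names `saliency_guided_loss`, `k`, and `lam`.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Toy linear softmax classifier; `weights` holds one row per class."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return softmax(logits)

def input_gradient(weights, x, label):
    """Gradient of the cross-entropy loss with respect to the input x.

    For a linear softmax model: d loss / d x_j = sum_c (p_c - 1[c == label]) * W[c][j].
    """
    probs = predict(weights, x)
    return [
        sum((probs[c] - (1.0 if c == label else 0.0)) * weights[c][j]
            for c in range(len(weights)))
        for j in range(len(x))
    ]

def mask_low_saliency(x, grad, k, fill=0.0):
    """Replace the k features with the smallest |gradient| by `fill` (assumed value)."""
    order = sorted(range(len(x)), key=lambda j: abs(grad[j]))
    masked = list(x)
    for j in order[:k]:
        masked[j] = fill
    return masked

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), used here as the (assumed) output-similarity penalty."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def saliency_guided_loss(weights, x, label, k, lam=1.0):
    """Cross-entropy on the original input plus lam * KL(original || masked)."""
    probs = predict(weights, x)
    grad = input_gradient(weights, x, label)
    x_masked = mask_low_saliency(x, grad, k)
    probs_masked = predict(weights, x_masked)
    cross_entropy = -math.log(probs[label] + 1e-12)
    return cross_entropy + lam * kl_divergence(probs, probs_masked), x_masked

# Hypothetical toy data: feature 1 is ignored by the model, feature 2 barely matters.
weights = [[2.0, 0.0, 0.01], [-2.0, 0.0, -0.01]]
x = [1.0, 5.0, 1.0]
loss, x_masked = saliency_guided_loss(weights, x, label=0, k=1)
```

In a training loop, a loss of this shape would be minimized over the model parameters at each step, so that the model is pushed to give the same output whether or not the low-gradient features are present. With a deep network the input gradient would come from automatic differentiation rather than the closed form used in this sketch.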


