
Improving Deep Learning Interpretability by Saliency Guided Training

by Aya Abdelsalam Ismail, et al.
University of Maryland

Saliency methods have been widely used to highlight important input features in model predictions. Most existing methods generate saliency maps by backpropagating through a modified gradient function, so noisy gradients can produce unfaithful feature attributions. In this paper, we tackle this issue and introduce a saliency guided training procedure for neural networks that reduces the influence of noisy gradients on predictions while retaining the model's predictive performance. Our procedure iteratively masks features with small, potentially noisy gradients while maximizing the similarity of the model's outputs on masked and unmasked inputs. We apply saliency guided training to various synthetic and real datasets from computer vision, natural language processing, and time series, across diverse neural architectures including Recurrent Neural Networks, Convolutional Networks, and Transformers. Through qualitative and quantitative evaluations, we show that saliency guided training significantly improves model interpretability across these domains while preserving predictive performance.
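The abstract's training loop can be sketched in a few lines of PyTorch. This is a minimal illustration assuming the paper's general recipe, not the authors' released code: the function name `saliency_guided_step`, the mask value of zero, the per-example count `k` of masked features, and the use of KL divergence as the output-similarity term are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def saliency_guided_step(model, x, y, k, lam=1.0):
    """One step of (a sketch of) saliency guided training:
    mask the k input features with the smallest gradient magnitude,
    then minimize cross-entropy on the original input plus the KL
    divergence between outputs on masked and unmasked inputs."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    # Saliency: gradient of the true-class scores w.r.t. the input.
    grad = torch.autograd.grad(logits.gather(1, y[:, None]).sum(),
                               x, retain_graph=True)[0]
    # Indices of the k lowest-|gradient| features per example.
    _, low_idx = grad.abs().view(x.size(0), -1).topk(k, largest=False)
    x_masked = x.detach().view(x.size(0), -1).clone()
    x_masked.scatter_(1, low_idx, 0.0)  # zero out "unimportant" features
    x_masked = x_masked.view_as(x)

    logits_masked = model(x_masked)
    ce = F.cross_entropy(logits, y)
    kl = F.kl_div(F.log_softmax(logits_masked, dim=1),
                  F.softmax(logits, dim=1), reduction="batchmean")
    return ce + lam * kl

# Tiny usage example on random data with a linear classifier.
model = torch.nn.Linear(8, 3)
x = torch.randn(4, 8)
y = torch.randint(0, 3, (4,))
loss = saliency_guided_step(model, x, y, k=2)
loss.backward()  # gradients flow to the model parameters as usual
```

In an actual training run this loss would replace the plain cross-entropy inside the optimizer loop, so the model is rewarded both for correct predictions and for being insensitive to the features its own saliency map marks as unimportant.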



Related research

Benchmarking Deep Learning Interpretability in Time Series Predictions
Why are Saliency Maps Noisy? Cause of and Solution to Noisy Saliency Maps
Full-Jacobian Representation of Neural Networks
Abs-CAM: A Gradient Optimization Interpretable Approach for Explanation of Convolutional Neural Networks
Gradient Mask: Lateral Inhibition Mechanism Improves Performance in Artificial Neural Networks
Rethinking Positive Aggregation and Propagation of Gradients in Gradient-based Saliency Methods
Input-Cell Attention Reduces Vanishing Saliency of Recurrent Neural Networks