GAN-based Generation and Automatic Selection of Explanations for Neural Networks

by   Saumitra Mishra, et al.

One way to interpret trained deep neural networks (DNNs) is by inspecting characteristics that neurons in the model respond to, such as by iteratively optimising the model input (e.g., an image) to maximally activate specific neurons. However, this requires a careful selection of hyper-parameters to generate interpretable examples for each neuron of interest, and current methods rely on a manual, qualitative evaluation of each setting, which is prohibitively slow. We introduce a new metric that uses Fréchet Inception Distance (FID) to encourage similarity between model activations for real and generated data. This provides an efficient way to evaluate a set of generated examples for each setting of hyper-parameters. We also propose a novel GAN-based method for generating explanations that enables an efficient search through the input space and imposes a strong prior favouring realistic outputs. We apply our approach to a classification model trained to predict whether a music audio recording contains singing voice. Our results suggest that this proposed metric successfully selects hyper-parameters leading to interpretable examples, avoiding the need for manual evaluation. Moreover, we see that examples synthesised to maximise or minimise the predicted probability of singing voice presence exhibit vocal or non-vocal characteristics, respectively, suggesting that our approach is able to generate suitable explanations for understanding concepts learned by a neural network.


Towards Visual Explanations for Convolutional Neural Networks via Input Resampling

The predictive power of neural networks often costs model interpretabili...

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

Deep neural networks (DNNs) have demonstrated state-of-the-art results o...

Neuron to Graph: Interpreting Language Model Neurons at Scale

Advances in Large Language Models (LLMs) have led to remarkable capabili...

Detection Accuracy for Evaluating Compositional Explanations of Units

The recent success of deep learning models in solving complex problems a...

audioLIME: Listenable Explanations Using Source Separation

Deep neural networks (DNNs) are successfully applied in a wide variety o...

Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks

To verify and validate networks, it is essential to gain insight into th...

MAIRE – A Model-Agnostic Interpretable Rule Extraction Procedure for Explaining Classifiers

The paper introduces a novel framework for extracting model-agnostic hum...

Please sign up or login with your details

Forgot password? Click here to reset