A Forward-Backward Approach for Visualizing Information Flow in Deep Networks

11/16/2017 ∙ by Aditya Balu, et al.

We introduce a new, systematic framework for visualizing information flow in deep networks. Specifically, given any trained deep convolutional network model and a test image, our method produces a compact support in the image domain that corresponds to a (high-resolution) feature that contributes to the given explanation. Our method is both computationally efficient and numerically robust. We present several preliminary numerical results that support the benefits of our framework over existing methods.




1 Introduction

Deep neural networks have resulted in widespread and compelling advances in a variety of machine learning tasks such as object recognition, image segmentation, anomaly detection, machine translation, and synthesis. However, these advances have often been accompanied by a significant reduction in interpretability, or the ability to visualize the flow of information being extracted at various layers of abstraction. In contrast to traditional rule-based learning methods (which search for specific, semantically hand-crafted features or patterns), deep networks often produce decisions that are hard to decipher or justify for a given test data sample, even though their aggregate generalizability measured with respect to a hold-out test dataset is excellent. Unpacking the “black-box” nature of deep networks has been identified as a key challenge by several recent works Springenberg et al. (2014); Selvaraju et al. (2016); Shrikumar et al. (2017); Sundararajan et al. (2016).

In this work, we focus on the task of object detection in images. Broadly, algorithms for interpreting the action of deep networks for this task can be grouped as follows. Class-discriminative approaches, such as Class Activation Mappings (CAM) Zhou et al. (2016), or its gradient-based variant Selvaraju et al. (2016), produce a support in the original image domain that approximately corresponds to a given object class detected in that image. However, such methods are coarse and only produce low-resolution visualizations, and as such cannot be directly applied to very high resolution images. On the other hand, pixel-space gradient-based methods such as deconvolution networks Zeiler and Fergus (2014) and guided back-propagation Springenberg et al. (2014) produce fine-grained features in a given image. However, gradient-based methods either suffer from significant computational efficiency concerns, or are susceptible to saturation phenomena due to vanishing/exploding gradients, or both. In Shrikumar et al. (2017), this issue is alleviated by using a second reference input to stabilize the estimates. However, choosing this reference image is qualitative and can be challenging. Finally, model-agnostic approaches such as LIME Ribeiro et al. (2016) are theoretically sound and can be applied for interpreting deep convolutional networks, but involve solving challenging optimization problems.

In this short paper, we outline a systematic framework for visualizing information flow in deep convolutional networks that addresses both the computational efficiency and the numerical robustness issues described above. We present several preliminary numerical results that support the benefits of our framework over existing methods.

At a high level, our approach is based on a novel forward-backward scheme which operates as follows. Consider a trained deep convolutional network model and a given test image for which our model is able to identify the existence of a given target class. Then, our algorithm produces as output a support (i.e., a subset of pixel locations) corresponding to the class predicted by our model, in a manner similar to pixel-space gradient methods. However, in contrast with gradient-based approaches, our algorithm leverages not only the backward (class) information flow from the output layer(s) to the input, but also the forward (image) information extracted at various layers of abstraction. See Figure 1.

Figure 1: Overview of the forward-backward scheme for visualizing the information flow. Forward information (F) of the model is combined with the backward information (B) and flows through the network.

More specifically, our method has the following distinguishing characteristics:

  1. We propose a mathematically principled approach to achieve “backward information flow” within a deep convolutional network, leveraging the ideas proposed in the deconvolutional networks approach of Zeiler and Fergus (2014). However, that approach is computationally very expensive since it requires solving a sparse recovery problem for each layer of the network, and this limits the depth of a network on which it is applicable. On the other hand, our approach only needs simple application of matrix adjoints and (element-wise) nonlinearities for each convolutional layer and can be easily implemented on very deep networks.

  2. We propose a systematic way of using the forward information to guide the backward traversal. In particular, we use the forward information to extract a support within a given layer of representation that best corresponds to a specific feature map. We achieve this using a novel masking scheme which transparently combines both forward and backward information flows through the network.

  3. As opposed to gradient-based schemes (such as Selvaraju et al. (2016); Springenberg et al. (2014)) that aggregate the information from all feature maps, our algorithm produces binary support estimates layer by layer. In that sense, our method avoids the numerical stability and robustness issues that may arise from the well-known problem of exploding/vanishing gradients and that can potentially affect the interpretability. In particular, in contrast with Shrikumar et al. (2017), we remove the need for any separate reference image, and our method only involves making two passes through the network for a given image.

We present preliminary numerical evidence supporting our method, and demonstrate advantages over gradient-based methods such as guided backpropagation Springenberg et al. (2014).

2 Proposed Approach: Forward-Backward Interpretability

We now describe our scheme for visualizing a convolutional neural network. We term our method Forward-Backward Interpretability (or FBI for short). Given a test image, the goal is to identify important regions that explain the prediction of the learned network. To do this, we propagate the class-probed information back to the image pixel space through the complete network, using the guidance of learned model weights as well as the forward activations of each neuron in the network.

Our approach shares several similarities with the deconvolutional networks (DeconvNet) approach introduced in Zeiler and Fergus (2014). However, instead of reconstructing lower layer feature maps from higher layer activations as in DeconvNet, we merely try to identify important regions (supports) preserved in the forward activations in each layer from the backward (class-specific) information flow.

Suppose that we have already trained the network to an optimal state. In the forward pass, an image is presented to the network, and the activations in the entire network are computed. To explain the classification, we consider the class indicator vector $e$, where $e_i = 1$ for the predicted class and zero otherwise, and back-propagate this information to the input space. We use $e$ as the input of the backward pass to approximately “invert” each layer, while iteratively filtering these inverses using the forward activations. The process is repeated until the input layer is reached.
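As a small illustration of the class indicator vector that seeds the backward pass, the sketch below builds a one-hot vector from a vector of class probabilities. The helper name `class_indicator` and the example probabilities are hypothetical and are not taken from the paper.

```python
import numpy as np

def class_indicator(probs):
    """Return a vector that is 1 at the predicted (arg-max) class and 0 elsewhere."""
    e = np.zeros_like(probs, dtype=float)
    e[np.argmax(probs)] = 1.0
    return e

# Hypothetical softmax output for a 3-class problem.
probs = np.array([0.1, 0.7, 0.2])
e = class_indicator(probs)
print(e)  # [0. 1. 0.]
```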

Dense layers. For each fully-connected layer (indexed by $l$), denote its activation as:

$$x_{l+1} = \mathrm{ReLU}(W_l x_l + b_l),$$

where $W_l$ and $b_l$ are the weights and biases of the layer, and the softmax activation is achieved at the final ($L$-th) layer. Our goal is to traverse each of these layers backwards. In order to achieve this, we define the “adjoint” of each operation as follows (the term “adjoint” is only loosely defined here due to the nonlinearities involved). The adjoint of the softmax layer is defined point-wise. The adjoint of the ReLU activation function is the ReLU itself. Overall, the “adjoint” of each fully connected layer combines the adjoint (transpose) of its weight matrix with the ReLU nonlinearity.
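The following is a minimal sketch of one plausible reading of the dense-layer “adjoint”, using only the two ingredients named in the text: the matrix adjoint (transpose) and the ReLU. The order of the two operations, the omission of the bias in the backward direction, and the function names are all assumptions made for illustration; the softmax adjoint is not sketched here.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def dense_forward(W, b, x):
    """Forward activation of a fully connected layer with ReLU."""
    return relu(W @ x + b)

def dense_adjoint(W, y_back):
    """Assumed backward step: apply the transposed weights, then ReLU.
    The bias is ignored here, which is an assumption of this sketch."""
    return relu(W.T @ y_back)

W = np.array([[1.0, -1.0],
              [0.5,  2.0],
              [-1.0, 0.0]])      # 3 outputs, 2 inputs (toy values)
b = np.zeros(3)
x = np.array([1.0, 2.0])

y = dense_forward(W, b, x)       # forward activation
x_back = dense_adjoint(W, y)     # signal mapped back to the layer input
print(y, x_back)                 # [0.  4.5 0. ] [2.25 9.  ]
```

Note that the backward step costs one matrix-vector product per layer, in contrast to solving a sparse recovery problem per layer.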


Forward-Backward masking. The backward information flow is now filtered using the forward activations. Specifically, we only keep those entries of the backward signal whose entry-wise product with the corresponding forward activation is above some threshold parameter. All other entries are set to zero. This enables us to identify a candidate support corresponding to an interpretable feature in the input of a given layer.
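The masking rule can be sketched in a few lines. The function name `forward_backward_mask` and the toy arrays are hypothetical; the threshold of 10.0 is borrowed from the value reported in the Results section.

```python
import numpy as np

def forward_backward_mask(forward, backward, threshold):
    """Keep backward entries whose product with the forward activation
    exceeds the threshold; zero out the rest. Also return the binary support."""
    support = (forward * backward) > threshold
    return np.where(support, backward, 0.0), support

forward = np.array([0.0, 2.0, 5.0, 1.0])    # toy forward activations
backward = np.array([3.0, 6.0, 1.0, 4.0])   # toy backward signal
masked, support = forward_backward_mask(forward, backward, threshold=10.0)
print(masked)   # [0. 6. 0. 0.]
print(support)  # [False  True False False]
```

Only the second entry survives, since its product (12) is the only one above the threshold. Raising the threshold shrinks the support further.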

Contributing feature maps. Among the many backward feature maps at the top convolutional layer, we keep only the top $k$ of them in the backward information flow. The contribution of each map is determined as the total activation of the entire map. Hence, the features irrelevant to the probed class are removed.
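A minimal sketch of this selection step, assuming the feature maps are stored as an array of shape (channels, height, width) and that "total activation" means the sum over all spatial positions of a map. The helper name `keep_top_k_maps` and the toy values are illustrative only.

```python
import numpy as np

def keep_top_k_maps(maps, k):
    """Zero out all but the k feature maps with the largest total activation.
    maps has shape (channels, height, width)."""
    totals = maps.sum(axis=(1, 2))            # one score per feature map
    top = np.argsort(totals)[::-1][:k]        # indices of the k largest scores
    kept = np.zeros_like(maps)
    kept[top] = maps[top]
    return kept, np.sort(top)

# Four toy 2x2 maps with total activations 4, 0, 8 and 2.
maps = np.stack([np.full((2, 2), v) for v in (1.0, 0.0, 2.0, 0.5)])
kept, indices = keep_top_k_maps(maps, k=2)
print(indices)  # [0 2]
```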


Max pooling. To perform an adjoint of the max pooling layer, we reshape the pooled map obtained from the backward pass and replicate (copy) its values across the domain of the max operator; the result is then evaluated entry-wise. This is similar to the analogous unpooling operation in DeconvNet; however, that approach only copies each pooled value to a single location via switches that are stored in memory. In contrast, our new scheme keeps the backward feature maps after unpooling from being overly sparse, and retains enough spatial information about the interpretation. We note that the replication step is suitable for any downsampling filter of size $2 \times 2$ and stride 2 (wherein the receptive fields are non-overlapping). For other filter sizes and stride lengths, the replicated values are averaged over the overlapping locations.
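The replication step for non-overlapping pooling windows can be sketched as follows. This covers only the copy operation for a window whose size equals its stride; the averaging used for overlapping windows and the subsequent entry-wise evaluation are not shown. The function name `unpool_replicate` is hypothetical.

```python
import numpy as np

def unpool_replicate(pooled, size=2):
    """Copy each pooled value across its (size x size) non-overlapping window."""
    return np.repeat(np.repeat(pooled, size, axis=0), size, axis=1)

pooled = np.array([[1.0, 2.0],
                   [3.0, 4.0]])
unpooled = unpool_replicate(pooled)
print(unpooled)
# [[1. 1. 2. 2.]
#  [1. 1. 2. 2.]
#  [3. 3. 4. 4.]
#  [3. 3. 4. 4.]]
```

A switch-based unpooling would instead place each value at one location per window, leaving three quarters of the entries zero in this example.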

Deconvolution. The deconvolution step is similar to DeconvNet, where we compute the adjoint by convolving the backward activation with the flipped filter weights of a corresponding filter.
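To make the adjoint relationship concrete, the sketch below treats a single-channel "valid" convolutional layer (implemented, as is usual in deep learning, as a cross-correlation) and computes its adjoint by zero-padding the backward signal and applying the same operation with the spatially flipped filter. The single-channel setting, the absence of stride, and the function names are simplifying assumptions. The final check confirms the defining property of an adjoint, namely that the inner products ⟨Ax, y⟩ and ⟨x, Aᵀy⟩ agree.

```python
import numpy as np

def conv_forward(x, w):
    """'Valid' cross-correlation of a 2-D input x with a 2-D filter w."""
    kh, kw = w.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def conv_adjoint(y, w):
    """Adjoint of conv_forward: zero-pad y, then filter with the flipped weights."""
    kh, kw = w.shape
    y_pad = np.pad(y, ((kh - 1, kh - 1), (kw - 1, kw - 1)), mode="constant")
    return conv_forward(y_pad, w[::-1, ::-1])

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))   # toy input feature map
w = rng.standard_normal((3, 3))   # toy filter
y = rng.standard_normal((3, 3))   # toy backward activation

lhs = np.sum(conv_forward(x, w) * y)   # <A x, y>
rhs = np.sum(x * conv_adjoint(y, w))   # <x, A^T y>
print(np.isclose(lhs, rhs))            # True
```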

As we traverse backwards through the network using the above operations, we successively retain a subset of pixel indices (or support) at the input of each layer that plausibly corresponds to the “interpretable” portion of the given image. In the end, we display the locations of these indices together with the values produced by the adjoint. The selectivity achieved by successive masking means that we always obtain a fairly sparse support in our final estimate; the sparsity can be controlled via an appropriate choice of the threshold parameter.

3 Results

We visualize the interpretations provided by the proposed algorithm in Table 1. We use the VGG-16 model pretrained on the ImageNet dataset Chollet et al. (2015). The visualizations are generated for the top-1 prediction of each image. The number $k$ of top filters used for computing the inverse is an important factor in the interpretation obtained. If $k$ is too small (around 10% of the filters), the interpretations lose many important features; if $k$ is too high (close to 100%), the interpretations are too noisy. We have found that using 100% of the filters, with no pointwise thresholding based on the forward function value, produces an output that is close to that of the DeconvNet algorithm. Thus, by using the forward function and masking the computed inverse as explained above, we achieve better output than both the DeconvNet and the guided backpropagation algorithms.

For the experiments shown in Table 1, we use the top 50% of the filters to propagate the inverse of each layer. We also use a threshold value of 10.0 for the pointwise mask between the forward and backward values. We compare our method with the guided backpropagation algorithm, and observe that the resulting visualizations have less noise than those of guided backpropagation.

[Image grid. Rows: Input; FBI Interpretations (our method); Guided Backprop Visualizations. Columns: tabby cat, boxer, jaguar, fire hydrant.]

Table 1: Illustrative examples of the forward-backward inversion visualizations.

4 Conclusions

In this work, we introduce a novel Forward-Backward approach for visualizing the interpretations that correspond to a particular class. However, the choice of filters used for computing the interpretations becomes a hyper-parameter that must be tuned to ensure good interpretations. We also observe that filters which contribute to one particular class may have activations similar to those pertaining to some other class. Hence, decoupling the activations of these filters to choose which filters to use for computing the inverse, so that class discriminativeness is maintained, is of much interest to the authors.