Contains the jupyter notebooks to reproduce the results of the paper "Concept Saliency Maps to Visualize Relevant Features in Deep Generative Models" https://arxiv.org/pdf/1910.13140.pdf
Evaluating, explaining, and visualizing high-level concepts in generative models, such as variational autoencoders (VAEs), is challenging in part due to a lack of known prediction classes that are required to generate saliency maps in supervised learning. While saliency maps may help identify relevant features (e.g., pixels) in the input for classification tasks of deep neural networks, similar frameworks are understudied in unsupervised learning. Therefore, we introduce a new method of obtaining saliency maps for latent representations of known or novel high-level concepts, often called concept vectors in generative models. Concept scores, analogous to class scores in classification tasks, are defined as dot products between concept vectors and encoded input data, which can be readily used to compute the gradients. The resulting concept saliency maps are shown to highlight input features deemed important for high-level concepts. Our method is applied to the VAE's latent space of the CelebA dataset, in which known attributes such as "smiles" and "hats" are used to elucidate relevant facial features. Furthermore, our application to spatial transcriptomic (ST) data of a mouse olfactory bulb demonstrates the potential of latent representations of morphological layers and molecular features in advancing our understanding of complex biological systems. By extending the popular method of saliency maps to generative models, the proposed concept saliency maps help improve interpretability of latent variable models in deep learning. Code to reproduce and to implement concept saliency maps: https://github.com/lenbrocki/concept-saliency-maps
A rapidly increasing amount of unlabeled data, such as images and molecular data, has prompted a rise of deep generative models that can be trained without human supervision. By using a vast amount of unlabeled data, unsupervised learning models such as variational autoencoders (VAEs) [Kingma2014, Rezende2014] extract low-dimensional latent spaces that compactly encode high-dimensional input data and potentially reveal hidden relationships. While deep generative models are capable of generating new images [Salimans2015, kulkarni2015deep, mescheder2017adversarial] and enable manipulation of image-specific attributes [Larsen2016, higgins2017beta], it remains a grand challenge to achieve an intelligible understanding of their behavior [lipton2016mythos, kim2017interpretability, Adebayo2018]. We are interested in understanding and interpreting the latent representations of high-level concepts in generative models. Using VAEs, we propose to evaluate the importance of input features with respect to concept vectors and provide a new method of obtaining concept saliency maps. This essentially extends the popular method of saliency maps in classification tasks to unsupervised learning.
In predicting known classes using convolutional neural networks (CNNs), saliency maps have been introduced as a natural approach to make models interpretable [erhan2009visualizing, baehrens2010explain, Simonyan2014]. A saliency map visualizes the relative importance of the input pixels with respect to the classes that the neural network has been designed and trained on. It gives an insight into the behavior of the model that leads to a certain prediction. Saliency maps are obtained by calculating the gradient of the class score with respect to the input pixels, $\partial S_c / \partial x_i$, where the class score $S_c$ is usually taken to be the activation of the neuron in the output layer encoding the class of interest and $x_i$ denotes the input pixels. They attempt to answer the question: “Which pixels were decisive for this particular classification made by the model?” The originally proposed method of obtaining the gradients may give noisy saliency maps, which prompted a number of improvements and variants on the calculation and backpropagation of the gradients [Springenberg2015, Smilkov2017SmoothGradRN, sundararajan2017axiomatic].
This work aims to generalize the method of saliency maps to be applicable in generative models, particularly in deep latent variable models such as variational autoencoders (VAEs). VAEs are among the most popular approaches in unsupervised learning of complex distributions [Kingma2014, Rezende2014]. They have been demonstrated to be capable of generating complicated imagery such as handwritten digits [Kingma2014, Salimans2015], faces [kulkarni2015deep, mescheder2017adversarial], and others. Furthermore, it has been shown that VAEs learn a meaningful latent representation which allows the manipulation of attributes through traversal in latent space [Larsen2016, higgins2017beta].
To achieve applicability to VAEs and other generative models, which naturally lack known classes, we propose to use concept vectors $v_{\mathrm{con}}$ to compute concept scores $S_{\mathrm{con}}$, which can be understood as a replacement for the class score $S_c$ (Fig. 1). A concept vector is a latent representation of a high-level concept, which could be known attributes [Larsen2016, White2016], cluster memberships [Huang2016], or others. Such a concept vector has been demonstrated to be capable of manipulating an image by adding a certain attribute, as is demonstrated in Fig. 2 [Larsen2016, White2016]. Once a generative model is trained on some dataset $X$, a concept vector is readily obtained by averaging over the latent representations of samples containing an attribute of interest and subtracting the average of samples which do not. The concept score is then obtained by measuring the similarity of the latent representation of an input image $x$ and the concept vector corresponding to that attribute, $S_{\mathrm{con}} = E(x) \cdot v_{\mathrm{con}}$, where $E$ is the encoder of the VAE.
Our proposed method obtains input-specific saliency maps for generative models by weighing the input pixels by relevance with respect to this concept vector. In other words, we answer the question: “Which parts of a given image are particularly relevant for this concept?” The concept saliency map is based on the dot product between the latent representation of a certain image and the concept vector, although a different method to calculate the concept score may be useful in other domains. The concept score $S_{\mathrm{con}} = E(x) \cdot v_{\mathrm{con}}$, with $E(x)$ the latent representation of the image $x$ and $v_{\mathrm{con}}$ the concept vector, is motivated by the intuition that two vectors with a high dot product are more aligned, and thus more similar, than ones with a low dot product. Therefore, it can be seen as analogous to a neuron in the prediction vector of a classifier.
In this sense our proposed method is a generalization of saliency maps to generative models and is not limited to a classifier’s possible predictions. The user is free to utilize known attributes or construct novel concepts based on the data. The utility of the concept saliency maps is demonstrated through application to a large-scale face image database, CelebA [liu2015faceattributes], and to spatial transcriptomic (ST) data of a mouse olfactory bulb [staahl2016visualization].
Generative models, such as VAEs, often operate on the assumption that there is a meaningful low-dimensional latent space, meaningful in the sense that it encodes high-level concepts. Identifying, estimating, and disentangling such a latent space has become important due to its potential to explain how unsupervised learning works [Reed2015, Larsen2016, White2016, kim2017interpretability]. Sampling from latent space may allow us to reconstruct a meaningful output and to approximate visual analogies [Reed2015]. In [Larsen2016], a latent representation of a certain attribute, such as a smile, is obtained and added to an arbitrary image; these representations were referred to as visual attribute vectors (e.g., Fig. 2). The authors noticed a strong correlation among certain attributes such as “heavy makeup” and “wearing lipstick”. Reference [White2016] attempted to decouple correlated attributes when a correlation stems from a sampling bias. Note that in [White2016], the dot product of a latent representation of an image with a concept vector was used to build an effective binary classifier, which motivated our definition of the concept score $S_{\mathrm{con}}$.
High-level concepts in latent space can be further discovered in a semi-supervised or unsupervised manner. Reference [Hong2015] uses a limited reference database of images with attributes to cluster and annotate a set of unlabeled input images. By coupling a CNN with a set of data-dependent binary attributes, [Huang2016] seeks to automatically discover image attributes. Internal states of a deep neural network may be interpreted in terms of high-level concepts, which [kim2017interpretability] called concept activation vectors. This enables one to determine how important a certain internal concept, such as stripes, is for the prediction of a class, for example zebra. We are interested in both known and unknown attributes which may be learned from the input data. The novelty of our approach is twofold: first, the applicability to latent variable models leveraging a low-dimensional latent space, and second, the usage of the dot products obtained from concept vectors to create saliency maps.
In supervised learning there exist many methods to attribute the prediction of a network to its input features. Saliency maps are defined as the gradient of the class score with respect to the input pixels [erhan2009visualizing, baehrens2010explain, Simonyan2014]. Despite early successes, direct calculation of the gradients often leads to noisy saliency maps without clearly focused regions. Several proposals have been made to improve them, such as Guided Backpropagation (GuidedBP) [Springenberg2015], Rectified Gradient (RectGrad) [Kim2019] and SmoothGrad [Smilkov2017SmoothGradRN]. The first two methods modify the backpropagation of the gradient through the Rectified Linear Activation Unit (ReLU); for a detailed description see Sec. III-B. SmoothGrad seeks to denoise the saliency map of a given image by sampling similar images through added noise and averaging over the saliency maps of the sampled images. Integrated Gradients [sundararajan2017axiomatic] computes interpolations between a given image and a baseline image and integrates the saliency maps of these interpolated images, which essentially alleviates the sensitivity to the saturation of the input pixels. Other methods such as Layer-wise Relevance Propagation (LRP) [bach2015pixel, samek2016evaluating] and DeepLift [shrikumar2017learning] are relevance-score-based techniques, which means that they propagate the relevance score (which is equal to the class score $S_c$ in the final layer) back via the activations of the previous layers without using gradients.
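The two sampling-based variants described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the cited works: `score` and `score_grad` are hypothetical stand-ins for a class score and its gradient (a real network would obtain the gradient by backpropagation), and the noise level, sample count, all-zero baseline and number of integration steps are assumed values.

```python
import numpy as np

def score(x):
    # Hypothetical stand-in for a class score: a smooth scalar function of the input.
    return float(np.sum(np.tanh(x)))

def score_grad(x):
    # Analytic gradient of `score`; a real model would use backpropagation here.
    return 1.0 - np.tanh(x) ** 2

def smoothgrad(grad_fn, x, sigma=0.1, n_samples=50, seed=0):
    # Average the gradient over noisy copies of the input.
    rng = np.random.default_rng(seed)
    grads = [grad_fn(x + rng.normal(0.0, sigma, size=x.shape))
             for _ in range(n_samples)]
    return np.mean(grads, axis=0)

def integrated_gradients(grad_fn, x, baseline=None, n_steps=200):
    # Integrate the gradient along the straight path from the baseline to x
    # (midpoint Riemann sum), then scale by the input difference.
    if baseline is None:
        baseline = np.zeros_like(x)  # assumed all-zero baseline
    alphas = (np.arange(n_steps) + 0.5) / n_steps
    avg_grad = np.mean([grad_fn(baseline + a * (x - baseline)) for a in alphas],
                       axis=0)
    return (x - baseline) * avg_grad

x = np.array([0.5, -1.0, 2.0])
sg = smoothgrad(score_grad, x)
ig = integrated_gradients(score_grad, x)
# For this toy score the attributions sum to score(x) - score(baseline).
print(sg, ig, ig.sum(), score(x))
```

Both functions only need a gradient callable, so the same code would apply unchanged if the gradient of a concept score were passed in instead of a class score gradient.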
Our work essentially extends this family of methods to embrace unsupervised learning in which the class score is no longer available. We propose to extract the latent layer of VAE and related generative models and feed the dot products as the class score into any of the aforementioned methods.
In the context of supervised classifiers, conventional approaches to obtaining saliency maps calculate the gradient of the output neuron encoding the class of interest, i.e. the class score $S_c$, with respect to the input pixels $x_i$. The gradient tells us which inputs need to be changed the least to have the biggest influence on the class score, essentially identifying the most significant inputs. To find the input pixels which are most significant in maximizing the class score, one can simply clip all negative gradients. In unsupervised models we do not have a prediction vector to choose our class score from, but instead a low-dimensional latent space which reflects the input data, and it is a priori not clear how to construct a saliency map from it.
Recent developments of variational Bayesian approaches resulted in the variational autoencoder (VAE) [Kingma2014, Rezende2014] and related methods that estimate meaningful latent spaces. Briefly, the general architecture of a VAE consists of the encoder and the decoder, both of which consist of multiple layers of neural networks. The encoder, which typically performs drastic dimension reduction, compresses the observed data (the input) into the latent variables, while the decoder reconstructs the observed data from the latent variables (Fig. 1). The input data $X$ are assumed to be generated by some latent variables $z$ with a prior distribution $p_\theta(z)$. Then, $X$ is realized from a conditional distribution $p_\theta(X \mid z)$, where $\theta$ and $z$ are unknown. We attempt to do inference in this model, that is, given $X$, find $z$ by calculating the posterior density $p_\theta(z \mid X) = p_\theta(X \mid z)\, p_\theta(z) / p_\theta(X)$, which is intractable since evaluating $p_\theta(X) = \int p_\theta(X \mid z)\, p_\theta(z)\, dz$ requires integrating over all latent variables $z$.
VAE circumvents this challenge by fitting an approximate inference model $q_\phi(z \mid X)$, which is an approximation to the true posterior distribution $p_\theta(z \mid X)$. An iterative learning process jointly learns the parameters $\phi$ and $\theta$. The probabilistic encoder and decoder are constructed of neural networks with multiple layers. In images and other spatial data, convolutional neural networks are often used to take advantage of spatiality [Fukushima1980, Lecun1998]. To train the network, weights are adjusted through gradient descent to minimize a pre-specified loss function.
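As a concrete illustration of such a loss, the sketch below computes a common VAE objective (the negative evidence lower bound) for a diagonal Gaussian encoder and a standard normal prior. The text above does not state which loss was used, so the squared-error reconstruction term and all function names here are assumptions made for illustration.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    # Sample z = mu + sigma * eps so that the sample stays differentiable
    # with respect to mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over
    # the latent dimensions.
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def negative_elbo(x, x_recon, mu, log_var):
    # Assumed reconstruction term: squared error between input and reconstruction.
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return recon + kl_to_standard_normal(mu, log_var)

rng = np.random.default_rng(0)
mu = np.array([[1.0, 1.0]])
log_var = np.zeros((1, 2))
z = reparameterize(mu, log_var, rng)        # one latent sample per input
kl = kl_to_standard_normal(mu, log_var)     # 0.5 * (1 + 1 - 1 - 0) * 2 = 1.0
print(z.shape, kl)
```

The KL term vanishes exactly when the encoder outputs the prior (zero mean, unit variance), which is what pulls the latent space towards a compact, well-behaved shape during training.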
The latent space is thought to encode high-level concepts, such as facial attributes or morphological structures. Many unsupervised learning techniques aim to disentangle this latent space, extract useful latent representations of high-level concepts, and present how they are manifested in the observed (input) data. Larsen [Larsen2016] and others have shown that we can find directions, or concept vectors, in latent space, which encode a certain attribute. Instead of a class score, for generative models we propose to use the dot product of the latent representation $z$ and a certain concept vector $v_{\mathrm{con}}$. We refer to this dot product, $S_{\mathrm{con}} = z \cdot v_{\mathrm{con}}$, as the concept score.
Our method to obtain saliency maps from concept vectors is summarized in the following algorithm.
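Algorithm 1. Concept Saliency Maps
1. Train a VAE on the observed data $X$.
2. Obtain the set of latent representations $Z = E(X)$ by applying the encoder $E$ to $X$.
3. Identify the data attribute of interest (e.g. ‘smiling’) and denote the latent representations of samples with the attribute by $z^{+}$ and of those without it by $z^{-}$.
4. Obtain the concept vector as follows, where $N^{+}$ and $N^{-}$ refer to the number of samples with and without the attribute: $v_{\mathrm{con}} = \frac{1}{N^{+}} \sum_{i=1}^{N^{+}} z^{+}_i - \frac{1}{N^{-}} \sum_{j=1}^{N^{-}} z^{-}_j$.
5. Obtain the concept score by calculating the dot product between the latent representation $z = E(x)$ of an input $x$ and the concept vector: $S_{\mathrm{con}} = z \cdot v_{\mathrm{con}}$.
6. Calculate the gradient of the concept score with respect to the input pixels, $\partial S_{\mathrm{con}} / \partial x_i$, to generate the saliency map.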
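Steps 2 to 6 can be sketched on toy data as follows. This is only an illustration under strong assumptions: the trained VAE encoder is replaced by a fixed random linear map `W` (hypothetical), for which the gradient of the concept score is available in closed form; with a real encoder, the gradient in the last step would come from the deep learning framework's automatic differentiation. The data and the attribute (a brightened first pixel) are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_latent = 16, 4

# Hypothetical stand-in for a trained encoder E: a fixed linear map z = W x.
W = rng.normal(size=(n_latent, n_pixels))

def encode(x):
    return x @ W.T

# Synthetic data: samples "with the attribute" have a brightened first pixel.
X = rng.uniform(size=(200, n_pixels))
has_attr = rng.uniform(size=200) < 0.5
X[has_attr, 0] += 2.0

# Step 2: latent representations of all samples.
Z = encode(X)

# Steps 3-4: mean latent vector with the attribute minus mean without it.
v_con = Z[has_attr].mean(axis=0) - Z[~has_attr].mean(axis=0)

# Step 5: concept score as a dot product in latent space.
def concept_score(x):
    return encode(x) @ v_con

# Step 6: gradient of the concept score with respect to the input.
# For the linear stand-in, d(v . W x)/dx = W^T v, independent of x.
def concept_saliency(x):
    return W.T @ v_con

scores = concept_score(X)
saliency = concept_saliency(X[0])
print(scores[has_attr].mean(), scores[~has_attr].mean(), saliency.shape)
```

By construction, the mean concept score of the samples with the attribute exceeds that of the samples without it by exactly the squared norm of the concept vector, which is a quick sanity check when adapting this sketch to a real encoder.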
When the computationally intensive Step 1 of training VAE is previously completed, the trained model can be used to compute the concept scores of interest. A concept can be defined by annotations of the dataset or be user defined. In the latter case the user can choose a subset of images that represent a certain concept and obtain a concept vector by averaging as is explained in Step 4. Another way is to cluster the data, possibly in latent space, and use cluster membership as concept or, as is exemplified in Sec. IV-B, a concept can be formed by highly correlated samples. It is data dependent which method is most suitable.
While we have focused on the dot product in Step 5, it is possible to adopt modifications and improvements to how the concept score is derived. Depending on data types and analysis goals, one may explore cosine similarity coefficients, norms and others, and assess how well the input data with and without certain attributes are separated. The last step of computing the gradients, which is an active area of research, is explained in detail below.
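A minimal sketch of such a comparison is given below. The latent vectors are synthetic, and the separation measure (difference of group means in units of the pooled standard deviation) is just one simple, assumed choice; it is not taken from the paper.

```python
import numpy as np

def dot_score(Z, v):
    # Concept score as a plain dot product.
    return Z @ v

def cosine_score(Z, v):
    # Alternative: cosine similarity, which ignores the length of each latent vector.
    norms = np.linalg.norm(Z, axis=1) * np.linalg.norm(v)
    return (Z @ v) / (norms + 1e-12)

def separation(scores, labels):
    # Assumed separation measure: difference of group means divided by
    # the pooled standard deviation.
    pos, neg = scores[labels], scores[~labels]
    pooled = np.sqrt(0.5 * (pos.var() + neg.var()))
    return (pos.mean() - neg.mean()) / pooled

# Synthetic latent vectors: the attribute shifts the first latent dimension.
rng = np.random.default_rng(1)
labels = rng.uniform(size=300) < 0.5
Z = rng.normal(size=(300, 8))
Z[labels, 0] += 1.5
v = Z[labels].mean(axis=0) - Z[~labels].mean(axis=0)

sep_dot = separation(dot_score(Z, v), labels)
sep_cos = separation(cosine_score(Z, v), labels)
print(sep_dot, sep_cos)
```

Whichever score separates the two groups better on held-out samples would be the natural candidate to differentiate when computing the saliency map.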
Our method can readily be used to explore and visualize the latent space in pre-trained VAE models, as long as one is able to extract the latent representation of the input samples.
There are different methods of calculating the gradients which are used to obtain saliency maps. They differ in how they handle the backpropagation of the gradient through the rectified linear activation unit (ReLU), which is used throughout our neural networks and is defined as $\mathrm{ReLU}(a) = \max(0, a)$. Suppose we have an $L$-layer densely connected network and denote the input of neuron $i$ in layer $l$ as $a_i^{(l)}$ and the weight between neuron $i$ in layer $l$ and neuron $j$ in layer $l+1$ as $w_{ij}^{(l)}$, so that $a_j^{(l+1)} = \sum_i w_{ij}^{(l)}\, \mathrm{ReLU}(a_i^{(l)})$. By larger $l$ we denote “deeper” layers, such that $l = 1$ is the input layer and $l = L$ the output layer. The output layer in a classification task would contain the class score $S_c$, whereas in the generative models, we use the proposed concept score $S_{\mathrm{con}}$. Writing $S$ for either score, the gradient is defined as $R_i^{(l)} = \partial S / \partial a_i^{(l)}$ and tells us how the class score or the concept score changes with respect to $a_i^{(l)}$, where $a_i^{(1)}$ could for example be the values of the input pixels. Using the chain rule one finds the following relation for the backpropagation of gradients:
$$R_i^{(l)} = \mathbb{I}\big(a_i^{(l)} > 0\big)\, G_i^{(l)}, \qquad G_i^{(l)} = \sum_j w_{ij}^{(l)} R_j^{(l+1)},$$
where $\mathbb{I}(\cdot)$ is the indicator function and $G_i^{(l)}$ is the gradient arriving from the higher layer. The activation maps created using this backpropagation are known to be very noisy and therefore improved methods have been proposed. Guided Backpropagation [Springenberg2015] further demands that the gradients of the higher layer have to be positive in order to be propagated:
$$R_i^{(l)} = \mathbb{I}\big(a_i^{(l)} > 0\big)\, \mathbb{I}\big(G_i^{(l)} > 0\big)\, G_i^{(l)},$$
and this additional guiding of the gradient leads to sharper activation maps. Recently, Rectified Gradient [Kim2019] has been proposed as yet another method to calculate gradients. It introduces an external parameter $\tau$ which acts as a threshold for gradients to be backpropagated:
$$R_i^{(l)} = \mathbb{I}\big(\mathrm{ReLU}(a_i^{(l)}) \cdot G_i^{(l)} > \tau\big)\, G_i^{(l)}.$$
Our proposed framework works with any of these methods of computing the gradients, since it only replaces $S_c$ with $S_{\mathrm{con}}$, leaving everything else untouched. In the following applications, we present both Guided Backpropagation and Rectified Gradient, which have their own advantages and disadvantages.
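The difference between the three backpropagation rules can be made concrete on a toy network with a single hidden ReLU layer and a scalar score $S = w_2 \cdot \mathrm{ReLU}(W_1 x)$. The weights, input and threshold below are hypothetical values chosen so that the three rules give different results; the sketch only illustrates the masking step and is not the implementation used for the experiments.

```python
import numpy as np

def relu_backward(a, g, mode="plain", tau=0.0):
    """Pass the incoming gradient g through a ReLU with pre-activation a."""
    if mode == "plain":
        mask = a > 0                          # ordinary backpropagation
    elif mode == "guided":
        mask = (a > 0) & (g > 0)              # also require a positive gradient
    elif mode == "rectgrad":
        mask = np.maximum(a, 0.0) * g > tau   # activation * gradient above threshold
    else:
        raise ValueError(f"unknown mode: {mode}")
    return np.where(mask, g, 0.0)

def saliency(x, W1, w2, mode="plain", tau=0.0):
    a = W1 @ x                                # hidden pre-activations
    g = w2                                    # dS/dh for S = w2 . ReLU(a)
    r = relu_backward(a, g, mode, tau)        # gradient with respect to a
    return W1.T @ r                           # gradient with respect to the input x

# Hypothetical weights and input.
W1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
w2 = np.array([2.0, 3.0, 5.0, -1.0])
x = np.array([1.0, 2.0])

g_plain = saliency(x, W1, w2, "plain")             # [1., 2.]
g_guided = saliency(x, W1, w2, "guided")           # [2., 3.]
g_rect = saliency(x, W1, w2, "rectgrad", tau=3.0)  # [0., 3.]
print(g_plain, g_guided, g_rect)
```

In this example the guided rule drops the hidden unit with a negative incoming gradient, and the thresholded rule additionally drops the unit whose product of activation and gradient (here 2) stays below the threshold of 3. Swapping the class score for a concept score only changes what `w2` and the upstream gradient represent; the masking logic is unaffected.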
We apply the proposed method of concept saliency maps to two datasets: a large-scale face database with attributes, CelebA, [liu2015faceattributes] and a spatial transcriptomic (ST) dataset of a mouse olfactory bulb [staahl2016visualization]. In the well-known CelebA, there exist annotations of facial features which can be used to visualize and evaluate. We demonstrate how known attributes such as smiles and glasses can be used to generate concept saliency maps. Spatial transcriptomics (ST) is a novel technique to measure gene expression profiles of a sample while maintaining spatial information. We use this spatial map of gene expression values located in a grid to demonstrate how to create saliency maps with limited prior knowledge about the input samples.
CelebA is a large database of face images with known attributes [liu2015faceattributes], consisting of 202593 images and 40 binary annotations of facial attributes for each image. The images have been aligned, scaled and cropped to a common pixel size using the landmark annotations that come with the dataset [liu2015faceattributes]. We used a VAE with convolutional layers, ReLU activation and batch normalization; down- and upsampling is done using strides, and the dimension of the latent layer is 400. For the full architecture please refer to Tab. I. The network has been trained for 50 epochs using the Adam optimizer [kingma2014adam] with the learning rate set to 0.001.
|Encoder||Output size||Decoder||Output size|
|Input image|| ||400 fully-connected (latent layer)||400|
|64 conv., stride 2, BN, ReLU|| ||fully-connected, BN, ReLU||16384|
|128 conv., stride 2, BN, ReLU|| ||512 conv., stride 2, BN, ReLU|| |
|256 conv., stride 2, BN, ReLU|| ||256 conv., stride 2, BN, ReLU|| |
|512 conv., stride 2, BN, ReLU|| ||128 conv., stride 2, BN, ReLU|| |
|1024 conv., stride 2, BN, ReLU|| ||64 conv., stride 2, BN, ReLU|| |
|512 fully-connected, BN, ReLU||512||3 conv., stride 2, BN, ReLU|| |
|400 fully-connected (latent layer)||400|| || |
The saliency maps shown in Fig. 3 were obtained by employing the algorithm in Section III-A, using 2000 images per attribute for averaging and the CelebA annotations for facial attributes such as “smiling” or “wearing hat”. The saliency maps often match our intuition: they clearly highlight the eyebrows, the hat and the blue square. For the smiling concept, GuidedBP focuses on the mouth and chin region, whereas RectGrad focuses on teeth and cheeks. RectGrad produces visibly cleaner maps than GuidedBP, which is due to the additional threshold. This threshold prevents RectGrad from performing a partial image recovery, which is the case for GuidedBP, as has recently been proven in [nie2018theoretical].
However, note that not all attributes are created equal. First, there are continuous attributes (e.g., “rosy cheeks” or “oval face”) that are very challenging even for humans to agree upon. Second, some attributes (e.g., “attractive” or “young”) are highly subjective and do not necessarily have common visual features. Third, there are correlated attributes (e.g., “heavy makeup” and “wearing lipstick”) whose underlying latent representations are intertwined. We have therefore focused on attributes that are seemingly separated and well-represented. Fourth, we found that attributes that are inherently dark are often not visualized in saliency maps. For example, the eyeglasses are problematic because, although the correct region of the face is highlighted, the glasses themselves remain dark (Fig. 3). This problem is connected to the calculation of the gradients, which fails for very dark regions of the images. Conversely, bright regions tend to be overrepresented in the saliency maps, as can be seen in the “wearing hat” example, where the teeth and parts of the racket are also highlighted.
To further demonstrate that this effect is due to the calculation of the gradients and not connected to the calculation of the concept score, we also inserted black squares into the pictures. We then calculated the dot products of the concept vectors for the blue and the black square with the latent vectors of 20000 images in each case, where half of the images contained the respective square and the remaining ones did not. The histograms show that in both cases the dot product is significantly higher if a square is present, yet the saliency map fails to highlight the black square as the dominant feature (Fig. 4). In both cases the face almost vanishes, but in the black square case only the edges are partly highlighted.
We are interested in understanding how high-level concepts, such as morphological structures, are manifested in spatial gene expression, which is measured by fixing and staining a sliced tissue sample. In particular, the ST data of a mouse olfactory bulb contain genome-wide gene expression from a sliced tissue section of a mouse olfactory bulb [staahl2016visualization]. The tissue is positioned on a microchip with 267 spots in a grid, each of which measures expression activities of up to 16573 genes within that locality. Each of the 16573 genes is treated as a sample with 267 features, which are the counts of RNAs at the different spots in the tissue.
To provide a context for gene expression data in a spatially resolved tissue, a microscopic image of the tissue obtained by Hematoxylin-and-eosin staining has been superimposed with the normalized counts for the gene Penk (Fig. 5). Clearly, Penk – which is a proenkephalin gene playing a role in signaling receptor binding, response to stimulus, and cell projection – is most highly expressed in the inner part of the tissue section known as the granular cell layer. We are interested in investigating the spatial expression of genes with known and unknown relations to morphological structures.
To prepare the data for VAE, the gene counts have been arranged in matrices and normalized for each gene separately (see middle row in Fig. 6). The VAE architecture used for this ST dataset is similar to the one used in CelebA, but with fewer layers and filters; for a detailed description see Tab. II.
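The data preparation described above can be sketched as follows. The exact normalization and grid layout are not specified here, so the sketch assumes per-gene min-max scaling to $[0, 1]$ and integer grid coordinates for each spot; the function name, the argument names and the tiny example are hypothetical.

```python
import numpy as np

def genes_to_images(counts, rows, cols, grid_shape):
    """Turn a genes-by-spots count matrix into one small image per gene.

    counts     : array of shape (n_genes, n_spots)
    rows, cols : integer grid coordinates of each spot (length n_spots)
    grid_shape : (height, width) of the output matrices
    """
    counts = np.asarray(counts, dtype=float)
    # Assumed normalization: min-max scaling for each gene separately.
    lo = counts.min(axis=1, keepdims=True)
    hi = counts.max(axis=1, keepdims=True)
    scaled = (counts - lo) / np.maximum(hi - lo, 1e-12)
    # Grid positions without a measured spot stay at zero.
    images = np.zeros((counts.shape[0],) + tuple(grid_shape))
    images[:, rows, cols] = scaled
    return images

# Two genes measured at three spots on a 2 x 2 grid.
counts = [[0, 5, 10], [2, 2, 4]]
rows = np.array([0, 0, 1])
cols = np.array([0, 1, 1])
images = genes_to_images(counts, rows, cols, (2, 2))
print(images[0])   # [[0.  0.5], [0.  1. ]]
```

Normalizing each gene separately removes differences in overall expression level, so that the VAE sees only the spatial pattern of each gene.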
|Encoder||Output size||Decoder||Output size|
|Input image|| ||20 fully-connected (latent layer)||20|
|16 conv., stride 2, BN, ReLU|| ||fully-connected, BN, ReLU||1024|
|32 conv., stride 2, BN, ReLU|| ||32 conv., stride 2, BN, ReLU|| |
|64 conv., stride 2, BN, ReLU|| ||16 conv., stride 2, BN, ReLU|| |
|256 fully-connected, BN, ReLU||256||1 conv., stride 2, BN, ReLU|| |
|20 fully-connected (latent layer)||20|| || |
In contrast to CelebA, these ST data do not come with conventional attributes that could be used to form concept vectors. Nonetheless, the advantage and motivation of ST is that gene expression profiles at different locations in a tissue section are crucial for complex molecular systems. How are genes expressed differentially across a tissue section? Reference [staahl2016visualization] has shown that some genes, such as Penk, Doc2g and Kctd12, present clear spatial organization. These genes are specifically over-expressed in certain regions: Penk in the granular cell layer (GCL), Doc2g in the glomerular layer (GL) and Kctd12 in the outer layer (see Fig. 2(b) in [staahl2016visualization]). Therefore, we use these known genes to identify concept vectors for 3 morphological layers. In particular, we calculated Pearson correlations in the latent space between these three genes and all other genes. Concept vectors related to these morphological layers were approximated by averaging over the 50 genes with the highest correlation statistics (Fig. 6).
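The correlation-based construction of a concept vector can be sketched as below. The latent matrix is synthetic, with a few planted genes that closely follow the anchor gene; the function name and arguments are hypothetical, and excluding the anchor gene itself from the average is an assumption.

```python
import numpy as np

def correlation_concept_vector(Z, anchor_idx, top_k=50):
    """Average the latent vectors of the genes most correlated with an anchor gene.

    Z          : latent representations, shape (n_genes, n_latent)
    anchor_idx : row index of the known gene (e.g. the row holding Penk)
    top_k      : number of most correlated genes to average
    """
    # Pearson correlation between each gene's latent vector and the anchor's.
    Zc = Z - Z.mean(axis=1, keepdims=True)
    Zn = Zc / np.linalg.norm(Zc, axis=1, keepdims=True)
    corr = Zn @ Zn[anchor_idx]
    corr[anchor_idx] = -np.inf            # assumption: leave out the anchor itself
    top = np.argsort(corr)[-top_k:]       # indices of the top_k correlated genes
    return Z[top].mean(axis=0), top

# Synthetic example: genes 1..5 are noisy copies of the anchor gene 0.
rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 20))
Z[1:6] = Z[0] + 0.01 * rng.normal(size=(5, 20))

v_layer, top_genes = correlation_concept_vector(Z, anchor_idx=0, top_k=5)
print(sorted(top_genes.tolist()))
```

Running this once per known gene would give one concept vector per morphological layer, each of which can then be used in the dot product with the latent representation of any other gene.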
We applied the proposed methods using these concept vectors to available genes in the ST data. In Fig. 6 (bottom row) the saliency maps for the genes Itm2b, Syt7 and Apoe with respect to the concept vectors in the top row are shown. It appears that the saliency maps indeed highlight the spots of the gene count matrices which match with the corresponding morphological layers. For instance, the saliency map for Itm2b focuses on the granular cell layer (inner parts) and lets the glomerular and olfactory nerve layer (outer layers) almost vanish.
Unsupervised deep learning is becoming more critical as we are accumulating a greater amount of unlabeled data. One of the main goals of unsupervised learning is to extract meaningful and useful latent representations of known and novel concepts. To this end, we have developed a method of calculating concept scores and obtaining concept saliency maps which help identify important regions or features in input data. Briefly, the concept score is defined as the dot product between the latent representation of an input and a concept vector. This proposed concept score can be understood to be analogous to the activation of a neuron in the prediction vector of a classifier, and our method therefore generalizes the technique for obtaining saliency maps to generative models.
The effectiveness of our method is demonstrated by utilizing the CelebA dataset. When using known attributes, the concept scores are shown to effectively distinguish samples with and without a certain attribute and the concept saliency maps highlight relevant facial features. However, we have observed and investigated how very dark regions in an image tend not to be highlighted, even though they are crucial to the concept. It has been demonstrated that this is an effect due to the calculation of the gradients and not connected to the proposed concept score. Therefore, it would be interesting to further investigate how darkness impacts saliency maps and develop an improved gradient method or normalization technique that would not be biased.
We have further presented a novel application of the proposed methods to spatial transcriptomics (ST) of a mouse olfactory bulb [staahl2016visualization]. Concept vectors for different morphological layers have been formed from genes which are highly correlated with three well-known genes whose spatial expression profiles are known to coincide with certain morphological layers. Finally, the concept saliency maps highlight the regions in the spatial gene expression which most strongly overlap with the morphological layers.
We plan to further develop the proposed methods using ST datasets in order to better understand how deep generative models could be used to disentangle the latent space of spatial gene expression. It would be instructive to explore functional annotations and other high-level concepts that are not directly related to morphology. These annotations may be used to reveal spatial characteristics for certain functions and saliency maps visualize the contribution of genes to these concepts. Furthermore, it would be interesting to explore other ST datasets, many of them related to cancer and other medical applications. Lastly, it will become more important to combine ST with RNA-sequencing and molecular experiments to validate the computational results.
Finally, saliency maps, which have primarily been used for interpretation of machine learning models, have the potential to become more useful and popular as an essential tool for exploratory data analysis. High-level concepts hidden in latent space of VAE and other generative models may be discovered through dimension reduction and clustering and saliency maps can be used to reveal the significant features. In particular, it would be important to extend this set of unsupervised deep learning approaches to non-image data for molecular and biomedical applications.