In this paper we address the problem of visual attribution, which we define as detecting and visualising evidence of a particular category in an image. Pinpointing all evidence of a class is important for a variety of tasks such as weakly supervised localisation or segmentation of structures [43, 45, 67], and for better understanding disease effects and physiological or pathological processes in medical images [69, 18, 12, 19, 13, 28, 56, 31, 32, 65].
Currently, the most frequently used approach to address the visual attribution problem is training a neural network classifier to predict the categories of a set of images and then following one of two strategies: analysing the gradients of the prediction with respect to an input image [28, 5, 56] or analysing the activations of the feature maps for the image [67, 43, 45] to determine which part of the image was responsible for making the associated prediction.
Visual attribution based directly on neural network classifiers may, under some circumstances, produce undesired results. It is known that such classifiers base their decisions on certain salient regions rather than the whole object of interest. It was recently shown that during training neural networks minimise the mutual information between the input and intermediate representations, thereby compressing the input features. These findings suggest that a classifier may ignore features with low discriminative power if stronger features with redundant information about the target are available. In other words, neural network training may work in opposition to the goal of visual attribution. As a consequence, if there is evidence for a class at multiple locations in the image (such as multiple lesions in medical images), some locations may not influence the classification result and may thus not be detected. We demonstrate this effect on a synthetic dataset in our experiments.
It would be highly desirable if instead we could visualise evidence of a particular category in a way that captures all category-specific effects in an image. Our main contribution is a novel approach towards solving the visual attribution problem which takes a first step in this direction. In contrast to the majority of recent techniques, the method does not rely on a classifier but rather aims at finding a map that, when added to an input image of one category, will make it indistinguishable from images from a baseline category. To this end we propose a generative model in which the additive map is learned as a function of the images. The method is based on Wasserstein generative adversarial networks (WGAN) , which have the desirable property that they minimise an approximation of the Wasserstein distance between the distributions of the generated images and the real ones.
We note that our method does not tackle the classification problem but rather assumes that the category labels of the test images have already been determined (e.g. using a separately trained classifier or by an expert). Furthermore, the method requires a baseline category, which is not the case for many benchmark recognition datasets in vision, but is in fact the case for many practical detection applications, especially in medical image analysis.
We demonstrate the method on synthetic 2D data and on large 3D brain MR data, where we aim to predict subject-specific disease effect maps for Alzheimer’s disease (AD).
1.1 Medical motivation
Identifying disease effects at the subject-specific level is of great interest for various medical applications. In clinically oriented research, identifying subject-specific disease effects would be useful for stratification of the patient population and for helping to disentangle diseases such as AD and schizophrenia, which are believed to be composed of multiple sub-types rather than being a single disease. Furthermore, for clinicians, subject-specific maps could be helpful in assessing disease status and grading.
In this paper, we chose to study the disease effects of AD with respect to mild cognitive impairment (MCI), which is characterised by a slight decline in cognitive abilities. Patients with MCI are at increased risk of developing AD, but do not always do so. We evaluate our method on one of the largest publicly available neuroimaging datasets, acquired by the Alzheimer’s Disease Neuroimaging Initiative (ADNI). We used the MCI population as the baseline category and the AD population as the category of interest. Our choice to use MCI as the baseline is motivated by the fact that the ADNI dataset contains a number of MCI subjects who converted to AD and have imaging data at both stages of the disease. This allowed us to evaluate the predicted disease effects against real observed effects, defined as the differences between images at the different stages. Note that even though using normal controls as the baseline is feasible, it would have been much harder to assess the proposed method due to the small number of control-to-AD converters in the ADNI dataset.
2 Related work
2.1 Visual attribution
A commonly used approach for weakly supervised localisation or segmentation is to analyse the final feature map of a neural network classifier [43, 45]. The Class Activation Mapping (CAM) method builds on those techniques by reducing the feature maps of the second-to-last layer with a global average pooling layer, followed by a dense prediction layer. This allows class-specific activation maps to be created as a linear combination of the feature maps, weighted by the corresponding weights of the last layer.
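The CAM computation described above can be sketched in a few lines of numpy; the array shapes and names below are illustrative, not the original implementation:

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Class activation map: a linear combination of the final feature
    maps, weighted by the dense-layer weights of the class of interest.

    feature_maps: array of shape (K, H, W).
    class_weights: array of shape (K,).
    """
    feature_maps = np.asarray(feature_maps, dtype=float)
    class_weights = np.asarray(class_weights, dtype=float)
    # Weighted sum over the channel axis.
    return np.tensordot(class_weights, feature_maps, axes=([0], [0]))

# Two 2x2 feature maps; the class weights blend them into one map.
fmaps = np.array([[[1.0, 0.0], [0.0, 0.0]],
                  [[0.0, 2.0], [0.0, 0.0]]])
cam = class_activation_map(fmaps, np.array([0.5, 1.0]))
# cam == [[0.5, 2.0], [0.0, 0.0]]
```

Note that the resolution of the resulting map equals that of the final feature maps, which is the limitation discussed below.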
A large number of works on medical images build on the CAM technique, for example the work of Feng et al. on pulmonary nodule localisation in CT and the work of Ge et al. on skin disease recognition. It is important to note that CAM is restricted in the resolution of its visual attributions by the resolution of the last feature map. Consequently, post-processing of the predictions is often required [12, 13, 45]. In contrast, our proposed method can produce visual attributions at the resolution of the original input images.
Another class of techniques creates saliency maps by backpropagating from the prediction back to the input image. Examples include Guided Backprop, Excitation Backprop, Integrated Gradients, and meaningful perturbations.
Similar techniques have been applied in the domain of medical images. Jamaludin et al. use a backprop-based saliency technique to pinpoint lumbar degradations, and Baumgartner et al. [5, 6] use a variant of this approach to localise fetal anatomy. Gao and Noble apply a similar approach to localise the fetal heart.
2.2 Statistical disease models
Statistical analysis of medical images for identifying disease effects has been an instrumental tool for various diseases and disorders [58, 48, 9] as well as other non-disease related factors [17, 40, 29, 61, 44, 57, 3, 33, 63, 16, 41, 47, 14].
Recently, constructing subject-specific maps has received attention. Maumet et al. took a one-versus-all group analysis approach [39, 38], while Konukoglu and Glocker extracted subject-specific maps with predictive models and Markov Random Field restoration [31, 32].
The common drawback of the previous approaches is the need for registration. In order to compute disease effect maps, images of different subjects need to be non-rigidly aligned to a common template on which statistical analysis can be performed. The non-rigid registration process introduces additional uncertainty into the subject-specific maps. Our work addresses this shortcoming and generates subject-specific disease effect maps without requiring registration.
2.3 Image generation using GANs
Generative adversarial networks conditioned on an input image have been used in diverse applications such as video frame prediction, image super-resolution, image translation across domains using paired and unpaired images, and pixel-level domain adaptation [7, 51].
In the context of medical images, GANs have been applied to super-resolution of retinal fundus images, semi-supervised cardiac segmentation, synthesising computed tomography images from MR images [42, 62], and intraoperative motion modelling. Although some of the above models use 3D data, the examined volumes are usually relatively small, or the networks operate in a patch-wise fashion. It is important to note that in the case of brain MR images of Alzheimer’s disease patients, the diagnostic information is only visible at a high resolution and cannot be determined by considering small local patches alone. In this paper, we therefore tackle the challenge of processing large 3D volumes directly.
The main contributions of this work are the following:
-  We demonstrate a limitation in current neural network based visual attribution methods using synthetic data.
-  We propose a novel visual attribution technique that can detect class-specific regions more completely and at a high resolution.
-  To our knowledge, this is the first application of generative adversarial networks to large structural 3D data.
-  An implementation of the proposed method is publicly available at https://github.com/baumgach/vagan-code.
3 Visual attribution using WGANs
3.1 Problem formulation
Our goal is to estimate a map that highlights the areas in an image which are specific to the class the image belongs to. We formulate the problem for two classes, a baseline class c=0 and a class of interest c=1. The formulation, however, easily extends to the case of multiple classes of interest. We denote an image by x, the distribution of images from the baseline class by p(x|c=0), and the distribution of images from the class of interest by p(x|c=1). In the case of a medical application, p(x|c=1) could for example be the distribution of images from a population with a certain disease and p(x|c=0) that of images of control subjects.
We formulate the problem as estimating a map function M(x) that, when added to an image x from category c=1, creates an image y = x + M(x) which is indistinguishable from images sampled from p(x|c=0). Thereby, the map contains all the features which distinguish the input image from the other category. In the case of medical images, M(x) will by definition contain the effects of a disease visible in the images, i.e. a disease effect map.
We model the function M(x) using a convolutional neural network, whose parameters we find using a WGAN.
3.2 Wasserstein GANs
In the GAN paradigm, a generator function G and a discriminator function D (both neural networks) compete with each other in a zero-sum game. Given random noise as input, the generator tries to produce realistic images that fool the discriminator, while the discriminator tries to learn the difference between generated and real images.
Arjovsky and Bottou pointed out a limitation in this paradigm which precludes a guarantee that the generated images will converge to the target distribution (although in practice, with appropriate training methods, many impressive results have been achieved). Wasserstein GANs are a modification of the classic GAN paradigm in which the discriminator is replaced by a critic, which does not have an activation function in its final layer and is constrained to be a 1-Lipschitz function. WGANs have better optimisation properties, and it can be shown that they minimise a meaningful distance between the generated and real distributions.
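The 1-Lipschitz constraint is typically enforced either by weight clipping or, in the improved WGAN training of Gulrajani et al., by a penalty on the norm of the critic's input gradient. The following toy numeric sketch uses a linear critic D(x) = w·x, for which the input gradient is w everywhere, so the penalty can be evaluated in closed form; it illustrates the penalty term only, not the actual training procedure:

```python
import numpy as np

def gradient_penalty(w, lam=10.0):
    """WGAN-GP style penalty lam * (||grad D||_2 - 1)^2 for a toy
    linear critic D(x) = w . x, whose input gradient is w everywhere."""
    grad_norm = np.linalg.norm(w)
    return lam * (grad_norm - 1.0) ** 2

# A unit-norm linear critic is exactly 1-Lipschitz: zero penalty.
print(gradient_penalty(np.array([1.0, 0.0])))   # -> 0.0
# A steeper critic is penalised: lam * (5 - 1)^2 = 160.
print(gradient_penalty(np.array([3.0, 4.0])))   # -> 160.0
```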
3.3 Constrained effect maps using WGANs
In this work we build on WGANs to find the optimal map generation function. In contrast to regular WGANs, we have a map generator function M(x) which, during training, takes as input randomly sampled images x from category c=1 rather than noise. The generator tries to produce maps that, when added to x, create images appearing to be from category c=0. By trying to distinguish the generated images y = x + M(x) from real images of category c=0, the critic ensures that the generated maps are constrained to realistic modifications (see Fig. 2 for an overview). In the context of medical images, this means enforcing anatomically realistic modifications to the images.
Building on the WGAN formulation, this leads to the following cost function:
Optimising Eq. 2 directly could lead to changes in the input image that alter the image identity. For instance, the brain anatomy of a subject could be changed to a degree where the map does not only capture disease-related changes but also changes the subject identity. We want to encourage the smallest map that still leads to a realistic generated image y. We thus add the following map regularisation term to the cost function:
where ||·||_1 denotes the L1 norm.
The final optimisation is then given by
where 𝒟 denotes the set of 1-Lipschitz functions.
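The two optimisation targets can be sketched as follows. This is a minimal numpy illustration of the objective with toy stand-ins for the critic D and map generator M, not the actual 3D networks; the critic separates real baseline images from generated ones, while the generator tries to fool the critic under an L1 penalty on the map:

```python
import numpy as np

def critic_loss(D, x_c0, x_c1, M):
    """Loss to minimise for the critic: the negative of the Wasserstein
    estimate E[D(x_c0)] - E[D(x_c1 + M(x_c1))]."""
    return -(np.mean(D(x_c0)) - np.mean(D(x_c1 + M(x_c1))))

def generator_loss(D, x_c1, M, lam=1.0):
    """Loss for the map generator: fool the critic while keeping the
    additive map small via an L1 penalty (cf. the regularisation term)."""
    maps = M(x_c1)
    return -np.mean(D(x_c1 + maps)) + lam * np.mean(np.abs(maps))

# Toy check with a linear critic and a constant additive map.
D = lambda x: x.sum(axis=1)
M = lambda x: -np.ones_like(x)
x_c0 = np.ones((2, 2))
x_c1 = 2.0 * np.ones((2, 2))
```

With these toy definitions, x_c1 + M(x_c1) exactly matches x_c0, so the critic loss is zero and the generator loss is governed by the L1 term.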
3.4 Network architecture
As we will discuss in more detail in Section 4.3, we design our proposed method with large 3D medical imaging data in mind, which often need to be processed at high resolutions in order to retain diagnostic information. Specifically, in our experiments on neuroimaging data, an input volume size of 128x160x112 voxels is used.
With such large images the limiting factor becomes storing the activations of the networks on GPU memory. With this in mind we design the map generator and the critic networks as follows.
3.4.1 Map generator network
The map generator function should be able to form an internal representation of the visual attributes that characterise the categories. In the case of brain images affected by dementia, it should be able to “understand” the systematic changes involved in the disease. Therefore, a relatively powerful network is required to adequately model the function . To this end, we use the 3D U-Net  (originally proposed for segmentation), as a starting point. The 3D U-Net has an encoder-decoder structure with a bottle-neck layer in the middle, but additionally introduces skip connections at each resolution level bypassing the bottle-neck. This allows the network to combine high-level semantic information (such as the presence of a structure) with low-level information (such as edges).
In order to reduce GPU memory consumption we reduce the number of feature maps by a factor of 4 in most layers. As in the original 3D U-Net  we use batch normalisation for all layers except the final one. The exact architecture is shown in Fig. 2 in the supplementary material.
3.4.2 Critic function
In line with related literature on image generation using GANs [27, 68, 51], we model our critic as a fully convolutional network with no dense layers. We loosely base our architecture on the C3D network, which achieved impressive results on action recognition tasks in video data by processing the videos directly in spatio-temporal 3D space. However, in contrast to that work, we only perform 4 pooling steps. After the fourth pooling layer we add another 3x3x3 convolution layer, followed by a 1x1x1 convolution layer which reduces the number of feature maps to one. The final critic prediction is given by a global average pooling operation over that feature map.
It proved important not to use batch normalisation for the critic network. Towards the beginning of training, computing batch statistics over a mix of generated and real images may not produce reasonable estimates, because the two sets of images differ considerably from each other. We surmise that this effect prevents the critic from learning when batch normalisation is used. A similar observation has been made previously in the WGAN literature. We also experimented with layer normalisation, but did not observe improvements.
The exact architecture we used is shown in Fig. 1 in the supplementary material.
To optimise our networks, we follow [2, 22] and update the parameters of the critic and map generator networks in an alternating fashion. In contrast to regular GANs, WGANs require a critic which is kept close to optimality throughout training. We therefore perform 5 critic updates for every map generator update. Additionally, for the first 25 iterations and on every hundredth iteration, we perform 100 critic updates per generator update.
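The update schedule described above can be expressed as a small helper; the warm-up length and boost interval follow the text:

```python
def n_critic_updates(step, warmup=25, boost_every=100):
    """Number of critic updates per generator update: 100 during the
    first `warmup` steps and on every `boost_every`-th step thereafter,
    5 otherwise (as described in the training schedule above)."""
    if step < warmup or step % boost_every == 0:
        return 100
    return 5

# First 25 steps and every hundredth step get the boosted schedule.
print(n_critic_updates(0), n_critic_updates(100), n_critic_updates(101))
# -> 100 100 5
```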
With the above architectures, the maximum batch size that can be used for a single gradient computation on a Nvidia Titan Xp GPU with 12 GB of memory is 2+2 (real+generated). In order to obtain more reliable gradient estimates we aggregate the gradients for a total of 6 mini-batches before performing a training step.
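Gradient aggregation over several mini-batches can be sketched as follows; this is a generic numpy illustration of the idea, not the actual training code:

```python
import numpy as np

def accumulate_gradients(grad_fn, minibatches):
    """Average gradients over several mini-batches before taking a
    single optimiser step, giving a more reliable estimate when GPU
    memory limits the per-step batch size."""
    grads = [grad_fn(batch) for batch in minibatches]
    return np.mean(grads, axis=0)

# Toy example: the gradient of 0.5*||w - b||^2 w.r.t. w is (w - b).
w = np.array([1.0, 1.0])
batches = [np.array([0.0, 0.0]), np.array([2.0, 2.0]), np.array([1.0, 4.0])]
g = accumulate_gradients(lambda b: w - b, batches)
# mean of [1, 1], [-1, -1], [0, -3]  ->  [0, -1]
```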
We used the ADAM optimiser to perform the update steps for all experiments. The optimiser parameters, the learning rate, and the weight of the map regularisation term (see Eq. 4) were kept fixed throughout all experiments. Training took approximately 24 hours on an Nvidia Titan Xp.
4 Experiments

We evaluated the proposed method using a synthetically generated dataset and a large number of 3D brain MR images from the publicly available ADNI dataset.
We compared our proposed visual attribution GAN (VA-GAN) to methods from the literature which have been used for visual attribution both on natural and on medical images. Specifically, we compared against Guided Backpropagation , Integrated Gradients  and Class Activation Mapping (CAM) . Furthermore, to verify that the WGAN framework is necessary, we also investigated an alternative way of estimating the additive map not based on GANs, which is described in detail in the next section.
All the methods except VA-GAN use classification networks. For simplicity, we used a very similar architecture for these networks as for the critic in VA-GAN, with two differences: (1) we replaced the last convolution and the global average pooling layer by two dense layers followed by a softmax, and (2) we used batch normalisation for all layers, which produced better classification results for the experiments on the ADNI dataset. In addition, for the CAM method we designed the last layer as described in the original CAM work and omitted the last two max-pooling layers, which allowed significantly more accurate visual attribution maps due to the higher resolution of the last feature maps.
Lastly, for the experiments on the 2D synthetic data we simply replaced all 3D operations by 2D operations, but left the architectures otherwise unchanged.
4.1 Classifier-based map estimation
In the VA-GAN approach, we generate an additive map which is constrained by the critic to generate a realistic image from the opposite class. To demonstrate that this approach is necessary we also investigated an alternative method of estimating the additive map without a term enforcing realistic maps.
The alternative approach requires training a classifier and then optimising an additive map M that lowers the classifier’s prediction for the class of interest as much as possible. That is, the perturbed image x + M should minimise the classifier output. This formulation is almost exactly the same as for the WGAN-based approach (see Eq. 1), except that M is not a function of x but is optimised independently for each image.
We need to regularise the estimation of M to avoid trivial solutions, such as imperceptible changes that can fool classifiers. A “well-behaved” map can be found by solving the following minimisation problem:
Here, i indexes the pixels or voxels of M. The weighted L1 term encourages small maps, while the weighted total variation term encourages smoothness.
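These two regularisation terms can be sketched in numpy as follows (2D case with anisotropic total variation; the weighting constants are placeholders, since the values used in the experiments are not reproduced here):

```python
import numpy as np

def map_regulariser(M, l1_weight=1.0, tv_weight=1.0):
    """L1 plus anisotropic total variation penalty on a 2D map M:
    the L1 term encourages small maps, the TV term smooth ones."""
    M = np.asarray(M, dtype=float)
    l1 = np.abs(M).sum()
    tv = np.abs(np.diff(M, axis=0)).sum() + np.abs(np.diff(M, axis=1)).sum()
    return l1_weight * l1 + tv_weight * tv

M = np.array([[0.0, 1.0],
              [0.0, 1.0]])
# L1 = 2, TV = |1-0| + |1-0| (one horizontal jump per row) = 2  ->  4
print(map_regulariser(M))  # -> 4.0
```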
We optimise this cost function using the ADAM optimiser with its default internal parameters, a fixed learning rate, and early stopping at 1500 iterations. The regularisation weights were kept fixed in all experiments.
This approach is strongly related to the meaningful perturbation masks technique of Fong and Vedaldi, in which parts of an image are locally deleted by a mask such that the prediction is minimised. In preliminary experiments we found that, on the medical image problem we studied, visual attribution using destructive masks did not lead to the desired results. Deleting the diagnostic part of an image will not produce an image of the opposite class but rather an image with an undetermined diagnosis. This means such a mask may contain information about the location of diagnostic regions but not about specific disease effects, e.g. enlargement or shrinkage. In contrast, by optimising Eq. 5 we attempt to morph the image into the opposite class, such that diagnostic regions can be changed to have the characteristics of the other class. Because of this similarity, we refer to this method as additive perturbation maps.
4.2 Synthetic experiments
Data: In order to quantitatively evaluate the performance of the examined visual attribution methods, we generated a synthetic dataset of 10000 112x112 images with two classes, which model a healthy control group (label 0) and a patient group (label 1). The images were split evenly across the two categories. We closely followed a synthetic data generation process from the literature, in which disease effects were studied in smaller cohorts of registered images.
The control group (label 0) contained images with random iid Gaussian noise convolved with a Gaussian blurring filter. Examples are shown in Fig. 3. The patient images (label 1) also contained the noise, but additionally exhibited one of two disease effects which was generated from a ground-truth effect map: a square in the centre and a square in the lower right (subtype A), or a square in the centre and a square in the upper left (subtype B). Importantly, both disease subtypes shared the same label. The location of the off-centre squares was randomly offset in each direction by a maximum of 5 pixels. This effect was added to make the problem harder, but had no notable effect on the outcome.
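The generation process described above can be sketched as follows. The random offsets and the subtype locations follow the text, while the square sizes, effect intensities, and the blur kernel are illustrative assumptions rather than the values used in the experiments:

```python
import numpy as np

def make_synthetic_image(label, subtype='A', size=112, rng=None):
    """Blurred Gaussian noise, plus two square 'disease effects' for
    label-1 images. Square size (20 px) and intensity (1.0) are
    illustrative placeholders; offsets follow the description above."""
    rng = np.random.default_rng(rng)
    img = rng.standard_normal((size, size))
    # Cheap separable box blur as a stand-in for the Gaussian filter.
    k = np.ones(5) / 5.0
    img = np.apply_along_axis(lambda r: np.convolve(r, k, 'same'), 1, img)
    img = np.apply_along_axis(lambda c: np.convolve(c, k, 'same'), 0, img)
    effect = np.zeros((size, size))
    if label == 1:
        c = size // 2
        effect[c - 10:c + 10, c - 10:c + 10] = 1.0        # central square
        off = rng.integers(-5, 6, size=2)                 # random offset
        if subtype == 'A':                                # lower right
            r0, c0 = 3 * size // 4 + off[0], 3 * size // 4 + off[1]
        else:                                             # upper left
            r0, c0 = size // 4 + off[0], size // 4 + off[1]
        effect[r0 - 10:r0 + 10, c0 - 10:c0 + 10] = 1.0
    return img + effect, effect

img, gt = make_synthetic_image(1, 'A', rng=0)
```

The returned ground-truth effect map plays the role of the label map against which the predicted effect maps are compared below.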
Evaluation: We split the data into training and testing sets with an 80-20 ratio. Moreover, we used 20% of the training set for monitoring the training. Next, we estimated the disease effect maps for all cases from the synthetic patient class using the examined methods.
In order to assess the visual attribution accuracy quantitatively, we calculated the normalised cross correlation (NCC) between the ground-truth label maps and the predicted disease effect maps. The NCC has the advantage that it is not sensitive to the magnitude of the signals. For CAM we used only the positive values to calculate the NCC, while for the backprop-based techniques we used the absolute value, since those techniques do not necessarily predict the correct sign of the changes.
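The NCC between a ground-truth map and a predicted map can be computed with the standard zero-mean, unit-variance formulation, which makes it insensitive to the magnitude (and offset) of the signals:

```python
import numpy as np

def ncc(a, b):
    """Normalised cross correlation between two maps: standardise both
    signals, then take the mean of their elementwise product."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

x = np.array([0.0, 1.0, 2.0, 3.0])
print(ncc(x, 5 * x + 2))   # ≈ 1.0, invariant to scale and offset
print(ncc(x, -x))          # ≈ -1.0
```

As described above, the backprop-based maps would be passed through an absolute value and the CAM maps clipped to positive values before applying this function.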
Results: A number of examples of the estimated disease effect maps are shown in Fig. 4. Guided Backpropagation produced results similar to Integrated Gradients; we therefore omitted it from the visual results due to space considerations but provide quantitative results.
For the backprop-based methods we consistently observed two behaviours: 1) They tended to focus exclusively on the central square, which was always present and was thus the most predictive set of features. This behaviour is consistent with the feature-compression effect discussed earlier. 2) They tended to focus mostly on the edges of the boxes rather than on the whole object. This may be because edges are more salient than other points and, again, are sufficient to predict the presence or absence of the box.
The CAM method managed to capture both squares most of the time but, by design, had limited spatial resolution. Note that due to the lower number of max-pooling layers used for the CAM classifier, each pixel in the last feature map had a receptive field of only 39x39 pixels. This means that many pixels in that feature map could not see both squares simultaneously, which may have contributed to both squares being discerned. However, we did not investigate this further.
Lastly, our proposed VA-GAN method produced the most localised disease effect maps, finding the entire boxes and following the edges closely. It also managed to consistently identify both disease effects.
| Method | NCC mean | NCC std |
| Guided Backprop | 0.14 | 0.04 |
| Integrated Gradients | 0.36 | 0.11 |
The quantitative NCC results shown in Table 1 are mostly consistent with our qualitative observations, with VA-GAN obtaining significantly higher NCC than the other methods. The additive perturbation technique achieved a low score due to its exclusive focus on edges.
4.3 Experiments on real neuroimaging data
In this section, we investigate the methods’ ability to detect the areas of the brain which are involved in the progression from MCI to AD at a subject-specific level. We trained on images from both categories and then generated disease effect maps only for the AD images.
Data: We selected 5778 3D T1-weighted MR images from 1288 subjects with either an MCI (label 0) or AD (label 1) diagnosis from the ADNI cohort. 2839 of the images were acquired using a 1.5T magnet, the remainder using a 3T magnet. The subjects were scanned at regular intervals as part of the ADNI study, and a number of subjects converted from MCI to AD over the years. We did not use these correspondences for training; however, we took advantage of them for evaluation, as described later. An overview of the data is given in Section C of the supplemental materials.
All images were processed using standard operations available in the FSL toolbox in order to reorient and rigidly register the images to MNI space, crop them, and correct for field inhomogeneities. We then skull-stripped the images using the ROBEX algorithm. Lastly, we resampled all images to a resolution of 1.3 mm and normalised them to a range from -1 to 1. The final volumes had a size of 128x160x112 voxels.
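The final intensity normalisation step can be sketched as a simple linear rescaling:

```python
import numpy as np

def normalise_to_unit_range(volume):
    """Rescale image intensities linearly to the range [-1, 1], as in
    the preprocessing described above."""
    volume = np.asarray(volume, dtype=float)
    lo, hi = volume.min(), volume.max()
    return 2.0 * (volume - lo) / (hi - lo) - 1.0

v = normalise_to_unit_range(np.array([0.0, 50.0, 100.0]))
# -> [-1.0, 0.0, 1.0]
```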
Evaluation: We split the data on a subject level into a training, testing and validation set containing 825, 256 and 207 subjects, respectively. We then trained all of the algorithms with both AD and MCI data as described earlier, and generated disease effect maps for the AD subjects from the test set. The validation set was used to monitor the training.
In order to better understand the quality of the generated disease maps, we estimated the actual deformations for a number of subjects as follows. We identified all subjects from the test set who were diagnosed with MCI during the baseline examination but progressed to AD in one of the follow-up scans. We then aligned those images rigidly and subtracted them from each other to obtain an observed disease effect map. We excluded all subjects whose scans were not acquired with the same field strength, since a large part of the observed effects could be due to differences in image quality. This left 50 subjects, which we evaluated more closely. We note that even for the same field strength there are a number of artefacts due to intensity variations and registration. Furthermore, there are likely to be effects not caused by the disease, such as ageing (which will also be captured by our method), such that the observed disease effect maps cannot be considered a perfect ground-truth.
Nevertheless, we also evaluated NCC between the observed and the predicted disease effect maps in the same manner as for the synthetic data.
Results: Fig. 5 shows disease effect maps obtained for a selection of AD subjects (we again omitted Guided Backprop in the figure). The subjects are ordered by increasing progression of the disease as measured by the ADAS13 cognition exam . It can be seen that VA-GAN’s predictions were in very good agreement with the observed effect maps. As is known from the literature [8, 11] the method indicates atrophy in the hippocampi, and general brain atrophy around the ventricles. Furthermore, it is known that in later stages of the disease other brain areas such as the temporal lobe get affected as well . Those effects were also identified by VA-GAN in the last subject in Fig. 5.
The backprop-based methods and additive perturbations produced very noisy maps and tended to identify only the hippocampal areas. We believe this is in agreement with the findings on the synthetic data. The hippocampus is known to be the most predictive region for AD; however, it is also known that many other regions are involved in the disease. It is likely that the classifiers learned to focus only on the most discriminative set of features, ignoring the rest. Lastly, it is hard to interpret the results produced by CAM due to the low resolution. However, the images suggest that this method focuses on similar areas as the other methods.
Quantitative results are given in Table 2. VA-GAN obtained the highest correlation scores, however, it is hard to draw conclusions from these figures due to the noisy nature of the observed effect maps as well as the possible non-disease related effects on the observed effect maps, which are taken to be “ground-truth” in the experiments.
| Method | NCC mean | NCC std |
| Guided Backprop | 0.05 | 0.03 |
| Integrated Gradients | 0.13 | 0.05 |
We observed that VA-GAN generally produced very realistic deformations. In Fig. 6 a close-up of the MCI, AD, and generated image is shown for a sample subject. It can be seen that our method succeeded in making the generated image more similar to the corresponding MCI image and that the changes were realistic.
5 Limitations and discussion
We have proposed a method for visual feature attribution using Wasserstein GANs. It was shown that, in contrast to backprop-based methods, our technique can capture multiple regions affected by disease, and produces state-of-the-art results for the prediction of disease effect maps in neuroimaging data and on a synthetic dataset.
Currently, the method assumes that the category labels of the test data are known at test-time. If they are unknown, the method could easily be combined with a classifier that produces this information. We only evaluated the method for the case of two labels. More categories could be addressed by training multiple map generators, each mapping to a background class (assuming there is one).
In the future, we plan to model other effects such as ageing or the presence or absence of certain genes on the ADNI data, investigate the method on other datasets and apply it to other problems such as weakly-supervised localisation.
We gratefully acknowledge the support of NVIDIA Corporation with the donation of a Titan Xp GPU.
-  M. Arjovsky and L. Bottou. Towards principled methods for training generative adversarial networks. arXiv preprint arXiv:1701.04862, 2017.
-  M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
-  J. Ashburner and K. J. Friston. Why voxel-based morphometry should be used. Neuroimage, 14(6):1238–1243, 2001.
-  J. L. Ba, J. R. Kiros, and G. E. Hinton. Layer normalization. arXiv preprint arXiv:1607.06450, 2016.
-  C. F. Baumgartner, K. Kamnitsas, J. Matthew, T. P. Fletcher, S. Smith, L. M. Koch, B. Kainz, and D. Rueckert. Sononet: Real-time detection and localisation of fetal standard scan planes in freehand ultrasound. IEEE Transactions on Medical Imaging, 2017.
-  C. F. Baumgartner, K. Kamnitsas, J. Matthew, S. Smith, B. Kainz, and D. Rueckert. Real-time standard scan plane detection and localisation in fetal ultrasound using fully convolutional neural networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 203–211. Springer, 2016.
-  K. Bousmalis, N. Silberman, D. Dohan, D. Erhan, and D. Krishnan. Unsupervised pixel-level domain adaptation with generative adversarial networks. arXiv preprint arXiv:1612.05424, 2016.
-  H. Braak and E. Braak. Neuropathological stageing of Alzheimer-related changes. Acta neuropathologica, 82(4):239–259, 1991.
-  E. J. Burton et al. Cerebral atrophy in Parkinson’s disease with and without dementia: a comparison with Alzheimer’s disease, dementia with Lewy bodies and controls. Brain, 127(4):791–800, 2004.
-  Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 424–432. Springer, 2016.
-  B. C. Dickerson et al. The cortical signature of alzheimer’s disease: regionally specific cortical thinning relates to symptom severity in very mild to mild AD dementia and is detectable in asymptomatic amyloid-positive individuals. Cerebral cortex, 19(3):497–510, 2009.
-  X. Feng, J. Yang, A. F. Laine, and E. D. Angelini. Discriminative localization in CNNs for weakly-supervised segmentation of pulmonary nodules. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 568–576. Springer, 2017.
-  R. Fong and A. Vedaldi. Interpretable explanations of black boxes by meaningful perturbation. arXiv preprint arXiv:1704.03296, 2017.
-  M. Ganz et al. Relevant feature set estimation with a knock-out strategy and random forests. NeuroImage, 122:131–148, 2015.
-  Y. Gao and J. A. Noble. Detection and characterization of the fetal heartbeat in free-hand ultrasound sweeps with weakly-supervised two-streams convolutional networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 305–313. Springer, 2017.
-  B. Gaonkar and C. Davatzikos. Analytic estimation of statistical significance maps for support vector machine based multi-variate image analysis and classification. NeuroImage, 78:270–283, 2013.
-  E. Garrido, A. Castello, J. Ventura, A. Capdevila, and F. Rodriguez. Cortical atrophy and other brain magnetic resonance imaging (MRI) changes after extremely high-altitude climbs without oxygen. International journal of sports medicine, 14(04):232–234, 1993.
-  Z. Ge, S. Demyanov, R. Chakravorty, A. Bowling, and R. Garnavi. Skin disease recognition using deep saliency features and multimodal learning of dermoscopy and clinical images. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 250–258. Springer, 2017.
-  W. M. Gondal, J. M. Köhler, R. Grzeszick, G. A. Fink, and M. Hirsch. Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images. arXiv preprint arXiv:1706.09634, 2017.
-  I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in neural information processing systems, pages 2672–2680, 2014.
-  I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
-  I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved training of wasserstein GANs. arXiv preprint arXiv:1704.00028, 2017.
-  K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
-  Y. Hu, E. Gibson, T. Vercauteren, H. U. Ahmed, M. Emberton, C. M. Moore, J. A. Noble, and D. C. Barratt. Intraoperative organ motion models with an ensemble of conditional generative adversarial networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 368–376. Springer, 2017.
-  J. E. Iglesias, C.-Y. Liu, P. M. Thompson, and Z. Tu. Robust brain extraction across datasets and comparison with publicly available methods. IEEE transactions on medical imaging, 30(9):1617–1634, 2011.
-  K. Iqbal et al. Subgroups of Alzheimer’s disease based on cerebrospinal fluid molecular markers. Annals of neurology, 58(5):748–757, 2005.
-  P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004, 2016.
-  A. Jamaludin, T. Kadir, and A. Zisserman. Spinenet: automatically pinpointing classification evidence in spinal MRIs. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 166–175. Springer, 2016.
-  R. Kanai and G. Rees. The structural basis of inter-individual differences in human behaviour and cognition. Nature Reviews Neuroscience, 12(4):231–242, 2011.
-  D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
-  E. Konukoglu and B. Glocker. Constructing subject-and disease-specific effect maps: Application to neurodegenerative diseases. In Medical Computer Vision and Bayesian and Graphical Models for Biomedical Imaging, pages 3–13. Springer, 2016.
-  E. Konukoglu and B. Glocker. SubCMap: subject and condition specific effect maps. arXiv preprint arXiv:1701.02610, 2017.
-  A. Krishnan et al. Partial least squares (PLS) methods for neuroimaging: a tutorial and review. Neuroimage, 56(2):455–475, 2011.
-  C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, 2016.
-  M. Lin, Q. Chen, and S. Yan. Network in network. arXiv preprint arXiv:1312.4400, 2013.
-  D. Mahapatra, B. Bozorgtabar, S. Hewavitharanage, and R. Garnavi. Image super resolution using generative adversarial networks and local saliency maps for retinal image analysis. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 382–390. Springer, 2017.
-  M. Mathieu, C. Couprie, and Y. LeCun. Deep multi-scale video prediction beyond mean square error. arXiv preprint arXiv:1511.05440, 2015.
-  C. Maumet, P. Maurel, J.-C. Ferré, and C. Barillot. An a contrario approach for the detection of patient-specific brain perfusion abnormalities with arterial spin labelling. Neuroimage, 134:424–433, July 2016.
-  C. Maumet, P. Maurel, J.-C. Ferré, B. Carsin, and C. Barillot. Patient-specific detection of perfusion abnormalities combining within-subject and between-subject variances in arterial spin labeling. Neuroimage, 81:121–130, Nov. 2013.
-  D. Miller and J. O’Callaghan. Effects of aging and stress on hippocampal structure and function. Metabolism, 52:17–21, 2003.
-  B. Mwangi, T. S. Tian, and J. C. Soares. A review of feature reduction techniques in neuroimaging. Neuroinformatics, 12(2):229–244, 2014.
-  D. Nie, R. Trullo, J. Lian, C. Petitjean, S. Ruan, Q. Wang, and D. Shen. Medical image synthesis with context-aware generative adversarial networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 417–425. Springer, 2017.
-  M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Is object localization for free? Weakly-supervised learning with convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 685–694, 2015.
-  J. S. Peper, R. M. Brouwer, D. I. Boomsma, R. S. Kahn, H. Pol, and E. Hilleke. Genetic influences on human brain structure: a review of brain imaging studies in twins. Human brain mapping, 28(6):464–473, 2007.
-  P. O. Pinheiro and R. Collobert. From image-level to pixel-level labeling with convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1713–1721, 2015.
-  A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
-  M. Rahim et al. Integrating multimodal priors in predictive models for the functional characterization of Alzheimer’s disease. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015, pages 207–214. Springer, 2015.
-  H. Rosas et al. Regional and progressive thinning of the cortical ribbon in Huntington’s disease. Neurology, 58(5):695–701, 2002.
-  W. G. Rosen, R. C. Mohs, and K. L. Davis. A new rating scale for Alzheimer’s disease. The American journal of psychiatry, 1984.
-  C. A. Ross, R. L. Margolis, S. A. Reading, M. Pletnikov, and J. T. Coyle. Neurobiology of schizophrenia. Neuron, 52(1):139–153, 2006.
-  A. Shrivastava, T. Pfister, O. Tuzel, J. Susskind, W. Wang, and R. Webb. Learning from simulated and unsupervised images through adversarial training. arXiv preprint arXiv:1612.07828, 2016.
-  R. Shwartz-Ziv and N. Tishby. Opening the black box of deep neural networks via information. arXiv preprint arXiv:1703.00810, 2017.
-  K. Simonyan, A. Vedaldi, and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
-  S. M. Smith, M. Jenkinson, M. W. Woolrich, C. F. Beckmann, T. E. Behrens, H. Johansen-Berg, P. R. Bannister, M. De Luca, I. Drobnjak, D. E. Flitney, et al. Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage, 23:S208–S219, 2004.
-  J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller. Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806, 2014.
-  M. Sundararajan, A. Taly, and Q. Yan. Axiomatic attribution for deep networks. arXiv preprint arXiv:1703.01365, 2017.
-  P. M. Thompson, T. D. Cannon, K. L. Narr, T. Van Erp, V.-P. Poutanen, M. Huttunen, J. Lönnqvist, C.-G. Standertskjöld-Nordenstam, J. Kaprio, M. Khaledy, et al. Genetic influences on brain structure. Nature neuroscience, 4(12):1253–1258, 2001.
-  P. M. Thompson et al. Cortical change in Alzheimer’s disease detected with a disease-specific population-based brain atlas. Cerebral Cortex, 11(1):1–16, 2001.
-  D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. Learning spatiotemporal features with 3D convolutional networks. In Proceedings of the IEEE international conference on computer vision, pages 4489–4497, 2015.
-  V. L. Villemagne, M. T. Fodero-Tavoletti, C. L. Masters, and C. C. Rowe. Tau imaging: early progress and future directions. The Lancet Neurology, 14(1):114–124, 2015.
-  K. E. Watkins, F. Vargha-Khadem, J. Ashburner, R. E. Passingham, A. Connelly, K. J. Friston, R. S. Frackowiak, M. Mishkin, and D. G. Gadian. MRI analysis of an inherited speech and language disorder: structural brain abnormalities. Brain, 125(3):465–478, 2002.
-  J. M. Wolterink, A. M. Dinkla, M. H. Savenije, P. R. Seevinck, C. A. van den Berg, and I. Išgum. Deep MR to CT synthesis using unpaired data. In International Workshop on Simulation and Synthesis in Medical Imaging, pages 14–23. Springer, 2017.
-  K. J. Worsley et al. Characterizing the response of PET and fMRI data using multivariate linear models. NeuroImage, 6(4):305–319, 1997.
-  J. Zhang, Z. Lin, J. Brandt, X. Shen, and S. Sclaroff. Top-down neural attention by excitation backprop. In European Conference on Computer Vision, pages 543–559. Springer, 2016.
-  Q. Zhang, A. Bhalerao, and C. Hutchinson. Weakly-supervised evidence pinpointing and description. In International Conference on Information Processing in Medical Imaging, pages 210–222. Springer, 2017.
-  Y. Zhang, L. Yang, J. Chen, M. Fredericksen, D. P. Hughes, and D. Z. Chen. Deep adversarial networks for biomedical image segmentation utilizing unannotated images. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 408–416. Springer, 2017.
-  B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2921–2929, 2016.
-  J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv preprint arXiv:1703.10593, 2017.
-  W. Zhu, Q. Lou, Y. S. Vang, and X. Xie. Deep multi-instance networks with sparse label assignment for whole mammogram classification. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 603–611. Springer, 2017.
Appendix A Network architectures
In this section we describe the exact network architectures used for the 3D VA-GAN. We present the critic and map generator functions as Python-inspired pseudo code, which we found easier to interpret than a graphical representation. The layer parameters are specified as arguments to the layer functions. Unless otherwise specified, all convolutional layers use a stride of 1x1x1 and a rectified linear unit (ReLU) non-linearity.
The architecture of the critic function is shown in Fig. 7. The conv3D_layer function performs a regular 3D convolution without batch normalisation and the global_averagepool3D function performs an averaging over the spatial dimensions of the feature maps.
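For concreteness, the behaviour of the global_averagepool3D function can be illustrated with a minimal NumPy sketch (illustrative only, not the implementation from our repository; the layout (batch, D, H, W, channels) is an assumption for this example):

```python
import numpy as np

def global_averagepool3D(x):
    """Average a feature map of shape (batch, D, H, W, channels)
    over its three spatial dimensions, yielding (batch, channels)."""
    return x.mean(axis=(1, 2, 3))

# A batch of 2 feature maps with 4x4x4 spatial extent and 8 channels
# reduces to one value per channel.
features = np.ones((2, 4, 4, 4, 8))
pooled = global_averagepool3D(features)
print(pooled.shape)  # (2, 8)
```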
The architecture for the map generator function is shown in Fig. 8. Here, the conv3D_layer_bn is a 3D convolutional layer with batch normalisation applied before the non-linearity. The deconv3D_layer_bn learns an upsampling operation as in the original U-Net and also uses batch normalisation. Lastly, the crop_and_concat_layer implements the skip connections across the bottleneck by stacking the feature maps along the channel dimension.
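A minimal NumPy sketch of what the crop_and_concat_layer does (a simplification for illustration, assuming a (batch, D, H, W, channels) layout and centre cropping; it is not the exact implementation from our repository):

```python
import numpy as np

def crop_and_concat_layer(skip, up):
    """Centre-crop the (larger) skip-connection feature map `skip` to the
    spatial size of the upsampled map `up`, then stack both along the
    channel axis. Shapes are (batch, D, H, W, channels)."""
    offs = [(s - u) // 2 for s, u in zip(skip.shape[1:4], up.shape[1:4])]
    cropped = skip[:,
                   offs[0]:offs[0] + up.shape[1],
                   offs[1]:offs[1] + up.shape[2],
                   offs[2]:offs[2] + up.shape[3],
                   :]
    return np.concatenate([cropped, up], axis=-1)

# A 10^3 skip map with 16 channels combined with an 8^3 upsampled map
# with 32 channels yields an 8^3 map with 48 channels.
skip = np.zeros((1, 10, 10, 10, 16))
up = np.zeros((1, 8, 8, 8, 32))
print(crop_and_concat_layer(skip, up).shape)  # (1, 8, 8, 8, 48)
```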
Note that the architectures for the 2D experiments on synthetic data were identical, except all 3D operations were replaced by their 2D equivalents.
Appendix B Close-up analysis of VA-GAN
In Fig. 9 we present a larger view of all three orthogonal planes for an additional subject. To allow for an enlarged view, we only include the results obtained by VA-GAN and the actual observed changes from MCI to AD. As before, it can be seen that VA-GAN produced visual attribution maps that very closely approximate the observed deformations. In particular, we note that for this subject VA-GAN correctly predicted a smaller disease effect in the left hippocampus compared to the right hippocampus.
Appendix C Details of MR brain data cohort
The MR brain image data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.
Specifically, we used T1-weighted MR data from the ADNI1, ADNIGO and ADNI2 cohorts, which were acquired with a mixture of 1.5T and 3T scanners. The data consisted of 5770 images acquired from 1291 subjects. The images for each subject were acquired at separate visits spaced at regular intervals of 6 months to one year and usually spanning multiple years. On average, each subject was scanned 4.5 times. The cohort consisted of 496 female and 795 male subjects. 2839 of the images were acquired using a 1.5T magnet, the remainder using a 3T magnet. The distribution of the ages at which the images were acquired is shown in Fig. 10. We only considered images with a diagnosis of mild cognitive impairment (MCI) or Alzheimer’s disease (AD).
After preprocessing, we randomly divided the data into training, validation and test sets. We performed the split on a subject basis rather than an image basis, so that no subject appears in more than one set. The exact split is shown in Table 3. The table furthermore shows the distribution of diagnoses on an image level, and the number of subjects that converted from MCI to AD in the examined time intervals.
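A subject-level split of this kind can be sketched as follows (a hypothetical helper for illustration; the function name, split fractions and seed are assumptions, not our actual preprocessing code):

```python
import random

def split_by_subject(subject_ids, fractions=(0.8, 0.1, 0.1), seed=42):
    """Partition the unique subject IDs (not individual images) into
    training, validation and test sets, so that all images of a given
    subject end up in the same set and no subject leaks across splits."""
    ids = sorted(set(subject_ids))
    random.Random(seed).shuffle(ids)
    n_train = int(fractions[0] * len(ids))
    n_val = int(fractions[1] * len(ids))
    return (set(ids[:n_train]),
            set(ids[n_train:n_train + n_val]),
            set(ids[n_train + n_val:]))

# Images are then assigned to a set by looking up their subject ID,
# which prevents the same anatomy appearing in both training and testing.
train, val, test = split_by_subject([f"subj_{i}" for i in range(100)])
```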
The training data was used for learning the map generator and critic parameters which minimise the cost function in Eq. 4 of the main article. The validation set was used for monitoring the training, based on the Wasserstein distance and visual examination of the generated maps, and for hyperparameter tuning. The test set was used for the final qualitative and quantitative evaluation.
A list of the exact ADNI subject IDs used in the study can be found in our public code repository (https://github.com/baumgach/vagan-code) in the file data/subject_rids.txt.
Appendix D Alternative classifier architecture
It was suggested during the reviews that our classifier architecture, with two dense layers before the final output, may be responsible for the poor performance of the backpropagation-based saliency map techniques. It was recommended that we investigate the popular class of architectures in which the final convolutions are aggregated using a global average pooling step over the spatial dimensions of the activation maps, followed by a single dense layer. Examples of this type of architecture include the works of He et al.  and Lin et al. . In our experiments, the class activation mapping (CAM) method  also used this general architecture. In theory, this may abstract the data less before the final output and perhaps produce maps that can more easily identify multiple regions in the image.
To investigate this theory we repeated the synthetic experiment (outlined in Section 4.2 of the main article), but replaced the final two dense layers by a global average pooling step and a single dense layer. After both the network from the main article and the alternative architecture had fully converged, we obtained the saliency maps shown in Fig. 11. In addition to the integrated gradients method  already shown in the main article, here we also show the results for normal backprop  and guided backprop . It can be observed that, with the alternative architecture, normal and guided backprop indeed manage to correctly attribute some of the pixels of the peripheral box, albeit very faintly (emphasised with white arrows in Fig. 11). However, regardless of the architecture, the classifier appears to focus only on the pixels of one of the edges, which is only a subset of the features characterising this class. Note that the orientation of the attributed edges depends on the random initialisation of the network.
Nevertheless, the feature attribution maps obtained using the backprop-based techniques are not of comparable quality to the maps produced by our proposed VA-GAN method. For emphasis, we show the corresponding feature attribution map produced with VA-GAN, plus two more samples, in Fig. 12.
To conclude, we would like to note that, from the point of view of saliency maps, (1) two dense layers and (2) average pooling followed by a dense layer are conceptually similar. In both cases the final prediction aggregates information from multiple receptive fields covering the whole image. Therefore, it is not surprising that the two networks behave similarly. As outlined in the work of Shwartz-Ziv and Tishby , the optimisation of neural network classifiers results in a trade-off between compression of the input features and predictive accuracy. In both networks, the final prediction has access to all features in the image and thus has the potential to compress away features that are redundant for classification (such as one of the two boxes).
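The structural similarity of the two classifier heads can be sketched in a few lines of NumPy (a toy 2D example with made-up dimensions, weights and layer sizes; it is not either trained network, and only illustrates that both heads mix information from all spatial locations):

```python
import numpy as np

rng = np.random.default_rng(0)
feats = rng.standard_normal((1, 8, 8, 32))   # final conv features (B, H, W, C)
n_classes = 2

# Head (1): flatten followed by two dense layers, as in the main article.
w1 = rng.standard_normal((8 * 8 * 32, 64))
w2 = rng.standard_normal((64, n_classes))
logits_dense = np.maximum(feats.reshape(1, -1) @ w1, 0) @ w2

# Head (2): global average pooling followed by a single dense layer,
# as in the CAM-style alternative architecture.
w_gap = rng.standard_normal((32, n_classes))
logits_gap = feats.mean(axis=(1, 2)) @ w_gap

# In both heads every spatial location contributes to the logits, so
# both can compress away features that are redundant for classification.
print(logits_dense.shape, logits_gap.shape)  # (1, 2) (1, 2)
```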