Deep Generative Adversarial Neural Networks for Realistic Prostate Lesion MRI Synthesis

08/01/2017, by Andy Kitchen, et al.

Generative Adversarial Neural Networks (GANs) are applied to the synthetic generation of prostate lesion MRI images. GANs have been applied to a variety of natural images; this work shows that the same techniques can be used in the medical domain to create realistic looking synthetic lesion images. 16mm × 16mm patches are extracted from 330 MRI scans from the SPIE ProstateX Challenge 2016 and used to train a Deep Convolutional Generative Adversarial Neural Network (DCGAN) utilizing cutting-edge techniques. Synthetic outputs are compared to real images and the implicit latent representations induced by the GAN are explored. Training techniques and successful neural network architectures are explained in detail.




1 Introduction

Generative Adversarial Neural Networks (GANs) are state of the art machine learning models that can learn the statistical regularities of input data and then generate a nearly endless stream of synthetic examples that resemble, but do not exactly replicate, the input data[1]. These models have been applied to generate a variety of natural images, including images of bedrooms, faces[2] and animals[3]. In this work GANs are applied to generate realistic looking synthetic images of prostate lesions resembling the SPIE ProstateX Challenge 2016 training data. Multiple aligned MRI modalities are generated simultaneously, and the model produces compelling results with a relatively small amount of training data.

The ability to create synthetic data that resembles real data in key statistical aspects is well studied and particularly important in the medical field where anonymity is critical. In the appropriate circumstances, machine learning or data mining can be carried out on surrogate synthetic data instead of raw sensitive data, giving improved anonymization. When there are only a small number of training examples, generated data can be used as extra training data, a powerful way to combat overfitting and increase model performance[3]. Synthetic image generation can also be used as an aid for education and medical training.

2 Generative Models

Generative models are distinct from discriminative models because they capture the distribution of the data itself, instead of the conditional probability of a label given data. This data distribution can then be sampled, a process that generates new data that ‘looks like’ real data.


X is a random variable taking values from the input domain and Y is a random variable of associated labels. P(Y|X) is the conditional distribution of labels given data and P(X) is the data distribution itself.

Generative models are often much more complicated than discriminative models. They must capture all the intricacies of data not just the parts specific to a label. For example, with a blood test, a statistical model may only need to look for elevated levels in one or two dimensions to indicate a pathology, but generating a whole new blood panel that looks as if it had come from a patient would require capturing complex interdependencies between different levels.

3 Generative Adversarial Models

An adversarial model is formalized as a game played between two players with distinct competing objectives: a generator G and a discriminator D. G is an ‘artist’ that tries to create realistic looking images. D is a ‘critic’ that tries to classify images either as fakes created by the artist, or as real images sampled from the world. The principal equilibrium strategy in this game is for G to draw from P(X), in which case D performs no better than random guessing; i.e. the best way for G to fool D is to create images that are indistinguishable from real images (according to D).

Interestingly, unlike most models, there is no global loss function that must be minimized; instead, these models are trained to an equilibrium point where neither player can improve their performance given a small unilateral change to their strategy, where a strategy is represented by continuous neural network weights. A leap-frog gradient descent algorithm is used for training, where a gradient descent step is taken for D with G held constant, then for G with D held constant. With some luck, and under conditions that are in general not well understood, this algorithm can move both players into a suitable equilibrium strategy.
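The mechanics of this alternating scheme can be sketched on a hypothetical toy two-player game (the quadratic terms are illustrative, chosen only so the toy game has a stable equilibrium at the origin; they are not part of the GAN objective):

```python
def leapfrog_train(x, y, lr=0.1, steps=300):
    """Alternating ('leap-frog') gradient steps on a toy game:
    player x minimizes f(x, y) = x*y + x**2/2 - y**2/2 while
    player y maximizes it. Each player updates with the other
    held constant, mirroring how D and G are trained in turn."""
    for _ in range(steps):
        # y-step (analogue of the discriminator): ascend with x fixed
        y = y + lr * (x - y)          # df/dy = x - y
        # x-step (analogue of the generator): descend with y fixed
        x = x - lr * (y + x)          # df/dx = y + x
    return x, y

x, y = leapfrog_train(1.0, 1.0)
# both players settle near the game's equilibrium at (0, 0)
```

In this toy game the alternating updates spiral into the equilibrium; as the text notes, for real GANs such convergence is not guaranteed and is in general not well understood.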

This method is particularly powerful if the generator and discriminator models are large Deep Convolutional Neural Networks. If there are any recognizable statistical aberrations in the data generated by G, then D can catch out the generator by recognizing these aberrations. Unrealistic structures are thus suppressed; when training has reached equilibrium, G produces highly realistic samples.

4 Practical Training of GANs

GANs are already notorious for being hard to train; equilibrium strategies are often unstable and hard to reach compared to the optima of a single function. If either G or D is too powerful, one will dominate the other, gradients will vanish, and the models will become stuck in a poor equilibrium, often producing images that look like noise or have no content. In general, G and D must be designed together and matched in terms of power, i.e. they should be commensurate in terms of layer size and depth. Implementors should be aware that only certain combinations of generator and discriminator will work well together, and compatibility is hard to predict in advance. The authors recommend iterative development informed by existing literature, intuition, and empirical testing.

It can also be beneficial to introduce a large amount of activation noise and dropout into D, allowing G to compete with a wide variety of slightly different strategies; this can help to escape from poor equilibria. Using batch normalization and special activation functions has also been shown to be effective in some cases.
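A minimal sketch of this kind of regularization, assuming numpy and illustrative values for the noise scale and dropout rate (the paper's exact settings are not given here):

```python
import numpy as np

def perturb_activations(a, sigma=0.1, drop_p=0.5, rng=None):
    """Sketch of the regularization described above: add Gaussian
    noise to hidden activations of D, then apply inverted dropout,
    so G faces a slightly different critic on every step.
    sigma and drop_p are illustrative, not the authors' values."""
    if rng is None:
        rng = np.random.default_rng(0)
    a = a + rng.normal(0.0, sigma, size=a.shape)   # activation noise
    mask = rng.random(a.shape) >= drop_p           # dropout mask
    return a * mask / (1.0 - drop_p)               # rescale to keep E[a]

h = np.ones((4, 8))                # a batch of hidden activations
print(perturb_activations(h).shape)   # (4, 8)
```

The inverted-dropout rescaling keeps the expected activation magnitude unchanged, so the same network can be used without modification at generation time.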


5 Method

5.1 Data Preparation

Figure 1: T2-weighted MRI with Patch Region

All training data is extracted from the SPIE ProstateX Challenge 2016 data set and prepared using the same method the authors used for competition entries[4]. Patches 16mm × 16mm in size are extracted around the centres of 330 prostate lesion MRI scans at a fixed pixel resolution. Three modalities are aligned and utilized: T2, ADC and Ktrans. All channels are normalized to lie approximately within a common range. Each input image patch has three channels, one for each modality. See figure 1 for a diagram.
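The patch preparation can be sketched as follows; the function, the [-1, 1] normalization target, and the pixel sizes are illustrative assumptions, not the authors' exact pipeline:

```python
import numpy as np

def extract_patch(slices, centre, size=16):
    """Cut an aligned multi-modality patch around a lesion centre.
    `slices` maps modality name -> 2-D array (already co-registered),
    `centre` is the (row, col) lesion centre, `size` the patch width
    in pixels (illustrative; the paper's pixel size is not stated here)."""
    r, c = centre
    h = size // 2
    patch = np.stack(
        [m[r - h:r + h, c - h:c + h] for m in slices.values()],
        axis=-1,                      # one channel per modality
    )
    # normalize each channel to approximately [-1, 1] (illustrative choice)
    lo = patch.min(axis=(0, 1), keepdims=True)
    hi = patch.max(axis=(0, 1), keepdims=True)
    return 2.0 * (patch - lo) / (hi - lo) - 1.0

mods = {"T2": np.random.rand(64, 64),
        "ADC": np.random.rand(64, 64),
        "Ktrans": np.random.rand(64, 64)}
p = extract_patch(mods, centre=(32, 32))
print(p.shape)   # (16, 16, 3)
```

Stacking the modalities as channels of a single image is what lets the GAN generate all three modes simultaneously and coherently.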

5.2 Generator Architecture

Figure 2: Generator Neural Network Schematic
kernel  feats.  out. shape
random input
fully connected
T. conv. / ReLU
T. conv. / ReLU
T. conv. / ReLU
Table 1: Generator Neural Network Details

The generator neural network has 5 layers and includes transposed convolutional layers[5] (also called ‘deconvolutional’ layers). The input is a 25-dimensional vector of standard normal random numbers, followed by a fully connected layer and 3 transposed convolutions. See figure 2 for a schematic and table 1 for layer details.
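The shape arithmetic behind such a generator can be checked with the standard transposed-convolution output formula; the kernel, stride, padding, and starting width below are illustrative assumptions, not the values from table 1:

```python
def tconv_out(n, kernel, stride, pad):
    """Output width of a transposed ('deconvolutional') convolution,
    using the standard formula: (n - 1) * stride - 2 * pad + kernel."""
    return (n - 1) * stride - 2 * pad + kernel

# Hypothetical walk-through of a 5-layer generator like the one above:
# a 25-d noise vector, a fully connected layer reshaped to a small
# feature map, then 3 transposed convolutions that each double the
# width (kernel 4, stride 2, pad 1 are illustrative choices).
width = 4                      # feature-map width after the FC layer
for _ in range(3):
    width = tconv_out(width, kernel=4, stride=2, pad=1)
print(width)   # 32
```

With these illustrative settings each transposed convolution doubles the spatial extent, which is the usual way a DCGAN generator upsamples from a small feature map to a full image patch.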

5.3 Discriminator Architecture

Figure 3: Discriminator Neural Network Schematic
kernel feats. out. shape noise
input gaussian
conv. / L. ReLU 32 gaussian
conv. / L. ReLU 64 gaussian
conv. / L. ReLU 128 gaussian
global avg. pool dropout
fully connected
Table 2: Discriminator Neural Network Details

The discriminator neural network has 6 layers: an initial image input layer and 3 convolutional layers, followed by global average pooling and a fully connected layer. The final hidden layer uses dropout; all other hidden layers have Gaussian noise added. To improve gradient flow by preventing saturation, ‘leaky’ ReLU activation functions are used: f(x) = max(αx, x), with a small fixed slope α in this work. The Gaussian noise is drawn from a zero-mean normal distribution. See figure 3 for a schematic and table 2 for details.
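Two of these building blocks are easy to sketch in numpy; the leaky-ReLU slope of 0.1 here is an illustrative value, not necessarily the paper's exact α:

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    """Leaky ReLU: f(x) = max(alpha * x, x). A small negative slope
    keeps gradients flowing where a plain ReLU would saturate at 0."""
    return np.maximum(alpha * x, x)

def global_avg_pool(feature_maps):
    """Average each feature map over its spatial dimensions,
    collapsing (batch, height, width, channels) to (batch, channels)."""
    return feature_maps.mean(axis=(1, 2))

x = np.array([-2.0, 0.0, 3.0])
print(leaky_relu(x))                                    # [-0.2  0.  3.]
print(global_avg_pool(np.ones((2, 8, 8, 128))).shape)   # (2, 128)
```

Global average pooling leaves only one number per feature map, which keeps the final fully connected layer small and discourages the discriminator from memorizing spatial positions.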

5.4 Training Objective

Formulas essentially the same as the empirical cross entropy are used for both the D and G loss functions[1]:

    L_D = −(1/m) Σ_i log(1 − D(G(z_i))) − (1/n) Σ_j log D(x_j)    (1)

    L_G = −(1/m) Σ_i log D(G(z_i))    (2)

Where L_D is the discriminator loss function and L_G is the generator loss function. θ_D and θ_G are the respective neural network parameters. D(x) is the probability that D assigns to x being real. The G(z_i) are images generated by G for random normal inputs z_i. The x_j are a sample of natural images from P(X). The first sum of equation 1 is taken over fake images and penalizes high probabilities from D; the second term is taken over real images and penalizes low probabilities. m is the number of fake images in a batch and n is the number of real images in a batch.
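These cross-entropy losses are straightforward to compute directly; a minimal numpy sketch (the function names are illustrative):

```python
import numpy as np

def discriminator_loss(d_fake, d_real):
    """Empirical cross-entropy loss for D (equation 1 above):
    penalize high D outputs on fake images and low D outputs on
    real images. d_fake, d_real hold D's probabilities in (0, 1)."""
    return -np.mean(np.log(1.0 - d_fake)) - np.mean(np.log(d_real))

def generator_loss(d_fake):
    """Cross-entropy loss for G (equation 2 above): reward fakes
    that D scores as likely to be real."""
    return -np.mean(np.log(d_fake))

# At the game's equilibrium, D outputs 0.5 everywhere:
half = np.full(200, 0.5)
print(discriminator_loss(half, half))   # 2 * log(2) ≈ 1.386
```

The printed value is the well-known equilibrium loss of the original GAN objective: when D can do no better than guessing, each of its two terms contributes log 2.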

5.5 Training Procedure

A leapfrog gradient descent is used to find an equilibrium point of the GAN game. The following updates are iterated until convergence:

    θ_D ← Adam(θ_D, ∇_θD L_D)    with θ_G held constant

    θ_G ← Adam(θ_G, ∇_θG L_G)    with θ_D held constant

Where θ_G is a vector of generator neural network parameters, θ_D is a vector of discriminator parameters, and L_G and L_D are their respective loss functions. The arrow indicates application of the Adam accelerated gradient descent algorithm for the update[6]. The model is trained for 15,000 iterations with a batch size of 200 (200 fake and 200 real images).
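For reference, the Adam update used in each leapfrog step can be written in a few lines; the hyperparameters below are the common defaults from the Adam paper, not necessarily the settings used in this work, and the quadratic loss at the end is only a stand-in to exercise the optimizer:

```python
import numpy as np

class Adam:
    """Minimal Adam optimizer: momentum (m) and a running second
    moment (v), both bias-corrected, scale each gradient step."""
    def __init__(self, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.m = self.v = 0.0
        self.t = 0

    def step(self, theta, grad):
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2
        m_hat = self.m / (1 - self.b1 ** self.t)   # bias correction
        v_hat = self.v / (1 - self.b2 ** self.t)
        return theta - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

# In the leapfrog scheme, one such optimizer would drive θ_D and
# another θ_G. Here a single Adam minimizes a stand-in quadratic:
opt = Adam()
theta = 1.0
for _ in range(2000):
    theta = opt.step(theta, 2.0 * theta)   # gradient of theta**2
```

In the GAN setting, the gradient fed to each player's optimizer is computed with the other player's parameters frozen, exactly as in the alternating updates above.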

6 Results

Figure 4: Comparison of Real and Synthetic Image Patches
Figure 5: Interpolation in z space

See figure 4 for a full page comparison of real and synthetic images. Qualitatively, the synthetic T2 mode has captured the rough broken textures of the real patches, and the ADC mode correctly darkens the lesion centre. The Ktrans mode displays large coherent blobs similar to how they appear in real data; notice that bright areas are accompanied by matching darker regions in the ADC mode. This coherence across modes is a benefit of generating all modalities simultaneously.

For any random input z, G(z) should fool D with high probability. Thus the input space of G forms an implicit latent representation of prostate lesions. See figure 5 for an example of linear interpolation between two lesion images in z space. There is a smooth transition between the two lesion morphologies, demonstrating the high quality of the implicit latent representation.
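The interpolation itself is a straight line between two latent vectors; a minimal sketch, assuming the 25-dimensional z space described in section 5.2:

```python
import numpy as np

def interpolate_latents(z0, z1, steps=8):
    """Linear interpolation between two points in the generator's
    25-d input (z) space. Feeding each row to G would render a
    smooth morphing between the two corresponding lesion images."""
    t = np.linspace(0.0, 1.0, steps)[:, None]   # column of blend weights
    return (1.0 - t) * z0 + t * z1

rng = np.random.default_rng(0)
z0, z1 = rng.standard_normal(25), rng.standard_normal(25)
path = interpolate_latents(z0, z1)
print(path.shape)   # (8, 25)
```

That simple linear paths in z yield plausible intermediate lesions, rather than unrealistic blends, is what the text means by the latent representation being high quality.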

7 Disclosures

All included research has been independently self funded by the authors outside of the institutional system. No conflicts of interest, financial or otherwise, are declared by the authors.

8 Acknowledgments

The authors would like to acknowledge the organizers of the SPIE ProstateX Challenge 2016 for their hard work in organizing the competition and preparing the training data used in this work.