Medical image reconstruction with image-adaptive priors learned by use of generative adversarial networks

01/27/2020 ∙ by Sayantan Bhadra, et al. ∙ University of Illinois at Urbana-Champaign

Medical image reconstruction is typically an ill-posed inverse problem. To address such ill-posed problems, the prior distribution of the sought-after object property is usually incorporated by means of some sparsity-promoting regularization. Recently, prior distributions for images estimated using generative adversarial networks (GANs) have shown great promise in regularizing some of these image reconstruction problems. In this work, we apply an image-adaptive GAN-based reconstruction method (IAGAN) to reconstruct high-fidelity images from incomplete medical imaging data. It is observed that the IAGAN method can potentially recover fine structures in the object that are relevant for medical diagnosis but may be oversmoothed in reconstructions with traditional sparsity-promoting regularization.




1 Introduction

A linear discrete-to-discrete imaging system is considered [barrett2013foundations]:

$$g = Hf + n, \tag{1}$$

where $H$ denotes the $M \times N$ system matrix. The $M$-dimensional vectors $g$ and $n$ denote the measurement data and random noise, respectively. The $N$-dimensional vector $f$ represents a finite-dimensional approximation of the measured object’s property distribution. Image reconstruction methods seek to estimate the unknown object $f$ from the observed measurement data $g$. Such linear inverse problems are often ill-posed, e.g. when $M < N$, as in the case of compressed sensing MRI applications.
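The measurement model g = Hf + n can be made concrete with a small numerical sketch. The example below assumes a 1D object, a system matrix formed from randomly selected rows of a unitary DFT matrix, and complex Gaussian noise; the sizes, the random row selection and the noise level are illustrative assumptions rather than settings from this study.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 64                       # number of object coefficients (hypothetical size)
f = rng.standard_normal(N)   # stand-in for the discretized object

# Unitary DFT matrix; keeping only M < N rows emulates undersampled Fourier data.
F = np.fft.fft(np.eye(N), norm="ortho")
keep = np.sort(rng.choice(N, size=N // 4, replace=False))
H = F[keep, :]               # M x N system matrix
M = H.shape[0]

# Complex Gaussian noise with a hypothetical standard deviation.
n = 0.01 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
g = H @ f + n                # measurement model g = Hf + n

# rank(H) = M < N, so H has a non-trivial null space and g alone
# cannot determine f uniquely.
rank = np.linalg.matrix_rank(H)
print(M, N, rank)
```

Because the rank of H is smaller than the number of unknowns, infinitely many objects are consistent with the same data, which is why prior information is needed.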

To deal with such ill-posed problems, penalized least squares optimization problems are sometimes solved:

$$\hat{f} = \arg\min_{f} \; \|g - Hf\|_2^2 + \lambda R(f), \tag{2}$$

where $R(f)$ denotes the regularization or penalty term that encodes the prior on the object, and the hyperparameter $\lambda$ controls the strength of regularization. Sparsity-promoting penalties such as the $\ell_1$-norm of the wavelet transform or the total variation (TV) semi-norm are able to effectively regularize some of these ill-posed linear inverse problems [barrett2013foundations, rudin1992TV, lustig2007CS, tian2011low].
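As an illustration of a penalized least squares problem with a sparsity-promoting penalty, the sketch below minimizes the squared data misfit plus an ℓ1 penalty by iterative soft-thresholding (ISTA). It assumes the object is sparse in the identity basis rather than in a wavelet basis, and it uses a random Gaussian system matrix; the function names, problem sizes and the value of the regularization weight are all hypothetical choices made for the example.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def pls_objective(f, g, H, lam):
    """||g - Hf||_2^2 + lam * ||f||_1."""
    return np.sum((g - H @ f) ** 2) + lam * np.sum(np.abs(f))

def ista(g, H, lam, n_iter=500):
    # Step 1/L, where L = 2 * ||H||_2^2 is the Lipschitz constant
    # of the gradient of the data-fidelity term.
    step = 1.0 / (2.0 * np.linalg.norm(H, 2) ** 2)
    f = np.zeros(H.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * H.T @ (H @ f - g)
        f = soft_threshold(f - step * grad, step * lam)
    return f

rng = np.random.default_rng(0)
M, N = 40, 100                                  # fewer measurements than unknowns
H = rng.standard_normal((M, N)) / np.sqrt(M)
f_true = np.zeros(N)
f_true[rng.choice(N, size=5, replace=False)] = rng.standard_normal(5)
g = H @ f_true + 0.01 * rng.standard_normal(M)

lam = 0.05                                      # hypothetical regularization weight
f_hat = ista(g, H, lam)
print(pls_objective(np.zeros(N), g, H, lam), pls_objective(f_hat, g, H, lam))
```

With the step size fixed at 1/L the objective does not increase from one iteration to the next, which makes this a convenient baseline solver for small problems.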

However, hand-crafted sparsity-promoting penalties may not be able to comprehensively represent the prior knowledge of the sought-after object property. Recently, deep generative models such as generative adversarial networks (GANs) [goodfellow2014generative] have shown great promise in estimating the prior distributions for images. In the context of medical imaging, such learned priors can be used to perform image reconstruction from incomplete and/or noisy measurement data. Existing approaches use GANs to transform initial images obtained from undersampled measurements using conventional methods, e.g. zero-filled backprojection, into artifact-free images while maintaining data consistency [quan2017cyclicGANMRI, mardani2019gancs]. These methods combine learning the prior distribution and the reconstruction task during training. This requires the network to be retrained each time the data-acquisition parameters are changed. Bora et al. [bora2017csgm] proposed a method in which the training of the GAN and the reconstruction task can be treated separately. In this framework, known as Compressed Sensing using Generative Models (CSGM), a generative model is trained such that the generator learns to map from simple low-dimensional latent distributions (e.g. uniform or standard normal) to the high-dimensional object distribution. The distribution learned by the generator captures the prior knowledge over the object distribution. Subsequently, image reconstruction is performed by finding the latent vector for which the corresponding image in the object space agrees with the observed measurements. Therefore, the GAN needs to be trained only once to learn the prior that describes the object distribution, and the pre-trained generator can be used to reconstruct images from measurements obtained using imaging systems with different data-acquisition parameters.

Still, in practice, it is difficult for a GAN to span all possible images that may arise from the actual distribution. Hence, by constraining the reconstructed image to lie in the range of the generator in the CSGM framework, a potential lack of fidelity may be introduced between the reconstructed image and the observed measurements in the measurement space of the imaging operator H [barrett2013foundations]. In order to mitigate the problem of limited representation capabilities of a GAN, Hussein et al. [hussein2019IAGAN] proposed an image-adaptive GAN-based (IAGAN) reconstruction framework, where the trained generative model parameters are further tuned to be consistent with the observed measurement data. This results in a higher fidelity with the observed measurements while still maintaining the learned prior over the imaging object obtained by pre-training the GAN.

In this study, we investigate the application of the IAGAN formulation to image reconstruction in MRI. A state-of-the-art GAN called Progressive Growing of GANs (ProGAN) [karras2018PGGAN] was trained on the publicly available NYU fastMRI dataset, which contains knee MRI images and associated k-space measurements. The learned generative model was employed in the IAGAN framework to reconstruct images from highly subsampled k-space data belonging to a previously unseen validation dataset. It is demonstrated that by using an image-adaptive GAN-based reconstruction method with incomplete measurement data, we can obtain high-fidelity images and recover fine structures relevant for medical diagnosis, as compared with traditional regularized reconstruction methods that rely on sparsity-promoting penalties.

2 Image reconstruction with image-adaptive priors learned by use of Generative Adversarial Networks

2.1 Generative Adversarial Networks (GANs)

Generative Adversarial Networks (GANs) [goodfellow2014generative] are recent deep learning methods that have shown promising results in learning data distributions. In GANs, a generator network $G$ and a discriminator network $D$ are trained through an adversarial process. Here, a true object image $f$ is sampled from a data distribution $p_f$. The generator maps a random vector $z \sim p_z$ to a synthetic object image $\hat{f} = G(z; \theta_G)$, where $G(\cdot\,; \theta_G)$ is the mapping represented by a neural network with parameters $\theta_G$. The discriminator $D$ is an inference network, parameterized by $\theta_D$, that represents a mapping of the input image ($f$ or $\hat{f}$) to a real-valued scalar. In the adversarial process, $D$ is trained to maximally differentiate the synthetic image $\hat{f}$ from the true image $f$, and $G$ is trained to maximally fool $D$ such that the generated synthetic image $\hat{f}$ is wrongly classified as a true image. This adversarial process can be represented by a two-player minimax game with value function $V(D, G)$:

$$\min_{\theta_G} \max_{\theta_D} V(D, G) = \mathbb{E}_{f \sim p_f}\big[l\big(D(f; \theta_D)\big)\big] + \mathbb{E}_{z \sim p_z}\big[l\big(1 - D(G(z; \theta_G); \theta_D)\big)\big], \tag{3}$$

where $l(\cdot)$ represents a suitable objective function. When the global optimum of this minimax game is achieved, the synthetic images generated by the generator cannot be differentiated from the true images by any observer, and the synthetic image distribution $p_{\hat{f}}$ equals the true image distribution $p_f$: $p_{\hat{f}} = p_f$. Let $\theta_G^*$ and $\theta_D^*$ denote the optimal parameters for $G$ and $D$, respectively, after stable convergence has been reached in the above minimax game.
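To make the minimax value function concrete, the sketch below evaluates a Monte Carlo estimate of it from discriminator outputs. It assumes that the objective function is the logarithm, as in the original GAN formulation [goodfellow2014generative], and that the discriminator outputs lie strictly between 0 and 1; the function name and the toy output values are hypothetical.

```python
import numpy as np

def value_function(d_real, d_fake):
    """Estimate V(D, G) with l = log from batches of discriminator outputs.

    d_real: outputs D(f) on true images, values in (0, 1)
    d_fake: outputs D(G(z)) on synthetic images, values in (0, 1)
    """
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A discriminator that cannot tell true from synthetic images outputs 0.5 everywhere.
v_equilibrium = value_function(np.full(8, 0.5), np.full(8, 0.5))

# A confident and correct discriminator attains a larger value.
v_confident = value_function(np.full(8, 0.9), np.full(8, 0.1))

print(v_equilibrium, v_confident)
```

Under this choice of objective, the value at the global optimum is -2 log 2, which corresponds to a discriminator that assigns probability 0.5 to every input.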

2.1.1 Progressive Growing of GANs (ProGAN)

In practice, however, stabilizing GAN training is known to be difficult [deeplearningbook2016]. This is primarily due to the unstable nature of the adversarial learning process, in which the generator and the discriminator are trained simultaneously with competing objectives. This has been a bottleneck in using GANs to reliably generate high-resolution images. Recently, Karras et al. [karras2018PGGAN] proposed a training strategy that mitigates the stabilization problem to great effect and enables GANs to generate realistic natural images at resolutions as high as $1024 \times 1024$ pixels. In this learning strategy, called Progressive Growing of GANs (ProGAN), training starts from low-resolution images, and layers are added progressively to both the generator and the discriminator networks to increase the resolution (Fig. 1). Such a progressive training strategy has resulted in higher stability in the training process for GANs and improved quality in the generated images, making ProGAN the current state-of-the-art method for training GANs on image data.

Figure 1: ProGAN: Training starts with generator $G$ and discriminator $D$ corresponding to low spatial resolution of $4 \times 4$ pixels. As training progresses, layers are added to $G$ and $D$ to gradually increase the spatial resolution of the generated images towards the final resolution, which for our study is .
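The progressive schedule can be illustrated with a short sketch that lists the image side lengths visited during training, assuming (as in Karras et al. [karras2018PGGAN]) that training starts at 4 × 4 pixels and that each added stage doubles the resolution. The function name and the final resolution of 64 used in the example are hypothetical and are not the settings of this study.

```python
def resolution_schedule(final_res, start_res=4):
    """Side lengths visited when the resolution doubles at every stage."""
    for r in (start_res, final_res):
        # A positive integer is a power of two iff it has a single set bit.
        if r < 1 or r & (r - 1):
            raise ValueError("resolutions must be powers of two")
    if final_res < start_res:
        raise ValueError("final_res must be at least start_res")
    schedule = []
    res = start_res
    while res <= final_res:
        schedule.append(res)
        res *= 2
    return schedule

# 64 is a hypothetical final resolution chosen only for illustration.
stages = resolution_schedule(64)
print(stages)
```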

2.2 Image-adaptive GAN-based reconstruction (IAGAN) for medical imaging

Once a GAN has been stably trained and can generate images similar to samples from the true data distribution, the learned generator network can be used as a prior for solving linear inverse problems. In the context of medical imaging, the learned prior may be employed for reconstructing images from incomplete and/or corrupted measurement data. Given a pre-trained generator $G(\cdot\,; \theta_G^*)$ with trained parameters $\theta_G^*$, Bora et al. [bora2017csgm] proposed to find a vector $z$ in the latent space of the generator such that the corresponding image $G(z; \theta_G^*)$ agrees with the observed measurement data $g$. This leads to the following minimization problem:

$$\hat{z} = \arg\min_{z} \; \|g - H\,G(z; \theta_G^*)\|_2^2, \tag{4}$$

where the final reconstructed image is formed as $\hat{f} = G(\hat{z}; \theta_G^*)$. However, it is difficult in practice to train perfect generators, and as a result, the generative model may not be able to span the entire range of the actual object distribution. This may lead to a lack of fidelity between the reconstruction obtained and the observed measurements. To mitigate this problem, Hussein et al. [hussein2019IAGAN] proposed an image-adaptive GAN-based (IAGAN) reconstruction framework in which the image is constrained to lie in the range of $G$ while, at the same time, the trained generator’s parameters are further tuned to enforce consistency with the observed measurement data. With the parameters of $G$ now denoted by $\theta$ and initialized as $\theta_G^*$, $z$ and $\theta$ are jointly estimated as:

$$(\hat{z}, \hat{\theta}) = \arg\min_{z,\, \theta} \; \|g - H\,G(z; \theta)\|_2^2, \tag{5}$$

where the final reconstructed image is formed as $\hat{f} = G(\hat{z}; \hat{\theta})$. The authors also proposed to initialize the latent vector $z$ with the optimal solution obtained from solving Eq.(4) to better condition the optimization problem. Since the generator network is differentiable, any suitable stochastic gradient-based method may be applied to solve Eq.(5).
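The difference between Eq.(4) and Eq.(5) can be seen in a toy sketch. It assumes a linear "generator" G(z; W) = Wz in place of a trained deep network, noise-free data, and a true object that lies slightly outside the range of the pre-trained generator. The latent-only problem is solved by ordinary least squares, and the joint problem by plain gradient descent with step halving instead of a stochastic optimizer such as Adam. All names, sizes and step sizes are hypothetical.

```python
import numpy as np

def data_misfit(g, H, f):
    """Squared data-fidelity term ||g - Hf||_2^2."""
    return float(np.sum((g - H @ f) ** 2))

def csgm(g, H, W):
    """Eq.(4) for the toy generator G(z; W) = W z: optimize over z only."""
    z_hat, *_ = np.linalg.lstsq(H @ W, g, rcond=None)
    return z_hat

def iagan(g, H, z0, W0, n_iter=300, step=0.1):
    """Eq.(5) for the toy generator: joint descent over z and W."""
    z, W = z0.copy(), W0.copy()
    loss = data_misfit(g, H, W @ z)
    for _ in range(n_iter):
        r = H @ (W @ z) - g                  # residual in measurement space
        grad_z = 2.0 * W.T @ (H.T @ r)
        grad_W = 2.0 * np.outer(H.T @ r, z)
        t = step
        while t > 1e-12:                     # halve the step until the misfit drops
            z_new, W_new = z - t * grad_z, W - t * grad_W
            new_loss = data_misfit(g, H, W_new @ z_new)
            if new_loss < loss:
                break
            t *= 0.5
        else:
            break                            # no decreasing step found: stop
        z, W, loss = z_new, W_new, new_loss
    return z, W

rng = np.random.default_rng(0)
N, M, k = 32, 16, 4                          # object, data and latent dimensions
H = rng.standard_normal((M, N)) / np.sqrt(N)
W_pre = rng.standard_normal((N, k))          # stand-in for pre-trained parameters
# True object = in-range part + out-of-range perturbation.
f_true = W_pre @ rng.standard_normal(k) + 0.5 * rng.standard_normal(N)
g = H @ f_true

z_csgm = csgm(g, H, W_pre)
misfit_csgm = data_misfit(g, H, W_pre @ z_csgm)
z_ia, W_ia = iagan(g, H, z_csgm, W_pre)      # initialized from the Eq.(4) solution
misfit_iagan = data_misfit(g, H, W_ia @ z_ia)
print(misfit_csgm, misfit_iagan)
```

In this toy setting, the latent-only solution leaves a non-zero data misfit because the object is not in the range of the fixed generator, while adapting the generator parameters reduces that misfit. The sketch does not show how much of the learned prior survives the adaptation, which in practice depends on the initialization and the number of iterations.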

Additionally, regularization on the generative model in the form of a sparsity-promoting penalty may be added to the IAGAN framework to further mitigate artifacts resulting from data incompleteness and/or when the measurements contain a high level of noise. The optimization problem in Eq.(5) may be modified as follows:

$$(\hat{z}, \hat{\theta}) = \arg\min_{z,\, \theta} \; \|g - H\,G(z; \theta)\|_2^2 + \lambda R(f), \tag{6}$$

with $f = G(z; \theta)$ denoting the output of the generator for latent vector $z$ and parameters $\theta$, where $R(\cdot)$ is a suitable sparsity-promoting penalty function and $\lambda$ is a hyperparameter that controls the strength of regularization. In our experiments, we consider $R$ to be the total variation (TV) semi-norm [barrett2013foundations], and we refer to the method represented by Eq.(6) as IAGAN-TV. In this way, traditional sparsity-promoting regularization may be combined with GAN-constrained solutions, which may potentially enhance the quality of the reconstructed image as compared with using either method alone.
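For reference, one common discretization of the TV semi-norm used as the penalty in Eq.(6) is sketched below. It assumes the isotropic form with forward differences on a 2D real-valued image, evaluated only where both differences are defined; other discretizations (anisotropic, periodic boundaries) are equally valid, and the one used in this study is not specified here.

```python
import numpy as np

def total_variation(img):
    """Isotropic TV: sum over pixels of the forward-difference gradient magnitude."""
    dx = np.diff(img, axis=1)[:-1, :]   # horizontal differences, shape (H-1, W-1)
    dy = np.diff(img, axis=0)[:, :-1]   # vertical differences, shape (H-1, W-1)
    return float(np.sum(np.sqrt(dx ** 2 + dy ** 2)))

flat = np.ones((4, 4))                  # constant image: no variation
step = np.zeros((4, 4))
step[:, 2:] = 1.0                       # single vertical edge of unit height

print(total_variation(flat), total_variation(step))
```

A piecewise-constant image has a small TV value determined by the length and height of its edges, which is why TV regularization favors such images and can oversmooth fine texture.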

3 Numerical Studies

3.1 Imaging system

In this preliminary study, the MR imaging system contains a single coil, with the forward operator H corresponding to the discrete Fourier transform (DFT). Emulated single-coil (ESC) [tygertESC2019] data from multi-coil acquisitions were employed for the reconstruction experiments as a first step to demonstrate the proposed method without involving the complexity of multiple receiver coils.

3.2 Dataset

Zbontar et al. [fastMRI2018] recently released an open dataset called NYU fastMRI, which contains a large number of raw MR k-space measurements as well as clinical MR images of human knees. The dataset includes both raw multi-coil and the corresponding ESC k-space data for 1594 knee volumes across different standard clinical MR systems and pulse sequences. The volumes are split into separate training, validation, test and challenge datasets. In our study, the ProGAN was trained using images obtained from the training dataset, and the IAGAN reconstruction method was applied to retrospectively subsampled full k-space data from the validation dataset. The test and challenge datasets were not considered for this study because they do not contain full k-space measurements, and hence ground truth images that can serve as a gold-standard reference cannot be obtained for these datasets. The training images with full field-of-view (FOV) were obtained by performing the root sum-of-squares reconstruction method [roemerRSS1990] over the centrally cropped fully sampled region from the multi-coil k-space data in the training dataset. Excluding the first three noisy slices for each volume, this resulted in 31,823 training images with a spatial resolution of pixels. Each training image slice was normalized by the maximum intensity in the corresponding volume.
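The preparation of training images described above can be sketched as follows: a root sum-of-squares (RSS) combination of the coil images for each slice, followed by normalization by the maximum intensity in the volume. The sketch assumes centered k-space arrays of shape (coils, rows, columns), omits the central cropping and slice-exclusion steps, and uses random data as a stand-in for fastMRI measurements; the function names and array sizes are hypothetical.

```python
import numpy as np

def rss_reconstruction(kspace):
    """Root sum-of-squares image from centered multi-coil k-space (coils, ny, nx)."""
    shifted = np.fft.ifftshift(kspace, axes=(-2, -1))
    coil_images = np.fft.fftshift(np.fft.ifft2(shifted, norm="ortho"), axes=(-2, -1))
    return np.sqrt(np.sum(np.abs(coil_images) ** 2, axis=0))

def normalize_volume(volume):
    """Divide every slice by the maximum intensity found in the whole volume."""
    return volume / volume.max()

# Random stand-in data: 5 slices, 3 coils, 8 x 8 k-space samples per coil.
rng = np.random.default_rng(0)
shape = (5, 3, 8, 8)
kspace_volume = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

volume = np.stack([rss_reconstruction(k) for k in kspace_volume])
normalized = normalize_volume(volume)
print(normalized.shape, normalized.max())
```

Normalizing by the volume maximum rather than by each slice's own maximum preserves the relative intensities between slices of the same volume.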

3.3 ProGAN training details

For training the ProGAN, the code published by Karras et al. and implemented in TensorFlow [abadi2016tensorflow] was employed. The default settings for the training parameters were used for our experiments. The training was performed on a system with an Intel Xeon E5-2620v4 Central Processing Unit (CPU) @ 2.1 GHz and 4 NVIDIA TITAN X Graphics Processing Units (GPUs).

3.4 Baseline and reconstruction details

The IAGAN framework was employed to reconstruct images from undersampled ESC k-space data from a slice in a volume in the validation dataset. A volume obtained with the coronal proton density without fat suppression (CORPD) data-acquisition protocol was considered in our reconstruction experiments. The ground truth for comparing reconstructions was obtained by performing the inverse discrete Fourier transform (IDFT) of the fully sampled ESC k-space data. Variable-density Poisson-disc sampling [sparseMRI] with an acceleration factor of was used to undersample the full raw k-space data (Fig. 2). The sampling pattern was generated using the Berkeley Advanced Reconstruction Toolbox (BART) [BART]. Zero-filled (ZF) reconstruction refers to the IDFT of the k-space data zero-filled after subsampling; it contains severe aliasing artifacts. As a reference, we consider a penalized least squares solution with TV penalty (PLS-TV) as in Eq.(2), obtained by using BART. The regularization hyperparameter was chosen by performing a grid search and selecting the value that resulted in the lowest mean square error (MSE) of the reconstructed image with respect to the ground truth. Images were reconstructed with the CSGM, IAGAN and IAGAN-TV methods using TensorFlow [abadi2016tensorflow]. The Adam optimizer [kingma2014adam] was used to perform stochastic gradient descent for solving Eq.(4), Eq.(5) and Eq.(6), respectively.
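The zero-filled reconstruction and the MSE figure of merit described above can be sketched in a few lines. The example assumes a synthetic square phantom instead of a knee image and a uniformly random sampling mask with a fully sampled central block as a simple stand-in for the variable-density Poisson-disc pattern generated with BART; the helper names, the mask design and the acceleration value of 4 are hypothetical.

```python
import numpy as np

def fft2c(x):
    """Centered, unitary 2D DFT."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(x), norm="ortho"))

def ifft2c(k):
    """Centered, unitary 2D inverse DFT."""
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(k), norm="ortho"))

def random_mask(shape, accel, center=8, seed=0):
    """Random mask keeping about 1/accel of the samples plus a full central block."""
    rng = np.random.default_rng(seed)
    mask = rng.random(shape) < 1.0 / accel
    cy, cx = shape[0] // 2, shape[1] // 2
    half = center // 2
    mask[cy - half:cy + half, cx - half:cx + half] = True
    return mask

def mse(a, b):
    """Mean square error between two (possibly complex) images."""
    return float(np.mean(np.abs(a - b) ** 2))

ground_truth = np.zeros((32, 32))
ground_truth[10:22, 12:20] = 1.0             # simple rectangular phantom

kspace = fft2c(ground_truth)                 # fully sampled k-space
mask = random_mask(ground_truth.shape, accel=4)
zero_filled = ifft2c(kspace * mask)          # unsampled entries are set to zero

print(mse(ifft2c(kspace), ground_truth), mse(zero_filled, ground_truth))
```

The same `mse` function could be evaluated for each candidate regularization weight in a grid search, keeping the weight with the lowest error with respect to the ground truth.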

MR images have both magnitude and phase components. Since the reconstructions are not being performed directly from multi-coil acquisitions but rather from the corresponding emulated single-coil data, the phase information cannot be recovered with reasonable accuracy. Hence, for this preliminary study, the phase information from the fully sampled ESC k-space data was retrospectively added into all the considered reconstruction methods with subsampled ESC data.
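One possible reading of this phase-insertion step is sketched below: the magnitude of a reconstructed image is combined pixel by pixel with the phase of the fully sampled reference image. This is an assumption about how the step could be carried out, not a description of the implementation used in this study; the function name and the random test arrays are hypothetical.

```python
import numpy as np

def attach_reference_phase(magnitude, reference):
    """Combine a magnitude image with the pixel-wise phase of a reference image."""
    return np.abs(magnitude) * np.exp(1j * np.angle(reference))

rng = np.random.default_rng(0)
reference = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
magnitude = rng.random((8, 8)) + 0.1         # strictly positive magnitude image

complex_image = attach_reference_phase(magnitude, reference)
print(complex_image.dtype, complex_image.shape)
```

The resulting complex image has the reconstructed magnitude and the reference phase, so it can be passed through the Fourier forward operator when data consistency is evaluated.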

Figure 2: Variable-density Poisson-disc sampling pattern in k-space with acceleration factor . The central region of the k-space is fully sampled.

4 Results

4.1 ProGAN training results

After the ProGAN was trained, the images produced by the generator highly resembled the true knee images in the training dataset. For visual comparison, samples of images from the training dataset as well as samples of images generated by the ProGAN after training, cropped to the central region of interest (ROI), are shown in Fig. 3.

Figure 3: (Top) Samples from knee MRI images in the training dataset. (Bottom) Samples from generated knee MRI images after training with the ProGAN.

4.2 Reconstruction from undersampled validation k-space data using IAGAN

Images reconstructed by use of the IAGAN and the IAGAN-TV methods are compared with the baseline PLS-TV algorithm as well as the CSGM method in Fig. 4.

Figure 4: (Top row) from left to right: Ground truth, ZF reconstruction, PLS-TV reconstruction, CSGM reconstruction, IAGAN reconstruction and IAGAN-TV reconstruction. The meniscal region indicated by the green bracket in the ground truth image is expanded for each reconstruction result at the bottom right corner of each image. (Bottom row) from left to right: Error maps for ZF, PLS-TV, CSGM, IAGAN and IAGAN-TV reconstructions, respectively.

From the results, it can be observed that the IAGAN reconstruction contains fine details in regions that are of potential interest for medical diagnosis and retains the bone texture to a large degree, as compared with the PLS-TV method, where these relevant features could not be reliably recovered. This is further highlighted by an expanded view of the lateral meniscus for each reconstructed image. Radiologists rely on a clear view of the meniscal region in order to detect tears in the knee [vohra2011knee]. However, the meniscus appears oversmoothed in the PLS-TV reconstruction, while in the IAGAN reconstruction it remains sharp with discernible fine features. On the other hand, the image produced by the CSGM method contains detailed features and texture information similar to knee images in the training data, but lacks fidelity with the observed measurements. The IAGAN-TV reconstruction regularizes the IAGAN solution and can remove some of the grain-like artifacts that may appear due to data incompleteness. The error maps with respect to the ground truth further illustrate the points above.

5 Conclusion

This study demonstrates the use of an image-adaptive GAN-based (IAGAN) algorithm to reconstruct high-fidelity MR images from noisy and/or incomplete measurement data. A ProGAN was trained on a publicly available knee MRI dataset, and the learned generator was employed in the IAGAN framework to reconstruct images from emulated single-coil raw data. Reconstructed images illustrate that the IAGAN method can recover fine features with diagnostic relevance that may be oversmoothed by traditional sparsity-based reconstruction methods. Moreover, the IAGAN algorithm maintains data consistency, which is critical for medical diagnosis, while the CSGM method fails to maintain fidelity with the observed measurements. In the context of MRI, it will be important to extend the current implementation to reconstruct images from subsampled multi-coil raw data, as well as to investigate the impact of IAGAN-based reconstruction across different patient volumes, slices and data-acquisition protocols. Comparison of the IAGAN method with other recent deep learning-based reconstruction solutions for accelerated MRI [yang2016admmnet, hammernik2018vnn], as well as with existing GAN-based reconstruction approaches [quan2017cyclicGANMRI, mardani2019gancs], will be studied in the future. Further, the IAGAN may be implemented in other imaging modalities such as low-dose x-ray CT, where such a GAN-constrained reconstruction method may help to reduce the patient’s exposure to ionizing radiation. Finally, there remains a need to perform reader studies as well as quantitative analysis of the IAGAN-based reconstruction method.


This work is original and has not been submitted for publication or presentation elsewhere.