Magnetic resonance imaging (MRI) is a widely used non-invasive medical imaging modality that provides excellent soft tissue contrast as well as high resolution structural information. Its most significant drawback is the long acquisition time, as the raw data are acquired sequentially in k-space, which contains the spatial-frequency information. This slow imaging speed can cause patient discomfort and can introduce artefacts due to patient movement.
Compressive sensing (CS) can be used to accelerate the MRI acquisition process by undersampling the k-space data. CS-MRI reconstruction is an ill-posed inverse problem. Conventional CS-MRI frameworks assume prior information on the structure of MR images, making use of predefined sparsifying transforms such as the discrete wavelet transform and the discrete cosine transform to render the reconstruction problem well-posed. Instead of using predefined transforms, the sparse representation can also be learnt from the data itself, i.e. dictionary learning (DLMRI). These frameworks, however, suffer from the long computation time of iterative optimization, as well as from the assumption of sparse signals, which may not fully capture fine details.
Bora et al. have shown that, instead of relying on the sparsity model, the CS signal can be recovered using pretrained generative models, with an iterative optimization yielding the reconstructed signal. Another deep learning based approach was introduced by Yang et al., where the alternating direction method of multipliers is used to train a network (DeepADMM) for CS-MRI reconstruction.
Recent works [21, 13] demonstrate the application of generative adversarial networks (GANs) to CS-MRI reconstruction. In these works, a large set of CS-MR images and their fully sampled counterparts is used to train the GAN model, facilitating the extraction of the prior information required to solve the reconstruction problem. The trained model is then used to reconstruct a new CS-MR image in a very short time.
However, these state-of-the-art methods often lack structural information in the reconstructed output. Super resolution (SR) is another well-known inverse problem, which tries to recover both low frequency and high frequency components from a low resolution image. We borrow ideas from the SR problem and incorporate them into the CS-MRI reconstruction problem to achieve robust generation with subtle structural information. The main contributions of this work are:
- We propose the use of a patchGAN discriminator for better reconstruction of the high frequency content in MR images.
- To stabilize the training of the GAN, we make use of the Wasserstein loss.
- To make the reconstruction robust to noise, we propose the use of noisy images for data augmentation when training our GAN model.
The acquisition model for the CS-MRI reconstruction problem in the discrete domain can be described as:

y = F_u x + n,

where x ∈ C^N is a vector formed by the pixel values of the desired image, y ∈ C^M denotes the observation vector, and n ∈ C^M is the noise vector. C denotes the set of complex numbers, and F_u is an operator describing the process of random undersampling in k-space. Given an observation vector y, the reconstruction problem is to find the corresponding x, considering n to be a non-zero vector. We choose to solve this reconstruction problem using a GAN model.
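As a rough illustrative sketch (not the paper's implementation), the acquisition model and the zero-filled reconstruction can be emulated in NumPy with a row-wise binary mask in k-space; the function names and the random mask used here are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def undersample_kspace(image, keep_fraction=0.2):
    """Toy undersampling of k-space rows: y = F_u x (noise n omitted).

    A binary mask keeps roughly `keep_fraction` of the k-space lines.
    """
    kspace = np.fft.fftshift(np.fft.fft2(image))        # F x
    mask = rng.random(image.shape[0]) < keep_fraction   # which lines survive
    y = kspace * mask[:, None]                          # F_u x
    return y, mask

def zero_filled_recon(y):
    """ZFR: apply the Hermitian (inverse) transform to the masked k-space."""
    return np.fft.ifft2(np.fft.ifftshift(y))            # x_u = F_u^H y

img = rng.random((64, 64))
y, mask = undersample_kspace(img)
x_u = zero_filled_recon(y)
```

The zero-filled result x_u is complex valued and aliased, which is why it serves only as the conditioning input for the reconstruction network.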
A GAN model comprises a generator G and a discriminator D, where the generator tries to fool the discriminator by transforming an input vector z to the distribution of the true data p_data. The discriminator, on the other hand, attempts to distinguish samples of p_data from generated samples G(z).
We incorporate the conditional GAN (cGAN) based framework in our study. The model is conditioned on the aliased zero-filled reconstruction (ZFR) x_u, given by x_u = F_u^H y, where (·)^H denotes the Hermitian operator. Instead of using a binary cross-entropy based adversarial loss for training the cGAN model, we use the Wasserstein loss. This helps stabilize the training process of standard GANs, which suffer from saturation and the resulting vanishing gradients. Mathematically, the cGAN model with the Wasserstein loss solves the following optimization problem:

min_G max_D  E[ D(x | x_u) ] − E[ D(G(x_u) | x_u) ],

where E denotes the expectation over a batch of images.
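A minimal sketch of the Wasserstein losses and of the weight clipping used to enforce the critic's Lipschitz constraint; the function names are ours, and in practice these quantities operate on critic outputs inside the training loop:

```python
import numpy as np

def wasserstein_critic_loss(d_real, d_fake):
    # The critic maximizes E[D(x)] - E[D(G(x_u))];
    # written as a loss to be minimized, that is its negative.
    return -(np.mean(d_real) - np.mean(d_fake))

def wasserstein_generator_loss(d_fake):
    # The generator maximizes E[D(G(x_u))], i.e. minimizes its negative.
    return -np.mean(d_fake)

def clip_weights(w, c=0.05):
    # Weight clipping keeps the critic approximately Lipschitz;
    # c = 0.05 matches the clipping threshold reported in Sec. 3.1.
    return np.clip(w, -c, c)
```

Unlike the binary cross-entropy loss, these losses do not saturate, which is what avoids vanishing gradients when the critic becomes confident.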
Fig. 1 shows the generator architecture of the proposed model, which is inspired by ESRGAN. The architecture is based on a U-net, which consists of several encoders and corresponding decoders. Each encoder is a convolutional layer that decreases the spatial size and increases the number of feature maps. Each decoder consists of a transposed convolutional layer that increases the size of the feature maps. Skip connections transfer the features of a given size from each encoder to the corresponding decoder. Instead of obtaining smaller feature maps through additional encoders (and decoders), the proposed architecture places residual-in-residual dense blocks (RRDBs) at the bottom of the U-net. Each RRDB consists of dense blocks, with residual connections at two levels: across each dense block, and across all the dense blocks in one RRDB, as shown in Fig. 1. The output of each dense block is scaled by a constant β before it is added to the identity mapping. Residual connections make the effective depth of the network variable, thereby making identity mappings easier to learn and avoiding vanishing gradients in the shallower layers. Dense connections allow the transfer of feature maps to deeper layers, increasing the variety of accessible information. Throughout this network, batch normalization (BN) and leaky rectified linear unit (ReLU) activation are applied after each convolutional layer. At the output, a hyperbolic tangent activation is used.
The discriminator is a convolutional neural network with 11 layers, each comprising a convolution followed by BN and leaky ReLU activation. In order to improve the reconstruction of high frequency details, a patchGAN framework is incorporated: it scores each patch of the image separately and gives the average score as the final output.
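A patchGAN critic thus outputs a grid of per-patch scores rather than a single scalar; averaging that grid, as in this minimal illustration of ours, produces the final output, so every patch contributes to the decision:

```python
import numpy as np

def patchgan_score(score_map):
    """Reduce a grid of per-patch critic scores to one scalar.

    `score_map` is the 2-D output of the convolutional critic, where each
    entry judges one receptive-field patch of the input image.
    """
    return float(np.mean(score_map))
```

Because each score has a limited receptive field, the critic penalizes local high-frequency artefacts that a whole-image score could average away.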
In order to reduce the pixel-wise difference between the generated image and the corresponding ground truth (GT) image, a mean absolute error (MAE) based loss is incorporated while training the generator. It is given by:

L_MAE = E[ ||x − G(x_u)||_1 ],

where ||·||_1 denotes the l1 norm. Since the human visual system is sensitive to structural distortions in images, it is important to preserve the structural information in MR images, which is crucial for clinical analysis. In order to improve the reconstruction of fine textural details in the images, a mean SSIM (MSSIM) based loss is also incorporated, as follows:

L_MSSIM = E[ 1 − MSSIM(G(x_u), x) ],

where MSSIM is the average SSIM over the M patches of the image:

MSSIM(x1, x2) = (1/M) Σ_{j=1}^{M} SSIM(p_j, q_j),

and SSIM is calculated as follows:

SSIM(p, q) = (2 μ_p μ_q + C1)(2 σ_pq + C2) / [(μ_p^2 + μ_q^2 + C1)(σ_p^2 + σ_q^2 + C2)],

where p and q represent two patches, μ and σ^2 denote the mean and variance, respectively, σ_pq denotes the covariance between p and q, and C1 and C2 are small constants to avoid division by zero.
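The MSSIM computation can be sketched in NumPy as follows; this illustrative version of ours uses global statistics over non-overlapping patches rather than the Gaussian-windowed SSIM of the original formulation:

```python
import numpy as np

# Standard stabilizing constants for images with dynamic range 1.0
C1, C2 = (0.01 * 1.0) ** 2, (0.03 * 1.0) ** 2

def ssim(p, q):
    """SSIM between two equally sized patches (global patch statistics)."""
    mu_p, mu_q = p.mean(), q.mean()
    var_p, var_q = p.var(), q.var()
    cov_pq = ((p - mu_p) * (q - mu_q)).mean()
    num = (2 * mu_p * mu_q + C1) * (2 * cov_pq + C2)
    den = (mu_p ** 2 + mu_q ** 2 + C1) * (var_p + var_q + C2)
    return num / den

def mssim(x, y, patch=8):
    """Mean SSIM over non-overlapping patches of two images."""
    scores = [ssim(x[i:i + patch, j:j + patch], y[i:i + patch, j:j + patch])
              for i in range(0, x.shape[0], patch)
              for j in range(0, x.shape[1], patch)]
    return float(np.mean(scores))
```

For identical images the numerator and denominator coincide, so MSSIM equals 1; any structural distortion pushes it below 1, which is what the 1 − MSSIM loss term penalizes.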
The overall loss for training the generator is given by:

L_G = λ1 L_adv + λ2 L_MAE + λ3 L_MSSIM,

where λ1, λ2, and λ3 are the weighting factors for the adversarial, MAE, and MSSIM loss terms, respectively.
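A sketch of how the loss terms combine; the default lambda values below are placeholders, not the paper's settings:

```python
import numpy as np

def mae_loss(x, x_hat):
    # L_MAE: mean absolute (l1) difference between GT and reconstruction
    return float(np.mean(np.abs(x - x_hat)))

def total_generator_loss(l_adv, l_mae, l_mssim_term, lam1=1.0, lam2=1.0, lam3=1.0):
    # Weighted sum of the adversarial, MAE, and (1 - MSSIM) terms.
    # lam1..lam3 stand for the paper's lambda weighting factors.
    return lam1 * l_adv + lam2 * l_mae + lam3 * l_mssim_term
```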
3 Results and Discussion
3.1 Training settings
In this work, a 1-D Gaussian mask is used for undersampling the k-space. Since the ZFR is complex valued, its real and imaginary components are concatenated and passed to the generator as a two-channel real valued input. The batch size is set as 32. The discriminator is updated three times before every generator update, and the threshold for weight clipping is 0.05. The growth rate of the dense blocks is set as 32, the residual scaling factor β is 0.2, and 12 RRDBs are used; the number of filters in the last layer of each RRDB is 512. The Adam optimizer is used for training, with separate learning rates for the generator and the discriminator, and fixed weighting factors λ1, λ2, and λ3.
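One plausible way to build such a 1-D Gaussian mask is to sample phase-encoding lines with a probability density peaked at the k-space centre; this is our own sketch, and the exact density and parameters used in the paper may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_1d_mask(n_lines, keep_fraction, sigma_fraction=0.15):
    """Select k-space lines with Gaussian-weighted probability.

    Low frequencies (near the centre) are sampled densely, high
    frequencies sparsely; roughly `keep_fraction` of lines are kept.
    """
    centre = n_lines / 2
    sigma = sigma_fraction * n_lines
    weights = np.exp(-0.5 * ((np.arange(n_lines) - centre) / sigma) ** 2)
    probs = weights / weights.sum()
    n_keep = int(round(keep_fraction * n_lines))
    keep = rng.choice(n_lines, size=n_keep, replace=False, p=probs)
    mask = np.zeros(n_lines, dtype=bool)
    mask[keep] = True
    return mask
```

Concentrating samples near the k-space centre preserves most of the image energy, while the sparsely sampled periphery is what the network must learn to fill in.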
3.2 Data details
For the purpose of training and testing, T1-weighted MR images of the brain from the MICCAI 2013 grand challenge dataset are used. In order to make the reconstructed output robust to noise, data augmentation is carried out using images with 10% and 20% additive Gaussian noise. To build the training set, 19,797 images are randomly taken from the training set of the aforementioned dataset. Of these, noise is added to 6,335 images, while the remaining 13,462 images are used without any noise. In addition, 990 images are chosen from the 13,462 noise-free images and noisy copies of them are added as well, giving a total of 20,787 training images. Among the noisy images, the numbers of images with 10% and 20% noise are equal. Thus, the set contains 64.76% noise-free images, 30.48% noisy images whose noise-free counterparts are not present in the training set, and 4.76% noisy images whose noise-free counterparts are present. For testing, 2000 images are chosen randomly from the test set of the dataset. The tests are conducted in three stages: using noise-free images, using images with 10% added noise, and using images with 20% added noise.
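One plausible reading of "10% additive Gaussian noise" is zero-mean noise with a standard deviation equal to 10% of the image's own standard deviation; the sketch below makes that assumption explicit and is not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(image, level):
    """Additive zero-mean Gaussian noise.

    `level` = 0.1 or 0.2 for the '10%' and '20%' settings, interpreted
    here (an assumption) as a fraction of the image standard deviation.
    """
    noise = rng.normal(0.0, level * image.std(), size=image.shape)
    return image + noise

img = rng.random((32, 32))
noisy = add_gaussian_noise(img, 0.1)
```

Mixing clean images, noisy-only images, and clean/noisy pairs, as described above, exposes the network to noise without letting it overfit to any one noise level.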
Table 1 summarizes the quantitative results of an ablation study on the effect of adding various components to the model. These results are reported for images in which 20% of the raw k-space samples are retained. For all four cases, the generator is trained with the same loss settings. In the first case, the GAN model comprises a U-net generator and a patchGAN discriminator, with BN present throughout the network. In the subsequent cases, the use of RRDBs, and the addition of BN to the RRDBs, result in significant improvements in peak signal to noise ratio (PSNR). The use of data augmentation with noisy images, in the fourth case, yields significantly better quantitative results for the reconstruction of noisy images compared to the previous three cases.
| Images (PSNR (dB) / MSSIM) | Case 1 | Case 2 | Case 3 | Case 4 |
|---|---|---|---|---|
| BN in RRDBs | ✗ | ✗ | ✓ | ✓ |
| Noise-free | 40.45 / 0.9865 | 41.39 / 0.9810 | 41.88 / 0.9829 | 42.31 / 0.9841 |
| 10% noise added | 38.25 / 0.9641 | 38.03 / 0.9624 | 38.03 / 0.9620 | 39.80 / 0.9751 |
| 20% noise added | 33.98 / 0.9217 | 34.01 / 0.9210 | 33.78 / 0.9180 | 37.56 / 0.9619 |
The qualitative results of the proposed method are shown in Fig. 2(b-e) for 20% undersampling and Fig. 2(f-i) for 30% undersampling. It can be seen that the proposed method is able to reconstruct the structural content in the image, including many fine details, successfully. This is also indicated by the quantitative results shown in Table 1. Also, the contrast of the reconstructed image looks very similar to that of the GT. The reconstruction results for noisy inputs, as well as their differences with the corresponding GT, indicate the robustness of the model.
| Method | Noise-free (PSNR (dB) / MSSIM) | 10% noise added (PSNR (dB)) | 20% noise added (PSNR (dB)) |
|---|---|---|---|
| ZFR* | 35.61 / 0.7735 | 24.42 | 18.56 |
| DLMRI* | 38.24 / 0.8020 | 24.52 | 18.85 |
| Noiselet* | 39.01 / 0.8588 | 25.29 | 19.42 |
| BM3D* | 39.93 / 0.9125 | 24.71 | 18.65 |
| DeepADMM* | 37.36 / 0.8534 | 25.19 | 19.33 |
| DAGAN* | 40.20 / 0.9681 | 37.40 | 34.23 |
| Proposed | 46.88 / 0.9943 | 42.34 | 39.49 |
*PSNR and MSSIM values are taken from the published results of the respective methods.
Table 2 shows the comparison of the proposed method with ZFR and state-of-the-art methods such as DLMRI, Noiselet, BM3D, DeepADMM, and DAGAN. These results are reported for images in which 30% of the k-space data is retained. Both the PSNR and the MSSIM of the proposed method are significantly better than those of the previous methods. The comparison of PSNR values for noisy images shows that the proposed method is highly robust to noise. Moreover, the reconstruction time of the proposed method is 9.06 ms per image on a GPU, which can facilitate real-time reconstruction of MR images.
We also tested the model trained on MR images of the brain to reconstruct MR images of canine legs from the MICCAI 2013 challenge. Fig. 3 shows the results of this zero-shot inference for images in which 20% of the k-space data is retained. Though no images of canine legs were used for training, the model is able to faithfully reconstruct most of the structural content, and is able to achieve average PSNR and MSSIM values of 41.28 dB and 0.9788, respectively, for 2000 test images.
In this paper, a novel GAN based framework has been presented for CS-MRI reconstruction. The use of RRDBs in a U-net based generator architecture increases the amount of information available to the network. In order to preserve the high frequency content as well as the structural details in the reconstructed output, a patchGAN discriminator and an SSIM based loss have been incorporated. The use of noisy images during training makes the reconstruction results highly robust to noise. The proposed method outperforms the state-of-the-art methods while maintaining the feasibility of real-time reconstruction. In the future, we plan to analyze the performance of the proposed model for different k-space sampling patterns. In order to further improve the reconstruction time, we plan to work on lightweight architectures. Further work may also be carried out on devising regularization terms that help preserve the finest details in the reconstructed output.
References

- Wasserstein generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, Vol. 70, Sydney, Australia, pp. 214–223, 2017.
- Compressed sensing using generative models. In Proceedings of the 34th International Conference on Machine Learning, Vol. 70, pp. 537–546, 2017.
- Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3(1), pp. 1–122, 2011.
- Compressed sensing. IEEE Transactions on Information Theory 52(4), pp. 1289–1306, 2006.
- Decoupled algorithm for MRI reconstruction using nonlocal block matching model: BM3D-MRI. Journal of Mathematical Imaging and Vision 56(3), pp. 430–440, 2016.
- Generative adversarial nets. In Advances in Neural Information Processing Systems 27, pp. 2672–2680, 2014.
- Compressive sensing principles and iterative sparse recovery for inverse and ill-posed problems. Inverse Problems 26(12), 125012, 2010.
- Image-to-image translation with conditional adversarial networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5967–5976, 2017.
- Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR), 2015.
- MICCAI 2013 Diencephalon standard challenge.
- Balanced sparse model for tight frames in compressed sensing magnetic resonance imaging. PLOS ONE 10(4), pp. 1–19, 2015.
- Sparse MRI: the application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine 58(6), pp. 1182–1195, 2007.
- Deep generative adversarial neural networks for compressive sensing MRI. IEEE Transactions on Medical Imaging 38(1), pp. 167–179, 2019.
- Conditional generative adversarial nets. arXiv:1411.1784, 2014.
- Multichannel compressive sensing MRI using noiselet encoding. PLOS ONE 10(5), pp. 1–27, 2015.
- Iterative thresholding compressed sensing MRI based on contourlet transform. Inverse Problems in Science and Engineering 18(6), pp. 737–758, 2010.
- MR image reconstruction from highly undersampled k-space data by dictionary learning. IEEE Transactions on Medical Imaging 30(5), pp. 1028–1041, 2011.
- U-Net: Convolutional networks for biomedical image segmentation. arXiv:1505.04597, 2015.
- ESRGAN: Enhanced super-resolution generative adversarial networks. In The European Conference on Computer Vision (ECCV) Workshops, pp. 63–79, 2018.
- Fast compressive sensing recovery using generative models with structured latent variables. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2967–2971, 2019.
- DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction. IEEE Transactions on Medical Imaging 37(6), pp. 1310–1321, 2018.
- Deep ADMM-Net for compressive sensing MRI. In Advances in Neural Information Processing Systems 29, pp. 10–18, 2016.
- Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing 13(4), pp. 600–612, 2004.
- Loss functions for image restoration with neural networks. IEEE Transactions on Computational Imaging 3(1), pp. 47–57, 2017.