1 Introduction
Magnetic resonance imaging (MRI) is a commonly used noninvasive medical imaging modality that provides excellent soft-tissue contrast as well as high-resolution structural information. The most significant drawback of MRI is its long acquisition time, as the raw data are acquired sequentially in k-space, which contains the spatial-frequency information. This slow imaging speed can cause patient discomfort and introduce artefacts due to patient movement.
Compressive sensing (CS) [4] can be used to accelerate the MRI acquisition process by undersampling the k-space data. Reconstruction of CS-MRI is an ill-posed inverse problem [7]. Conventional CS-MRI frameworks assume prior information on the structure of MRI by making use of predefined sparsifying transforms, such as the discrete wavelet transform and the discrete cosine transform, to render the reconstruction problem well-posed [12]. Instead of using predefined transforms, the sparse representation can be learnt from the data itself, i.e. via dictionary learning (DLMRI) [17]. These frameworks, however, suffer from the long computation time of iterative optimization [16], as well as from the assumption of sparse signals [12], which might not fully capture the fine details [11].
Bora et al. [2] have shown that, instead of using the sparsity model, the CS signal can be recovered using pre-trained generative models, where an iterative optimization yields the reconstructed signal. Another deep-learning-based approach was introduced by Yang et al. [22], where the alternating direction method of multipliers [3] is used to train a network (Deep-ADMM) for CS-MRI reconstruction. Recent works [21, 13] demonstrate the application of generative adversarial networks (GANs) [6] to CS-MRI reconstruction. In these works, training the GAN model on a large set of CS-MR images and their fully sampled counterparts facilitates the extraction of the prior information required to solve the reconstruction problem [20]. The trained model is then used to reconstruct a new CS-MR image in a very short time.
However, these state-of-the-art methods often lack structural information in the reconstructed output. Super-resolution (SR) is another well-known inverse problem, which aims to interpolate both the low-frequency and high-frequency components from a low-resolution image. We borrow some ideas from the SR problem and incorporate them into the CS-MRI reconstruction problem to achieve robust generation with subtle structural information. The main contributions of this work are:

We propose the use of a patch-GAN discriminator [8] for better reconstruction of the high-frequency content in MR images.

To stabilize the training of the GAN, we make use of the Wasserstein loss [1].

In order to make the reconstruction robust to noise, we propose the use of noisy images for data augmentation to train our GAN model.
2 Methodology
The acquisition model for the CS-MRI reconstruction problem in the discrete domain can be described as:

$$y = F_u x + \eta \qquad (1)$$

where $x \in \mathbb{C}^N$ is a vector formed by the pixel values in the desired image, $y \in \mathbb{C}^M$ denotes the observation vector, and $\eta \in \mathbb{C}^M$ is the noise vector. $\mathbb{C}$ denotes the set of complex numbers. $F_u$ is an operator which describes the process of random undersampling in k-space. Given an observation vector $y$, the reconstruction problem is to find the corresponding $x$, considering $\eta$ to be a nonzero vector. We choose to solve this reconstruction problem using a GAN model. A GAN model comprises a generator $G$ and a discriminator $D$: the generator tries to fool the discriminator by transforming an input vector to the distribution of the true data, while the discriminator attempts to distinguish true samples from generated samples.
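As a concrete illustration, the acquisition model $y = F_u x + \eta$ can be simulated in NumPy with a random row-wise undersampling mask. This is a minimal sketch; the mask pattern, noise model, and function names here are illustrative, not the exact ones used in our experiments.

```python
import numpy as np

def undersample_kspace(image, keep_fraction=0.3, noise_std=0.0, seed=0):
    """Simulate y = F_u x + eta: full 2D FFT, additive complex Gaussian
    noise, then random row-wise (1D) undersampling of k-space."""
    rng = np.random.default_rng(seed)
    kspace = np.fft.fft2(image)                       # full k-space of x
    n_rows = image.shape[0]
    kept = rng.choice(n_rows, size=int(keep_fraction * n_rows), replace=False)
    mask = np.zeros(image.shape, dtype=bool)
    mask[kept, :] = True                              # sample whole k-space rows
    noise = noise_std * (rng.standard_normal(kspace.shape)
                         + 1j * rng.standard_normal(kspace.shape))
    return (kspace + noise) * mask, mask              # y, with unsampled rows zeroed

def zero_filled_recon(y):
    """ZFR x_u = F_u^H y: inverse FFT of the zero-filled k-space."""
    return np.fft.ifft2(y)

x = np.random.rand(64, 64)            # stand-in for a ground-truth image
y, mask = undersample_kspace(x, keep_fraction=0.3)
x_u = zero_filled_recon(y)            # aliased zero-filled reconstruction
```

The zero-filled reconstruction `x_u` exhibits the aliasing artefacts that the generator is trained to remove.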
We incorporate the conditional GAN (cGAN) framework [14] in our study. The model is conditioned on the aliased zero-filled reconstruction (ZFR) $x_u = F_u^H y$, where $(\cdot)^H$ denotes the Hermitian operator. Instead of using a binary cross-entropy based adversarial loss for training the cGAN model, we use the Wasserstein loss. This helps stabilize the training process of standard GANs, which suffer from saturation resulting in vanishing gradients. Mathematically, the cGAN model with the Wasserstein loss solves the following optimization problem:
$$\min_G \max_D \; \mathbb{E}\big[D(x \mid x_u)\big] - \mathbb{E}\big[D(G(x_u) \mid x_u)\big] \qquad (2)$$

where $\mathbb{E}[\cdot]$ denotes the expectation over a batch of images.
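In practice, the Wasserstein objective reduces to simple means over the critic's scores on real and generated samples. The sketch below shows the two training losses derived from it; the function names are illustrative.

```python
import numpy as np

def wasserstein_d_loss(d_real, d_fake):
    """Critic loss: the critic maximizes E[D(x|x_u)] - E[D(G(x_u)|x_u)],
    so as a minimization objective the sign is flipped."""
    return -(np.mean(d_real) - np.mean(d_fake))

def wasserstein_g_loss(d_fake):
    """Generator loss: maximize E[D(G(x_u)|x_u)], i.e. minimize its negative."""
    return -np.mean(d_fake)

# toy critic scores for a batch of real and generated images
critic_loss = wasserstein_d_loss(np.array([1.0, 2.0]), np.array([0.5, 0.5]))
gen_loss = wasserstein_g_loss(np.array([0.5, 0.5]))
```

Unlike the saturating cross-entropy loss, these means provide nonvanishing gradients as long as the critic separates the two distributions (enforced here by weight clipping, as in [1]).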
Fig. 1 shows the generator architecture of the proposed model, which is inspired by [21]. The architecture is based on a U-net [18], which consists of several encoders and corresponding decoders. Each encoder is a convolutional layer that decreases the size and increases the number of feature maps. Each decoder consists of a transposed convolutional layer that increases the size of the feature maps. Skip connections transfer the features of a particular size from each encoder to the corresponding decoder. Instead of obtaining feature maps of even smaller size by using more encoders (and decoders), the proposed architecture places residual-in-residual dense blocks (RRDBs) [19] at the bottom of the U-net. Each RRDB consists of dense blocks, as well as residual connections at two levels: across each dense block, and across all the dense blocks in one RRDB, as shown in Fig. 1. The output of each dense block is scaled by a constant residual scaling factor before it is added to the identity mapping. Residual connections make the effective depth of the network variable, thereby making identity mappings easier to learn and avoiding vanishing gradients in the shallower layers. Dense connections allow the transfer of feature maps to deeper layers, thus increasing the variety of accessible information. Throughout this network, batch normalization (BN) and leaky rectified linear unit (ReLU) activation are applied after each convolutional layer. At the output, a hyperbolic tangent activation is used.
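The residual-in-residual connectivity can be sketched shape-agnostically as follows. This is a structural sketch, not the actual implementation: `convs` are stand-ins for trained convolutional layers, each assumed to map its concatenated input back to the original channel count.

```python
import numpy as np

def dense_block(x, convs):
    """Dense connectivity: each stage sees the channel-wise concatenation
    of the block input and all earlier stage outputs."""
    feats = [x]
    for conv in convs:
        feats.append(conv(np.concatenate(feats, axis=-1)))
    return feats[-1]

def rrdb(x, blocks, beta=0.2):
    """Residual-in-residual dense block: a scaled residual connection
    across each dense block and another across the whole group."""
    out = x
    for convs in blocks:
        out = out + beta * dense_block(out, convs)   # residual per dense block
    return x + beta * out                            # residual across the RRDB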
The discriminator is a convolutional neural network with 11 layers. Each layer consists of a convolutional layer, followed by BN and leaky ReLU activation. In order to improve the reconstruction of the high-frequency details, a patch-GAN framework is incorporated, which scores each patch of the image separately and then gives the average score as the final output.
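The patch-wise scoring can be illustrated as follows. Here `critic` is a stand-in for the trained convolutional critic; in an actual patch-GAN the per-patch score map is produced in a single convolutional pass rather than by an explicit loop, and the non-overlapping tiling is an illustrative simplification.

```python
import numpy as np

def patchgan_score(image, patch_size, critic):
    """Score each non-overlapping patch separately with `critic`, then
    return the average score -- mimicking the reduction of a patch-GAN
    critic's score map to a single realness value."""
    h, w = image.shape
    scores = [critic(image[i:i + patch_size, j:j + patch_size])
              for i in range(0, h - patch_size + 1, patch_size)
              for j in range(0, w - patch_size + 1, patch_size)]
    return float(np.mean(scores))
```

Because each patch is judged on its own, the critic penalizes locally implausible texture, which pushes the generator toward sharper high-frequency content.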
In order to reduce the pixel-wise difference between the generated image and the corresponding ground-truth (GT) image, a mean absolute error (MAE) based loss is incorporated while training the generator. It is given by:

$$\mathcal{L}_{\mathrm{MAE}} = \big\| x - G(x_u) \big\|_1 \qquad (3)$$

where $\|\cdot\|_1$ denotes the $\ell_1$ norm. Since the human visual system is sensitive to structural distortions in images, it is important to preserve the structural information in MR images, which is crucial for clinical analysis. In order to improve the reconstruction of fine textural details in the images, a mean SSIM (MSSIM) [23] based loss is also incorporated, as follows:
$$\mathcal{L}_{\mathrm{MSSIM}} = 1 - \frac{1}{P} \sum_{j=1}^{P} \mathrm{SSIM}(p_j, q_j) \qquad (4)$$

where $P$ is the number of patches in the image, $p_j$ and $q_j$ are corresponding patches of the GT and generated images, and SSIM is calculated as follows:
$$\mathrm{SSIM}(p, q) = \frac{(2\mu_p \mu_q + c_1)(2\sigma_{pq} + c_2)}{(\mu_p^2 + \mu_q^2 + c_1)(\sigma_p^2 + \sigma_q^2 + c_2)} \qquad (5)$$

where $p$ and $q$ represent two patches, $\mu$ and $\sigma^2$ denote the mean and variance, respectively, and $\sigma_{pq}$ denotes the covariance between the patches. $c_1$ and $c_2$ are small constants to avoid division by zero. The overall loss for training the generator is given by:
$$\mathcal{L}_G = \lambda_1 \mathcal{L}_{\mathrm{adv}} + \lambda_2 \mathcal{L}_{\mathrm{MAE}} + \lambda_3 \mathcal{L}_{\mathrm{MSSIM}} \qquad (6)$$

where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the weighting factors for the respective loss terms.
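The loss terms in Eqs. (3)–(6) can be sketched as follows. The non-overlapping patch tiling and the values of $c_1$ and $c_2$ are illustrative choices, not the exact settings of our implementation.

```python
import numpy as np

def ssim(p, q, c1=1e-4, c2=9e-4):
    """SSIM between two patches (Eq. 5), from their means, variances,
    and covariance; c1, c2 guard against division by zero."""
    mu_p, mu_q = p.mean(), q.mean()
    var_p, var_q = p.var(), q.var()
    cov_pq = ((p - mu_p) * (q - mu_q)).mean()
    return ((2 * mu_p * mu_q + c1) * (2 * cov_pq + c2)) / (
        (mu_p ** 2 + mu_q ** 2 + c1) * (var_p + var_q + c2))

def mssim(x, y, patch=8):
    """Mean SSIM over non-overlapping patches (cf. Eq. 4)."""
    h, w = x.shape
    vals = [ssim(x[i:i + patch, j:j + patch], y[i:i + patch, j:j + patch])
            for i in range(0, h - patch + 1, patch)
            for j in range(0, w - patch + 1, patch)]
    return float(np.mean(vals))

def generator_loss(x, x_hat, d_fake, lam1, lam2, lam3):
    """Overall generator loss (Eq. 6): weighted sum of the adversarial,
    MAE (Eq. 3), and MSSIM (Eq. 4) terms."""
    l_adv = -np.mean(d_fake)             # Wasserstein generator term
    l_mae = np.mean(np.abs(x - x_hat))   # pixel-wise mean absolute error
    l_mssim = 1.0 - mssim(x, x_hat)      # structural dissimilarity
    return lam1 * l_adv + lam2 * l_mae + lam3 * l_mssim
```

As a sanity check, a perfect reconstruction (`x_hat == x`) drives both the MAE and MSSIM terms to zero, leaving only the adversarial term.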
3 Results and Discussion
3.1 Training settings
In this work, a 1D Gaussian mask is used for undersampling the k-space. Since the ZFR is complex-valued, its real and imaginary components are concatenated and passed to the generator in the form of a two-channel real-valued input. The batch size is set to 32. The discriminator is updated three times before every generator update. The threshold for weight clipping is 0.05. The growth rate of the dense blocks is set to 32, the residual scaling factor is 0.2, and 12 RRDBs are used. The number of filters in the last layer of each RRDB is 512. The Adam optimizer [9] is used for training, with separate learning rates for the generator and the discriminator. The model is implemented using the Keras framework with a TensorFlow backend. For training, 4 NVIDIA GeForce GTX 1080 Ti GPUs are used, each having 11 GB RAM.
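The 1D Gaussian undersampling mask and the two-channel real-valued generator input can be sketched as follows. The Gaussian width parameter and the fftshifted-k-space convention (centre row holding the lowest frequency) are assumptions of this sketch.

```python
import numpy as np

def gaussian_1d_mask(n_rows, keep_fraction, sigma_frac=0.15, seed=0):
    """Draw k-space rows without replacement, with probability given by a
    Gaussian centred on the low frequencies (assumes fftshifted k-space)."""
    rng = np.random.default_rng(seed)
    centre = n_rows / 2.0
    w = np.exp(-0.5 * ((np.arange(n_rows) - centre) / (sigma_frac * n_rows)) ** 2)
    rows = rng.choice(n_rows, size=int(keep_fraction * n_rows),
                      replace=False, p=w / w.sum())
    mask = np.zeros(n_rows, dtype=bool)
    mask[rows] = True
    return mask

def to_two_channel(x_u):
    """Concatenate real and imaginary parts of the complex ZFR into a
    two-channel real-valued generator input."""
    return np.stack([x_u.real, x_u.imag], axis=-1)
```

Weighting the sampling toward the centre keeps most of the low-frequency rows, where the bulk of the MR image energy lies, while still sampling some high-frequency rows.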
3.2 Data details
For the purpose of training and testing, T1-weighted MR images of the brain from the MICCAI 2013 grand challenge dataset [10] are used. In order to make the reconstructed output robust to noise, data augmentation is carried out using images with 10% and 20% additive Gaussian noise. To build the training set, 19,797 images are randomly taken from the training set of the aforementioned dataset. Of these, noise is added to 6,335 images, while the remaining 13,462 images are used without any noise. In addition, 990 images are chosen from the 13,462 noise-free images and noisy copies of them are also included, giving a total of 20,787 training images. Among the noisy images, the numbers of images with 10% and 20% noise are equal. Thus, the set contains 64.76% noise-free images, 30.48% noisy images whose noise-free counterparts are not present in the training set, and 4.76% noisy images whose noise-free counterparts are present in the training set. For testing purposes, 2,000 images are chosen randomly from the test set of the dataset. The tests are conducted in three stages: using noise-free images, using images with 10% added noise, and using images with 20% added noise.
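The noise augmentation can be sketched as follows. The parameterization of "10% noise" is an assumption here: the noise standard deviation is taken as the given fraction of the image intensity range, which may differ from the exact definition used in our experiments.

```python
import numpy as np

def add_gaussian_noise(image, level, seed=0):
    """Return `image` with additive zero-mean Gaussian noise, where
    `level` (0.1 for 10%, 0.2 for 20%) scales the noise standard
    deviation relative to the image intensity range (assumed)."""
    rng = np.random.default_rng(seed)
    sigma = level * float(image.max() - image.min())
    return image + rng.normal(0.0, sigma, size=image.shape)
```

Each augmented sample is paired with its noise-free GT, so the model learns to suppress the input noise while reconstructing.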
3.3 Results
Table 1 summarizes the quantitative results that study the effect of adding various components to the model. These results are reported for images in which 20% of the raw k-space samples are retained. For all four cases, the generator is trained with the same loss configuration. In the first case, the GAN model comprises a U-net generator and a patch-GAN discriminator, with BN present throughout the network. In the subsequent cases, the use of RRDBs and the addition of BN to the RRDBs result in significant improvements in peak signal-to-noise ratio (PSNR). The use of data augmentation with noisy images, in the fourth case, results in significantly better quantitative results for the reconstruction of noisy images, compared to the previous three cases.
Network settings      Case 1   Case 2   Case 3   Case 4
U-net + patch-GAN       ✓        ✓        ✓        ✓
RRDBs                   ✗        ✓        ✓        ✓
BN in RRDBs             ✗        ✗        ✓        ✓
Data augmentation       ✗        ✗        ✗        ✓

Images                PSNR (dB) / MSSIM
Noise-free            40.45 / 0.9865   41.39 / 0.9810   41.88 / 0.9829   42.31 / 0.9841
10% noise added       38.25 / 0.9641   38.03 / 0.9624   38.03 / 0.9620   39.80 / 0.9751
20% noise added       33.98 / 0.9217   34.01 / 0.9210   33.78 / 0.9180   37.56 / 0.9619
The qualitative results of the proposed method are shown in Fig. 2(b)–(e) for 20% undersampling and Fig. 2(f)–(i) for 30% undersampling. It can be seen that the proposed method successfully reconstructs the structural content of the image, including many fine details. This is also indicated by the quantitative results shown in Table 1. Moreover, the contrast of the reconstructed image closely matches that of the GT. The reconstruction results for noisy inputs, as well as their differences from the corresponding GT, indicate the robustness of the model.
Method            Noise-free images    10% noise added   20% noise added
                  PSNR (dB) / MSSIM    PSNR (dB)         PSNR (dB)
ZFR*              35.61 / 0.7735       24.42             18.56
DLMRI [17]*       38.24 / 0.8020       24.52             18.85
Noiselet [15]*    39.01 / 0.8588       25.29             19.42
BM3D [5]*         39.93 / 0.9125       24.71             18.65
Deep-ADMM [22]*   37.36 / 0.8534       25.19             19.33
DAGAN [21]*       40.20 / 0.9681       37.40             34.23
Proposed          46.88 / 0.9943       42.34             39.49

*PSNR and MSSIM values are taken from [21].
Table 2 shows the comparison of the proposed method with ZFR and state-of-the-art methods such as DLMRI [17], Noiselet [15], BM3D [5], Deep-ADMM [22], and DAGAN [21]. These results are reported for images in which 30% of the k-space data is retained. Both the PSNR and MSSIM of the proposed method are significantly better than those of the previous methods. The comparison of the PSNRs of the reconstruction results for noisy images shows that the proposed method is highly robust to noise. Moreover, the reconstruction time of the proposed method is 9.06 ms per image on a GPU, which can facilitate real-time reconstruction of MR images.
We also tested the model, trained on brain MR images, on MR images of canine legs from the MICCAI 2013 challenge. Fig. 3 shows the results of this zero-shot inference for images in which 20% of the k-space data is retained. Though no images of canine legs were used for training, the model faithfully reconstructs most of the structural content, achieving average PSNR and MSSIM values of 41.28 dB and 0.9788, respectively, over 2000 test images.
4 Conclusion
In this paper, a novel GAN-based framework has been utilized for CS-MRI reconstruction. The use of RRDBs in a U-net based generator architecture increases the amount of information available to deeper layers. In order to preserve the high-frequency content as well as the structural details in the reconstructed output, a patch-GAN discriminator and an SSIM-based loss have been incorporated. The use of noisy images during training makes the reconstruction results highly robust to noise. The proposed method outperforms the state-of-the-art methods while maintaining the feasibility of real-time reconstruction. In the future, we plan to analyze the performance of the proposed model for different k-space sampling patterns. In order to improve the reconstruction time, we plan to work on lightweight architectures. Further work may be carried out on devising regularization terms that help preserve the finest details in the reconstructed output.
References

[1] (2017) Wasserstein generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, Vol. 70, pp. 214–223.
[2] (2017) Compressed sensing using generative models. In Proceedings of the 34th International Conference on Machine Learning, Vol. 70, pp. 537–546.
[3] (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3 (1), pp. 1–122.
[4] (2006) Compressed sensing. IEEE Transactions on Information Theory 52 (4), pp. 1289–1306.
[5] (2016) Decoupled algorithm for MRI reconstruction using nonlocal block matching model: BM3D-MRI. Journal of Mathematical Imaging and Vision 56 (3), pp. 430–440.
[6] (2014) Generative adversarial nets. In Advances in Neural Information Processing Systems 27, pp. 2672–2680.
[7] (2010) Compressive sensing principles and iterative sparse recovery for inverse and ill-posed problems. Inverse Problems 26 (12), 125012.
[8] (2017) Image-to-image translation with conditional adversarial networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5967–5976.
[9] (2015) Adam: a method for stochastic optimization. In International Conference on Learning Representations (ICLR).
[10] (2013) Diencephalon standard challenge. MICCAI 2013 grand challenge.
[11] (2015) Balanced sparse model for tight frames in compressed sensing magnetic resonance imaging. PLOS ONE 10 (4), pp. 1–19.
[12] (2007) Sparse MRI: the application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine 58 (6), pp. 1182–1195.
[13] (2019) Deep generative adversarial neural networks for compressive sensing MRI. IEEE Transactions on Medical Imaging 38 (1), pp. 167–179.
[14] (2014) Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784.
[15] (2015) Multichannel compressive sensing MRI using noiselet encoding. PLOS ONE 10 (5), pp. 1–27.
[16] (2010) Iterative thresholding compressed sensing MRI based on contourlet transform. Inverse Problems in Science and Engineering 18 (6), pp. 737–758.
[17] (2011) MR image reconstruction from highly undersampled k-space data by dictionary learning. IEEE Transactions on Medical Imaging 30 (5), pp. 1028–1041.
[18] (2015) U-net: convolutional networks for biomedical image segmentation. arXiv preprint arXiv:1505.04597.
[19] (2018) ESRGAN: enhanced super-resolution generative adversarial networks. In The European Conference on Computer Vision (ECCV) Workshops, pp. 63–79.
[20] (2019) Fast compressive sensing recovery using generative models with structured latent variables. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2967–2971.
[21] (2018) DAGAN: deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction. IEEE Transactions on Medical Imaging 37 (6), pp. 1310–1321.
[22] (2016) Deep ADMM-Net for compressive sensing MRI. In Advances in Neural Information Processing Systems 29, pp. 10–18.
[23] (2004) Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13 (4), pp. 600–612.
[24] (2017) Loss functions for image restoration with neural networks. IEEE Transactions on Computational Imaging 3 (1), pp. 47–57.