I. Introduction
X-ray computed tomography (CT) is one of the most important imaging modalities in modern hospitals and clinics. However, it poses a potential radiation risk to the patient, since X-rays can cause genetic damage and induce cancer with a probability related to the radiation dose
[1, 2]. Lowering the radiation dose increases the noise and artifacts in the reconstructed images, which can compromise diagnostic information. Hence, extensive efforts have been made to design better image reconstruction or image processing methods for low-dose CT (LDCT). These methods generally fall into three categories: (a) sinogram filtration before reconstruction [3, 4, 5], (b) iterative reconstruction [6, 7], and (c) image post-processing after reconstruction [8, 9, 10].

Over the past decade, researchers have been dedicated to developing new iterative reconstruction (IR) algorithms for LDCT. Generally, these algorithms optimize an objective function that incorporates an accurate system model [11, 12], a statistical noise model [13, 14, 15], and prior information in the image domain. Popular image priors include total variation (TV) and its variants [16, 17, 18], as well as dictionary learning [19, 20]. These IR algorithms greatly improve image quality, but they may still lose some details and suffer from residual artifacts. Moreover, they incur a high computational cost, which is a bottleneck in practical applications.
On the other hand, sinogram pre-filtration and image post-processing are computationally efficient compared with iterative reconstruction. Noise characteristics have been well modeled in the sinogram domain for sinogram-domain filtration. However, the sinogram data of commercial scanners are not readily available to users, and these methods may suffer from resolution loss and edge blurring. Sinogram data need to be carefully processed; otherwise, artifacts may be induced in the reconstructed images.
Different from sinogram denoising, image post-processing directly operates on an image. Many efforts have been made in the image domain to reduce LDCT noise and suppress artifacts. For example, the non-local means (NLM) method was adapted for CT image denoising [8]. Inspired by compressed sensing methods, an adapted K-SVD method was proposed [9] to reduce artifacts in CT images. The block-matching 3D (BM3D) algorithm was used for image restoration in several CT imaging tasks [10, 21]. With such image post-processing, the image quality improvement was clear, but over-smoothing and/or residual errors were often observed in the processed images. These issues are difficult to address given the non-uniform distribution of CT image noise.
The recent explosive development of deep neural networks suggests new thinking and huge potential for the medical imaging field [22, 23]. As an example, the LDCT denoising problem can be addressed with deep learning techniques. Specifically, a convolutional neural network (CNN) for image super-resolution [24] was recently adapted for low-dose CT image denoising [25], with a significant performance gain. Then, more complex networks were proposed to handle the LDCT denoising problem, such as the RED-CNN [26] and the wavelet network [27]. The wavelet network directly adopted the shortcut connections introduced by U-net [28], while the RED-CNN [26] replaced the pooling/unpooling layers of U-net with convolution/deconvolution pairs. Despite the impressive denoising results of these innovative network structures, they fall into the category of end-to-end networks that typically use the mean squared error (MSE) between the network output and the ground truth as the loss function. As revealed by recent work
[29, 30], this per-pixel MSE is often associated with over-smoothed edges and loss of detail. As an algorithm minimizes the per-pixel MSE, it overlooks subtle image textures/signatures critical for human perception. It is reasonable to assume that CT images are distributed over some manifold. From that point of view, an MSE-based approach tends to take the mean of high-resolution patches in the Euclidean distance rather than the geodesic distance. Therefore, in addition to the blurring effect, artifacts such as non-uniform biases are also possible.

To tackle these problems, we propose to use a Wasserstein generative adversarial network (WGAN) [31], with the Wasserstein distance as the discrepancy measure between distributions, together with a perceptual loss that computes the difference between images in an established feature space [29, 30].
The use of WGAN encourages the denoised CT images to share the same distribution as normal-dose CT (NDCT) images. In the GAN framework, a generative network G and a discriminator network D are tightly coupled and trained simultaneously: while G is trained to produce realistic images from random vectors, D is trained to discriminate between real and generated images [32, 33]. GANs have been used in many applications, such as single-image super-resolution [29], art creation [34, 35], and image transformation [36]. In the field of medical imaging, Nie et al. [37] proposed to use a GAN to estimate a CT image from its corresponding MR image. Wolterink et al. [38] were the first to apply a GAN to cardiac CT image denoising, and Yu et al. [39] used a GAN to handle the de-aliasing problem in fast compressed-sensing MRI (CS-MRI). Promising results were achieved in these works. We will discuss and compare the results of those two networks in Section III, since the proposed network is closely related to their work. Despite its success in these areas, the GAN still suffers from a remarkable difficulty in training [33, 40]. In the original GAN [32], G and D are trained by solving the following minimax problem:
\min_G \max_D L_{GAN}(D, G) = \mathbb{E}_{y \sim P_r}[\log D(y)] + \mathbb{E}_{z \sim P_z}[\log(1 - D(G(z)))]   (1)
where \mathbb{E}[\cdot] denotes the expectation operator, and P_r and P_z are the real data distribution and the noisy data distribution, respectively. The generator G transforms a noisy sample into one that mimics a real sample, which defines a data distribution denoted by P_g. When D is trained to be an optimal discriminator for a fixed G, the minimization with respect to G is equivalent to minimizing the Jensen–Shannon (JS) divergence between P_r and P_g, which leads to vanishing gradients on the generator [40], so that G stops updating as the training continues.
Consequently, Arjovsky et al. [31] proposed to use the Earth-Mover (EM) distance, or Wasserstein metric, between the generated samples and real data, which is referred to as WGAN, because the EM distance is continuous and differentiable almost everywhere under mild assumptions, while neither the KL nor the JS divergence is. Later, an improved WGAN with a gradient penalty was proposed [41] to accelerate convergence.
The rationale behind the perceptual loss is twofold. First, when a person compares two images, the perception is not performed pixel-by-pixel; human vision extracts and compares features from images [42]. Therefore, instead of the pixel-wise MSE, we employ another pre-trained deep CNN (the famous VGG [43]) for feature extraction and compare the denoised output against the ground truth in terms of the extracted features. Second, from a mathematical point of view, CT images are not uniformly distributed in a high-dimensional Euclidean space; they more likely reside on a low-dimensional manifold. With the MSE, we are not measuring the intrinsic similarity between the images, but only their superficial differences in the brute-force Euclidean distance. To compare images according to their intrinsic structures, we should project them onto the manifold and calculate the geodesic distance instead. Therefore, using the perceptual loss with WGAN should facilitate producing results with not only lower noise but also sharper details.
In particular, we treat the LDCT denoising problem as a transformation from LDCT to NDCT images. WGAN provides a good estimate of the distance between the denoised LDCT and NDCT image distributions, while the VGG-based perceptual loss tends to preserve the image content after denoising. The rest of this paper is organized as follows. The proposed method is described in Section II. The experiments and results are presented in Section III. Finally, relevant issues are discussed and conclusions are drawn in Section IV.
II. Methods

II-A. Noise Reduction Model
Let z denote an LDCT image and y the corresponding NDCT image. The goal of the denoising process is to seek a function G that maps an LDCT image z to the corresponding NDCT image y:

G: z \to y   (2)

On the other hand, we can also take z as a sample from the LDCT image distribution P_z and y as a sample from the NDCT (real) distribution P_r. The denoising function G maps samples from P_z into a certain distribution P_g. By varying G, we aim to change P_g to make it close to P_r. In this way, we treat the denoising operator as moving one data distribution to another.
Typically, noise in X-ray photon measurements can be simply modeled as a combination of Poisson quantum noise and Gaussian electronic noise. In the reconstructed images, by contrast, the noise model is usually complicated and non-uniformly distributed across the whole image. Thus, there is no clear clue as to how the data distributions of NDCT and LDCT images are related to each other, which makes it difficult to denoise LDCT images with traditional methods. However, this uncertainty in the noise model can be ignored in deep-learning-based denoising, because a deep neural network can efficiently learn high-level features and a representation of the data distribution from modest-sized image patches.
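As a minimal numerical sketch of the projection-domain noise model just described (the incident photon count n0 and the electronic-noise level sigma_e below are illustrative assumptions, not values from this paper), the combination of Poisson quantum noise and Gaussian electronic noise can be simulated as:

```python
import numpy as np

def simulate_low_dose_sinogram(line_integrals, n0=1e4, sigma_e=10.0, seed=0):
    """Toy projection-domain noise model: Poisson photon statistics
    around the Beer-Lambert expectation n0*exp(-p), plus zero-mean
    Gaussian electronic noise. n0 and sigma_e are illustrative values."""
    rng = np.random.default_rng(seed)
    expected = n0 * np.exp(-line_integrals)            # Beer-Lambert law
    counts = rng.poisson(expected).astype(float)       # quantum noise
    counts += rng.normal(0.0, sigma_e, counts.shape)   # electronic noise
    counts = np.clip(counts, 1.0, None)                # guard the log
    return -np.log(counts / n0)                        # noisy line integrals

p_clean = np.full((64, 64), 0.5)          # uniform-attenuation projections
p_noisy = simulate_low_dose_sinogram(p_clean)
```

Even for this uniform phantom, the noise becomes non-stationary after filtered backprojection, which is exactly the complication noted above.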
II-B. WGAN
Compared with the original GAN, WGAN uses the Wasserstein distance instead of the JS divergence to compare data distributions. It solves the following minimax problem to obtain both D and G [41]:

\min_G \max_D L_{WGAN}(D, G) = -\mathbb{E}_y[D(y)] + \mathbb{E}_z[D(G(z))] + \lambda \mathbb{E}_{\hat{x}}\left[\left(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1\right)^2\right]   (3)

where the first two terms estimate the Wasserstein distance; the last term is the gradient penalty for network regularization; \hat{x} is uniformly sampled along straight lines connecting pairs of generated and real samples; and \lambda is a constant weighting parameter. Compared with the original GAN, WGAN removes the log function from the losses and drops the last sigmoid layer in the implementation of the discriminator D. The networks D and G are trained alternately by fixing one and updating the other.
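To make the terms of Eq. (3) concrete, the following toy sketch evaluates the critic objective for a *linear* critic D(x) = x·w, for which the gradient penalty has a closed form and no automatic differentiation is needed; the data, the weights, and λ = 10 (the penalty weight suggested in [41]) are all illustrative:

```python
import numpy as np

def wgan_gp_critic_loss(real, fake, w, lam=10.0, seed=0):
    """WGAN-GP critic objective for a toy linear critic D(x) = x @ w.
    For a linear critic, grad D(x_hat) = w everywhere, so the penalty
    (||grad D||_2 - 1)^2 reduces to (||w||_2 - 1)^2 in closed form."""
    rng = np.random.default_rng(seed)
    # First two terms of Eq. (3): -E[D(y)] + E[D(G(z))]
    w_dist = np.mean(fake @ w) - np.mean(real @ w)
    # x_hat: uniform samples on straight lines between real/fake pairs
    # (kept only to show the sampling; unused for a linear critic)
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    grad_norm = np.linalg.norm(w)      # ||grad D(x_hat)|| for linear D
    penalty = lam * (grad_norm - 1.0) ** 2
    return w_dist + penalty

real = np.random.default_rng(1).normal(0.0, 1.0, (256, 4))
fake = real + 2.0                  # "generated" samples, shifted by 2
w = np.full(4, 0.5)                # critic weights with ||w||_2 = 1
loss = wgan_gp_critic_loss(real, fake, w)   # ~4.0: pure W-distance term
```

In the actual networks, D is a CNN and the gradient at the interpolated samples \hat{x} is obtained by backpropagation.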
II-C. Perceptual Loss
While the WGAN framework encourages the generator G to transform the data distribution from a high-noise to a low-noise version, another term is added to the loss function so that the network preserves image details and information content. Typically, a mean squared error (MSE) loss is used, which minimizes the pixel-wise error between a denoised patch G(z) and the corresponding NDCT patch y [25, 26]:

L_{MSE}(G) = \mathbb{E}_{(z, y)}\left[\frac{1}{N^2}\,\|G(z) - y\|_F^2\right]   (4)

where \|\cdot\|_F denotes the Frobenius norm and N \times N is the patch size. However, the MSE loss can generate blurry images and cause distortion or loss of details. Thus, instead of the MSE measure, we apply a perceptual loss defined in a feature space:
L_{VGG}(G) = \mathbb{E}_{(z, y)}\left[\frac{1}{whd}\,\|\phi(G(z)) - \phi(y)\|_F^2\right]   (5)

where \phi is a feature extractor, and w, h, and d stand for the width, height, and depth of the feature space, respectively. In our implementation, we adopt the well-known pre-trained VGG-19 network [43] as the feature extractor. Since the pre-trained VGG network takes color images as input while CT images are grayscale, we duplicated the CT image to form the RGB channels before feeding it into the VGG network. The VGG-19 network contains 16 convolutional layers followed by 3 fully-connected layers. The output of the 16th convolutional layer is the feature extracted by the VGG network and used in the perceptual loss function:

L_{VGG}(G) = \mathbb{E}_{(z, y)}\left[\frac{1}{whd}\,\|\mathrm{VGG}(G(z)) - \mathrm{VGG}(y)\|_F^2\right]   (6)
For convenience, we refer to the perceptual loss computed with the VGG network as the VGG loss.
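As a self-contained sketch of the perceptual loss of Eq. (5), with a toy finite-difference feature extractor standing in for the pre-trained VGG-19 (which is too large to reproduce here), the loss and the grayscale-to-RGB duplication step can be written as:

```python
import numpy as np

def to_rgb(gray):
    """Duplicate a grayscale CT image into 3 identical channels so it
    matches the RGB input expected by a VGG-style network."""
    return np.repeat(gray[..., np.newaxis], 3, axis=-1)

def toy_features(img):
    """Stand-in for the feature extractor phi: horizontal and vertical
    finite differences as two crude 'feature maps' (so d = 2)."""
    return [img[:, 1:] - img[:, :-1], img[1:, :] - img[:-1, :]]

def perceptual_loss(denoised, target):
    """Squared Frobenius distance between feature maps, normalized by
    the feature-space size w*h*d, as in Eq. (5)."""
    fd, ft = toy_features(denoised), toy_features(target)
    sq = sum(np.sum((a - b) ** 2) for a, b in zip(fd, ft))
    size = sum(a.size for a in fd)
    return sq / size

x = np.random.default_rng(0).normal(100.0, 30.0, (32, 32))
y = x + np.random.default_rng(1).normal(0.0, 5.0, (32, 32))
loss_same = perceptual_loss(x, x)    # identical inputs -> 0
loss_diff = perceptual_loss(y, x)    # noisy vs clean -> positive
```

With the real VGG-19, toy_features would be replaced by the output of the 16th convolutional layer, and d would be its channel count.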
II-D. Network Structures
The overall structure of the proposed network is shown in Fig. 1. For convenience, we name this network WGAN-VGG. It consists of three parts. The first part is the generator G, a convolutional neural network (CNN) of 8 convolutional layers. Following common practice in the deep learning community [44], small kernels were used in each convolutional layer. Owing to the stacking structure, such a network can efficiently cover a large enough receptive field. Each of the first 7 hidden layers of G has 32 filters. The last layer generates a single feature map with one filter, which is also the output of G. We use the Rectified Linear Unit (ReLU) as the activation function.
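The receptive-field argument above can be checked with a short calculation; the 3x3 kernel size assumed below is a common "small kernel" choice and is not restated in this section:

```python
def stacked_conv_receptive_field(n_layers, kernel=3, stride=1):
    """Receptive field of a stack of conv layers: each layer widens the
    field by (kernel - 1) times the cumulative stride ("jump") so far."""
    rf, jump = 1, 1
    for _ in range(n_layers):
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Under the 3x3 assumption, the 8-layer generator G sees a 17x17
# neighborhood of the input for each output pixel.
rf_g = stacked_conv_receptive_field(8)
```

This is why a stack of small kernels can match the receptive field of one large kernel while using fewer parameters and more nonlinearities.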
The second part of the network is the perceptual loss calculator, realized by the pre-trained VGG network [43]. A denoised output image from the generator G and the ground-truth image are fed into the pre-trained VGG network for feature extraction. The objective loss is then computed from the features extracted at a specified layer according to Eq. (6). The reconstruction error is back-propagated to update the weights of G only, keeping the VGG parameters intact.
The third part of the network is the discriminator D. As shown in Fig. 2, D has 6 convolutional layers, with a structure inspired by previous work [43, 29, 30]. The first two convolutional layers have 64 filters, followed by two layers of 128 filters and two layers of 256 filters. Following the same logic as in G, all the convolutional layers in D use a small kernel size. After the six convolutional layers, there are two fully-connected layers, the first with 1024 outputs and the second with a single output. Following the practice in [31], there is no sigmoid cross-entropy layer at the end of D.
The network is trained on image patches and applied to entire images. The details are provided in Section III.
II-E. Other Networks

For comparison, we also trained five other networks:

- CNN-MSE, with only the MSE loss;
- CNN-VGG, with only the VGG loss;
- WGAN-MSE, with the MSE loss in the WGAN framework;
- WGAN, with no additional loss;
- GAN, the original GAN.

All the trained networks are summarized in Table I.
Table I. Trained networks and their loss functions.

Network     Loss
CNN-MSE     L_MSE
WGAN-MSE    L_WGAN + lambda_1 * L_MSE
CNN-VGG     L_VGG
WGAN-VGG    L_WGAN + lambda_1 * L_VGG
WGAN        L_WGAN
GAN         L_GAN
III. Experiments

III-A. Experimental Datasets
We used a real clinical dataset authorized by the Mayo Clinic for "the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge" for training and evaluating the proposed networks [45]. The dataset contains normal-dose abdominal CT images of 10 anonymized patients and corresponding simulated quarter-dose CT images. In our experiments, we randomly extracted 100,096 pairs of image patches from 4,000 CT images as our training inputs and labels. We also extracted 5,056 pairs of patches from another 2,000 images for validation. When choosing the image patches, we excluded patches that were mostly air. For comparison, we implemented a state-of-the-art 3D dictionary-learning reconstruction technique as a representative IR algorithm [19, 20]. The dictionary-learning reconstruction was performed from the LDCT projection data provided by the Mayo Clinic.
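The air-exclusion step in the patch selection can be sketched as follows; the HU threshold, air-fraction cut-off, patch size, and stride are illustrative choices rather than the paper's actual settings:

```python
import numpy as np

def extract_patches(image, patch, stride, air_hu=-900.0, max_air_frac=0.7):
    """Slide a window over a CT slice (pixel values in HU) and keep the
    patches that are not mostly air. Pixels below air_hu count as air;
    a patch is rejected when its air fraction exceeds max_air_frac.
    All thresholds here are illustrative."""
    kept = []
    h, w = image.shape
    for i in range(0, h - patch + 1, stride):
        for j in range(0, w - patch + 1, stride):
            p = image[i:i + patch, j:j + patch]
            if np.mean(p < air_hu) <= max_air_frac:
                kept.append(p)
    return kept

slice_hu = np.full((128, 128), -1000.0)   # start with an all-air slice
slice_hu[32:96, 32:96] = 40.0             # add a soft-tissue square
patches = extract_patches(slice_hu, patch=64, stride=32)
```

In this toy slice only 5 of the 9 candidate positions survive, because the corner patches are mostly air.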
III-B. Network Training
In our experiments, all the networks were optimized using the Adam algorithm [46]. The optimization procedure for the WGAN-VGG network is shown in Fig. 3. The mini-batch size was 128. The hyper-parameters for Adam and the weighting parameters were chosen as suggested in [41] and according to our experimental experience. The optimization processes for WGAN-MSE and WGAN are similar, except that line 12 was changed to the corresponding loss function; for CNN-MSE and CNN-VGG, lines 2-10 were removed and line 12 was changed according to their loss functions.
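The alternating scheme of Fig. 3 can be summarized by the following schematic loop; the number of discriminator updates per generator update, n_d, and the d_step/g_step callbacks are placeholders for the actual Adam updates (n_d = 5 is the default recommended in [41], not a value restated here):

```python
def train_wgan(n_iters, n_d, d_step, g_step):
    """Alternating optimization: fix G and update the critic D for n_d
    steps, then fix D and update G once (line 12 of the procedure in
    Fig. 3 would differ per network variant)."""
    for _ in range(n_iters):
        for _ in range(n_d):
            d_step()   # ascend the Wasserstein estimate w.r.t. D
        g_step()       # descend the full generator loss w.r.t. G

# Count updates to verify the schedule.
counts = {"d": 0, "g": 0}
def d_step(): counts["d"] += 1
def g_step(): counts["g"] += 1
train_wgan(100, 5, d_step, g_step)
```

For the plain CNN variants, the inner discriminator loop disappears and only the generator update remains.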
The networks were implemented in Python with the TensorFlow library [47]. An NVIDIA Titan Xp GPU was used in this study.

III-C. Network Convergence
To visualize the convergence of the networks, we calculated the MSE and VGG losses over the 5,056 validation image patches according to Eqs. (4) and (6) after each epoch. Fig. 7 shows the averaged MSE and VGG losses versus the number of epochs for the five networks. Even though the two loss functions were never used at the same time for a given network, we still want to see how their values change during training. Both the MSE and VGG losses decreased initially, which indicates that the two metrics are positively correlated. However, the MSE loss values of the networks increase in the order CNN-MSE < WGAN-MSE < WGAN-VGG < CNN-VGG (Fig. 7(a)), while the VGG losses are in the opposite order (Fig. 7(b)). The MSE and VGG losses of the GAN network oscillate during convergence. WGAN-VGG and CNN-VGG have very close VGG loss values, while their MSE losses are quite different. On the other hand, WGAN perturbed the convergence as measured by the MSE loss but converged smoothly in terms of the VGG loss. These observations suggest that the two metrics focus on different aspects when used by the networks. The difference between the MSE and VGG losses will be further revealed in the output images of the generators.
To show the convergence of the WGAN part, we plotted the estimated Wasserstein distance, defined by the first two terms in Eq. (3). It can be observed in Fig. 4(c) that increasing the number of epochs did reduce the W-distance, although the decay rate becomes smaller. For the WGAN-VGG curve, the introduction of the VGG loss helped to improve perceived image quality at the cost of a compromised W-distance. For the WGAN and WGAN-MSE curves, we note that what we computed is a surrogate for the W-distance that has not been normalized by the total number of pixels; had we done such a normalization, the curves would have decayed close to zero after 100 epochs.
III-D. Denoising Results
To show the denoising performance of the selected networks, we took two representative slices, shown in Figs. 17 and 37, with Figs. 27 and 47 showing the zoomed regions of interest (ROIs) marked by the red rectangles. All the networks demonstrated certain denoising capabilities. However, CNN-MSE blurred the images and introduced waxy artifacts, as expected, which are easily observed in the zoomed ROIs (panels (e)). WGAN-MSE improved on CNN-MSE by avoiding over-smoothing, but minor streak artifacts can still be observed, especially in comparison with CNN-VGG and WGAN-VGG. Meanwhile, using WGAN or GAN alone left stronger noise (panels (g)) than the other networks and enhanced a few white structures in the generated images, which originate from the low-dose streak artifacts in the LDCT images; in contrast, the CNN-VGG and WGAN-VGG images are visually more similar to the NDCT images. This is because the VGG loss used in CNN-VGG and WGAN-VGG is computed in a feature space previously trained on a very large natural-image dataset [48]. By using the VGG loss, we transferred the knowledge of human perception embedded in the VGG network to CT image quality evaluation. The performance of WGAN or GAN alone is not acceptable because it only maps the data distribution from LDCT to NDCT without guaranteeing image content correspondence. As for lesion detection in these two slices, all the networks enhanced the lesion visibility compared with the original noisy low-dose FBP images, as noise is reduced by the different approaches.
As for the iterative reconstruction technique, the results depend greatly on the choice of the regularization parameters. The implemented dictionary-learning reconstruction (DictRecon) gave the most aggressive noise reduction among the compared methods as a result of strong regularization, but it over-smoothed some fine structures. For example, in Fig. 47, the vessel indicated by the green arrow was smeared out, while it is easily identifiable in the NDCT as well as WGAN-VGG images. Yet, as an iterative reconstruction method, DictRecon has an advantage over post-processing methods. As indicated by the red arrow in Fig. 47, there is a bright spot that can be seen in the DictRecon and NDCT images but is not observable in the LDCT and network-processed images. Since the WGAN-VGG image is generated from the LDCT image, in which this bright spot is hardly observed, it is reasonable that we do not see it in the images processed by the neural networks. In other words, we do not want the network to generate structures that do not exist in the original images. In short, the proposed WGAN-VGG network is a post-processing method, and information lost during the FBP reconstruction cannot easily be recovered, which is a limitation of all post-processing methods. As an iterative reconstruction method, DictRecon generates images from the raw data, which contain more information than is available to post-processing methods.
Table II. PSNR and SSIM of the two representative slices.

            Fig. 17            Fig. 37
            PSNR      SSIM     PSNR      SSIM
LDCT        19.7904   0.7496   18.4519   0.6471
CNN-MSE     24.4894   0.7966   23.2649   0.7022
WGAN-MSE    24.0637   0.8090   22.7255   0.7122
CNN-VGG     23.2322   0.7926   22.0950   0.6972
WGAN-VGG    23.3942   0.7923   22.1620   0.6949
WGAN        22.0168   0.7745   20.9051   0.6759
GAN         21.8676   0.7581   21.0042   0.6632
DictRecon   24.2516   0.8148   24.0992   0.7631
Table III. Mean CT numbers (HU) and standard deviations (SDs) of the two flat regions marked by blue rectangles in Figs. 17 and 37.

            Fig. 17         Fig. 37
            Mean    SD      Mean    SD
NDCT         9      36      118     38
LDCT        11      74      118     66
CNN-MSE     12      18      120     15
WGAN-MSE     9      28      115     25
CNN-VGG      4      30      104     28
WGAN-VGG     9      31      111     29
WGAN        23      37      135     33
GAN          8      35      110     32
DictRecon    4      11      111     13
III-E. Quantitative Analysis
For quantitative analysis, we calculated the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). The summary data are in Table II. CNN-MSE ranks first in PSNR, while WGAN is the worst. Since PSNR is determined by the per-pixel error, it is not surprising that CNN-MSE, which was trained to minimize the MSE loss, outperformed the networks trained to minimize feature-based losses. These quantitative results are in decent agreement with Fig. 7, in which CNN-MSE has the smallest MSE loss and WGAN the largest. WGAN ranks worst in PSNR and SSIM because it includes neither MSE nor VGG regularization. DictRecon achieves the best SSIM and a high PSNR; however, it suffers from image blurring and produces blocky and waxy artifacts. This indicates that PSNR and SSIM may not be sufficient for evaluating image quality.
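For reference, the PSNR figures in Table II follow the standard definition below; the peak-value (data_range) convention is an assumption, since the paper does not restate it:

```python
import numpy as np

def psnr(ref, img, data_range=None):
    """Peak signal-to-noise ratio in dB: 20*log10(peak / sqrt(MSE))."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(img, float)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    peak = data_range if data_range is not None else ref.max() - ref.min()
    return 20.0 * np.log10(peak / np.sqrt(mse))

ref = np.zeros((16, 16))
noisy = ref + 16.0                         # constant error of 16 gray levels
p = psnr(ref, noisy, data_range=255.0)     # 20*log10(255/16), about 24.05 dB
```

SSIM, in contrast, compares local luminance, contrast, and structure within a sliding window, which is why the two metrics can rank methods differently.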
During the review process, we found two papers using similar network structures. In [38], Wolterink et al. trained three networks, i.e., GAN, CNN-MSE, and GAN-MSE, for cardiac CT denoising; their quantitative PSNR results are consistent with our counterpart results. Yu et al. [39] used GAN-VGG to handle the de-aliasing problem in fast CS-MRI, and their results are also consistent with ours. Interestingly, despite the high PSNRs obtained by the MSE-based networks, the authors of both papers claim that the GAN- and VGG-loss-based networks deliver better image quality and diagnostic information.
To gain more insight into the output images of the different approaches, we inspected their statistical properties by calculating the mean CT numbers (in Hounsfield Units) and standard deviations (SDs) of two flat regions in Figs. 17 and 37 (marked by the blue rectangles). Ideally, a noise reduction algorithm should yield a mean and SD as close as possible to those of the gold standard. In our experiments, the NDCT FBP images were used as the gold standard because they have the best image quality in this dataset. As shown in Table III, both CNN-MSE and DictRecon produced much smaller SDs than NDCT, which indicates that they over-smoothed the images and supports our visual observation. In contrast, WGAN produced the closest SDs but smaller mean values, which means it can reduce noise to the same level as NDCT but compromises the information content. On the other hand, the proposed WGAN-VGG outperformed CNN-VGG, WGAN-MSE, and the other selected methods in terms of mean CT numbers, SDs, and, most importantly, visual impression.

In addition, we performed a blind reader study on 10 groups of images. Each group contains the same image slice processed by the different methods, plus the NDCT and LDCT images for reference, which are the only two labeled images in each group. Two radiologists were asked to independently score each image (except the NDCT and LDCT references) in terms of noise suppression and artifact reduction on a five-point scale (1 = unacceptable, 5 = excellent). They were also asked to give an overall image-quality score for all the images. The means and standard deviations of the scores from the two radiologists were then computed as the final evaluation results, shown in Table IV. CNN-MSE and DictRecon achieve the best noise-suppression scores, while the proposed WGAN-VGG outperforms the other methods in artifact reduction and overall quality improvement. Also, the *-VGG networks received higher scores than the *-MSE networks for artifact reduction and overall quality, but lower scores for noise suppression.
This indicates that the MSE-loss-based networks are good at noise suppression but at the cost of image details, degrading the image quality for diagnosis. Meanwhile, the networks using WGAN give better overall image quality than those using a plain CNN, which supports the use of WGAN for CT image denoising.
III-F. VGG Feature Extractor
Since the VGG network was trained on natural images, one may be concerned about how well it extracts features from CT images. We therefore display feature maps of a normal-dose image and the corresponding quarter-dose image, along with their absolute difference, in Fig. 51. The feature map contains 512 small images, which we arrange in an array for display. Each small image emphasizes a feature of the original CT image, e.g., boundaries, edges, or whole structures. Thus, we believe the VGG network can also serve as a good feature extractor for CT images.
IV. Discussions and Conclusion
The most important motivation for this work is to approach the gold-standard NDCT images as closely as possible. As described above, the feasibility and merits of GANs have been investigated for this purpose with the Wasserstein distance and the VGG loss. The difference between using the MSE and VGG losses is significant: although networks trained with the MSE loss offer higher values for traditional figures of merit, the VGG-loss-based networks appear preferable for visual image quality, with more details and fewer artifacts.
The experimental results demonstrate that using WGAN helps improve image quality and statistical properties. Comparing the CNN-MSE and WGAN-MSE images, we can see that the WGAN framework helped avoid the over-smoothing effect typical of MSE-based image generators. Although CNN-VGG and WGAN-VGG produce visually similar results, the quantitative analysis shows that WGAN-VGG enjoys higher PSNRs and statistical properties more faithful to those of the NDCT images. However, using WGAN/GAN alone reduced noise at the expense of losing critical features; the resulting images do not show strong noise reduction, and the associated PSNR and SSIM increased only modestly compared with LDCT, remaining much lower than those of the other networks. In theory, a WGAN/GAN network is based on a generative model and may generate images that look natural yet suffer severe distortion for medical diagnosis. This is why an additional loss, such as the MSE or VGG loss, should be added to keep the image content unchanged.
It should be noted that the experimental data contain only one noise setting. The networks should be retrained or fine-tuned on different data to adapt to different noise properties. In particular, the WGAN-based networks minimize the distance between two probability distributions, so their trained parameters have to be adjusted for new datasets. Meanwhile, since the loss function of WGAN-VGG mixes a feature-domain distance with the GAN adversarial loss, the two terms should be carefully balanced for each dataset to limit the alteration of image content.
The denoising network is a typical end-to-end operation, in which the input is an LDCT image and the target is an NDCT image. Although the WGAN-VGG network generates images visually similar to their NDCT counterparts, we recognize that these generated images are still not as good as the NDCT images. Moreover, noise still exists in NDCT images, so it is possible that the VGG network has captured these noise features and kept them in the denoised images. This could be a common problem for all denoising networks. How to outperform the so-called gold-standard NDCT images is an interesting open question. Image post-denoising methods also suffer from the information loss incurred during the FBP reconstruction process, a phenomenon observed in the comparison with the DictRecon result. A better way to combine the strong fitting capability of neural networks with the completeness of CT data is to design a network that maps raw projection data directly to the final CT images, which could be the next step of our work.
In conclusion, we have proposed a deep neural network that combines the WGAN framework with a perceptual loss function for LDCT image denoising. Instead of focusing on the design of a complex network structure, we have dedicated our effort to combining synergistic loss functions that guide the denoising process so that the denoised results are as close to the gold standard as possible. Our experimental results with real clinical images show that the proposed WGAN-VGG network can effectively solve the well-known over-smoothing problem and generate images with reduced noise and increased contrast for improved lesion detection. In the future, we plan to combine the WGAN-VGG framework with more sophisticated generators, such as the networks reported in [26, 27], and to extend these networks for image reconstruction from raw data by building a neural-network counterpart of the FBP process.
References
 [1] D. J. Brenner and E. J. Hall, “Computed tomography — an increasing source of radiation exposure,” New England J. Med., vol. 357, no. 22, pp. 2277–2284, Nov. 2007.
 [2] A. B. De Gonzalez and S. Darby, “Risk of cancer from diagnostic xrays: estimates for the UK and 14 other countries,” The Lancet, vol. 363, no. 9406, pp. 345–351, Jan. 2004.

[3]
J. Wang, H. Lu, T. Li, and Z. Liang, “Sinogram noise reduction for lowdose CT by statisticsbased nonlinear filters,” in
Med. Imag. 2005: Image Process., vol. 5747. International Society for Optics and Photonics, Apr. 2005, pp. 2058–2067.  [4] J. Wang, T. Li, H. Lu, and Z. Liang, “Penalized weighted leastsquares approach to sinogram noise reduction and image reconstruction for lowdose xray computed tomography,” IEEE Trans. Med. Imag., vol. 25, no. 10, pp. 1272–1283, Oct. 2006.
 [5] A. Manduca, L. Yu, J. D. Trzasko, N. Khaylova, J. M. Kofler, C. M. McCollough, and J. G. Fletcher, “Projection space denoising with bilateral filtering and CT noise modeling for dose reduction in CT,” Med. Phys., vol. 36, no. 11, pp. 4911–4919, Nov. 2009.
 [6] M. Beister, D. Kolditz, and W. A. Kalender, “Iterative reconstruction methods in xray ct,” Physica Medica: Eur. J. Med. Phys., vol. 28, no. 2, pp. 94–108, Apr. 2012.
 [7] A. K. Hara, R. G. Paden, A. C. Silva, J. L. Kujak, H. J. Lawder, and W. Pavlicek, “Iterative reconstruction technique for reducing body radiation dose at CT: Feasibility study,” Am. J. Roentgenol., vol. 193, no. 3, pp. 764–771, Sep. 2009.
 [8] J. Ma, J. Huang, Q. Feng, H. Zhang, H. Lu, Z. Liang, and W. Chen, “Lowdose computed tomography image restoration using previous normaldose scan,” Med. Phys., vol. 38, no. 10, pp. 5713–5731, Oct. 2011.
 [9] Y. Chen, X. Yin, L. Shi, H. Shu, L. Luo, J.L. Coatrieux, and C. Toumoulin, “Improving abdomen tumor lowdose CT images using a fast dictionary learning based processing,” Phys. Med. Biol., vol. 58, no. 16, p. 5803, Aug. 2013.
 [10] P. F. Feruglio, C. Vinegoni, J. Gros, A. Sbarbati, and R. Weissleder, “Block matching 3D random noise filtering for absorption optical projection tomography,” Phys. Med. Biol., vol. 55, no. 18, p. 5401, Sep. 2010.
 [11] B. De Man and S. Basu, “Distance-driven projection and backprojection in three dimensions,” Phys. Med. Biol., vol. 49, no. 11, p. 2463, May 2004.
 [12] R. M. Lewitt, “Multidimensional digital image representations using generalized Kaiser–Bessel window functions,” J. Opt. Soc. Amer. A, vol. 7, no. 10, pp. 1834–1846, Oct. 1990.
 [13] B. R. Whiting, P. Massoumzadeh, O. A. Earl, J. A. O’Sullivan, D. L. Snyder, and J. F. Williamson, “Properties of preprocessed sinogram data in x-ray computed tomography,” Med. Phys., vol. 33, no. 9, pp. 3290–3303, Sep. 2006.
 [14] I. A. Elbakri and J. A. Fessler, “Statistical image reconstruction for polyenergetic x-ray computed tomography,” IEEE Trans. Med. Imag., vol. 21, no. 2, pp. 89–99, Feb. 2002.
 [15] S. Ramani and J. A. Fessler, “A splitting-based iterative algorithm for accelerated statistical x-ray CT reconstruction,” IEEE Trans. Med. Imag., vol. 31, no. 3, pp. 677–688, Mar. 2012.
 [16] E. Y. Sidky and X. Pan, “Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization,” Phys. Med. Biol., vol. 53, no. 17, p. 4777, Aug. 2008.
 [17] Y. Liu, J. Ma, Y. Fan, and Z. Liang, “Adaptive-weighted total variation minimization for sparse data toward low-dose x-ray computed tomography image reconstruction,” Phys. Med. Biol., vol. 57, no. 23, p. 7923, Nov. 2012.
 [18] Z. Tian, X. Jia, K. Yuan, T. Pan, and S. B. Jiang, “Low-dose CT reconstruction via edge-preserving total variation regularization,” Phys. Med. Biol., vol. 56, no. 18, p. 5949, Nov. 2011.
 [19] Q. Xu, H. Yu, X. Mou, L. Zhang, J. Hsieh, and G. Wang, “Low-dose x-ray CT reconstruction via dictionary learning,” IEEE Trans. Med. Imag., vol. 31, no. 9, pp. 1682–1697, Sep. 2012.
 [20] Y. Zhang, X. Mou, G. Wang, and H. Yu, “Tensor-based dictionary learning for spectral CT reconstruction,” IEEE Trans. Med. Imag., vol. 36, no. 1, pp. 142–154, Jan. 2017.
 [21] D. Kang, P. Slomka, R. Nakazato, J. Woo, D. S. Berman, C.-C. J. Kuo, and D. Dey, “Image denoising of low-radiation dose coronary CT angiography by an adaptive block-matching 3D algorithm,” in SPIE Med. Imag. International Society for Optics and Photonics, Mar. 2013, pp. 86692G–86692G.
 [22] G. Wang, M. Kalra, and C. G. Orton, “Machine learning will transform radiology significantly within the next 5 years,” Med. Phys., vol. 44, no. 6, pp. 2041–2044, Mar. 2017.
 [23] G. Wang, “A perspective on deep imaging,” IEEE Access, vol. 4, pp. 8914–8924, Nov. 2016.
 [24] C. Dong, C. C. Loy, K. He, and X. Tang, “Image super-resolution using deep convolutional networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 2, pp. 295–307, Feb. 2016.
 [25] H. Chen, Y. Zhang, W. Zhang, P. Liao, K. Li, J. Zhou, and G. Wang, “Low-dose CT denoising with convolutional neural network,” 2016. [Online]. Available: arXiv:1610.00321
 [26] H. Chen, Y. Zhang, M. K. Kalra, F. Lin, Y. Chen, P. Liao, J. Zhou, and G. Wang, “Low-dose CT with a residual encoder-decoder convolutional neural network,” IEEE Trans. Med. Imag., vol. 36, no. 12, pp. 2524–2535, Dec. 2017.
 [27] E. Kang, J. Min, and J. C. Ye, “A deep convolutional neural network using directional wavelets for low-dose x-ray CT reconstruction,” 2016. [Online]. Available: arXiv:1610.09736
 [28] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assisted Intervention. Springer, Nov. 2015, pp. 234–241.
 [29] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” 2016. [Online]. Available: arXiv:1603.08155
 [30] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi, “Photo-realistic single image super-resolution using a generative adversarial network,” 2016. [Online]. Available: arXiv:1609.04802
 [31] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” 2017. [Online]. Available: arXiv:1701.07875
 [32] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances Neural Inform. Process. Syst., 2014, pp. 2672–2680.
 [33] I. Goodfellow, “NIPS 2016 tutorial: Generative adversarial networks,” 2017. [Online]. Available: arXiv:1701.00160
 [34] A. Brock, T. Lim, J. M. Ritchie, and N. Weston, “Neural photo editing with introspective adversarial networks,” 2016. [Online]. Available: arXiv:1609.07093
 [35] J.-Y. Zhu, P. Krähenbühl, E. Shechtman, and A. A. Efros, “Generative visual manipulation on the natural image manifold,” in Eur. Conf. Comput. Vision. Springer, 2016, pp. 597–613.
 [36] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” 2016. [Online]. Available: arXiv:1611.07004
 [37] D. Nie, R. Trullo, C. Petitjean, S. Ruan, and D. Shen, “Medical image synthesis with context-aware generative adversarial networks,” 2016. [Online]. Available: arXiv:1612.05362
 [38] J. M. Wolterink, T. Leiner, M. A. Viergever, and I. Isgum, “Generative adversarial networks for noise reduction in low-dose CT,” IEEE Trans. Med. Imag., Dec. 2017.
 [39] S. Yu, H. Dong, G. Yang, G. Slabaugh, P. L. Dragotti, X. Ye, F. Liu, S. Arridge, J. Keegan, D. Firmin et al., “Deep de-aliasing for fast compressive sensing MRI,” 2017. [Online]. Available: arXiv:1705.07137
 [40] M. Arjovsky and L. Bottou, “Towards principled methods for training generative adversarial networks,” in NIPS 2016 Workshop on Adversarial Training, 2016.
 [41] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, “Improved training of Wasserstein GANs,” 2017. [Online]. Available: arXiv:1704.00028
 [42] M. Nixon and A. S. Aguado, Feature Extraction and Image Processing, 2nd ed. Academic Press, 2008.
 [43] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” 2014. [Online]. Available: arXiv:1409.1556
 [44] S. Srinivas, R. K. Sarvadevabhatla, K. R. Mopuri, N. Prabhu, S. S. Kruthiventi, and R. V. Babu, “A taxonomy of deep convolutional neural nets for computer vision,” Frontiers Robot. AI, vol. 2, p. 36, 2016.
 [45] AAPM, “Low dose CT grand challenge,” 2017. [Online]. Available: http://www.aapm.org/GrandChallenge/LowDoseCT/#
 [46] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” 2014. [Online]. Available: arXiv:1412.6980
 [47] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin et al., “TensorFlow: Large-scale machine learning on heterogeneous distributed systems,” 2016. [Online]. Available: arXiv:1603.04467
 [48] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in Proc. IEEE Conf. Comput. Vision Pattern Recognit., Jun. 2009, pp. 248–255.