Dual Network Architecture for Few-view CT - Trained on ImageNet Data and Transferred for Medical Imaging

07/02/2019, by Huidong Xie et al., Rensselaer Polytechnic Institute

X-ray computed tomography (CT) reconstructs cross-sectional images from projection data. However, the ionizing X-ray radiation associated with CT scanning might induce cancer and genetic damage, raising public concern, and the reduction of radiation dose has therefore attracted major attention. Few-view CT image reconstruction is an important approach to reducing the radiation dose. Recently, data-driven algorithms have shown great potential to solve the few-view CT problem. In this paper, we develop a dual network architecture (DNA) for reconstructing images directly from sinograms. In the proposed DNA method, a point-wise fully-connected layer learns the backprojection process, requiring significantly less memory than the prior art, with O(C × N × N_c) parameters, where N and N_c denote the dimension of the reconstructed images and the number of projections respectively. C is an adjustable parameter that can be set as low as 1. Our experimental results demonstrate that DNA produces competitive performance relative to other state-of-the-art methods. Interestingly, natural images can be used to pre-train DNA to avoid overfitting when the amount of real patient images is limited.


1 Introduction

Few-view CT is often mentioned in the context of tomographic image reconstruction. Because of the requirement imposed by the Nyquist sampling theorem, reconstructing high-quality CT images from under-sampled data was long considered impossible. When sufficient projection data are acquired, analytical methods such as filtered backprojection (FBP) [1] are widely used for clinical CT image reconstruction. In the few-view CT setting, severe streak artifacts appear in these analytically reconstructed images due to the incompleteness of the projection data. To overcome this issue, various iterative techniques, which can incorporate prior knowledge into the image reconstruction, were proposed over the past decades. Well-known methods include the algebraic reconstruction technique (ART) [2], the simultaneous algebraic reconstruction technique (SART) [3], expectation maximization (EM) [4], etc. Nevertheless, these iterative methods are time-consuming and still unable to produce satisfactory results in many cases. Recently, the development of graphics processing unit (GPU) technology and the availability of big data have allowed researchers to train deep neural networks in an acceptable amount of time. Therefore, deep learning has become a new frontier of CT reconstruction research [5, 6, 7].

In the literature, only a few deep learning methods have been proposed for reconstructing images directly from raw data. Zhu et al. [8] use fully-connected layers to learn the mapping from raw k-space data to a corresponding reconstructed MRI image. There is no doubt that fully-connected layers can be used to learn the mapping from the sinogram domain to the image domain. However, importing whole sinograms into the network requires a significant amount of memory and poses a major challenge to training the network for a full-size CT image/volume on a single consumer-level GPU such as an NVIDIA Titan Xp. A recently proposed method, iCT-Net [9], reduces the computational complexity from $O(N^4)$ in [8] to $O(N^2 \times N_c)$, where $N$ and $N_c$ denote the size of a medical image and the number of projections respectively. However, one consumer-level GPU is still not able to handle the iCT-Net.

In this study, we propose a dual network architecture (DNA) for CT image reconstruction, which reduces the required number of parameters from $O(N^2 \times N_c)$ in iCT-Net to $O(C \times N \times N_c)$, where $C$ is an adjustable hyper-parameter much less than $N$, which can even be set as low as 1. Theoretically, the larger the $C$, the better the performance. The proposed network is trainable on one consumer-level GPU such as an NVIDIA Titan Xp or NVIDIA 1080 Ti. The proposed DNA is inspired by the FBP formulation and learns a refined filtered backprojection process for reconstructing images directly from sinograms. For X-ray CT, every single point in the sinogram domain only relates to the pixels/voxels on an X-ray path through the field of view. With this intuition, the reconstruction process of DNA is learned in a point-wise manner, which is the key ingredient of DNA to alleviate the memory burden. Also, an insufficient training dataset is another major issue in deep imaging. Inspired by [8], we first pre-train the network using natural images from ImageNet [10] and then fine-tune the model using real patient data. To the best of our knowledge, this is the first work using ImageNet images to pre-train a medical CT image reconstruction network. In the next section, we present a detailed explanation of the proposed DNA network. In the third section, we describe the experimental design, training data and reconstruction results. Finally, in the last section, we discuss relevant issues and conclude the paper.
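
To put these complexity figures in perspective, a quick back-of-the-envelope comparison can be made. The numbers below are purely illustrative (assuming, say, 256-pixel-wide images, 49 views, and C = 23, the branch count used later in this paper), not measured network sizes.

```python
# Rough parameter-count comparison of learned sinogram-to-image mappings.
# Illustrative values only; these are not the actual layer sizes of any network.
N = 256      # image width/height
N_c = 49     # number of projection views
C = 23       # number of branches in the point-wise fully-connected layer

full_fc = (N * N_c) * (N * N)   # one dense layer: whole sinogram -> whole image
ict_like = N * N * N_c          # O(N^2 * N_c), as attributed to iCT-Net above
dna_like = C * N * N_c          # O(C * N * N_c), the proposed point-wise layer

print(f"full fully-connected mapping: {full_fc:,}")
print(f"O(N^2 * N_c):                 {ict_like:,}")
print(f"O(C * N * N_c):               {dna_like:,}")
```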

2 Methods

This section presents the proposed dual network architecture and the objective function.

2.1 Dual network architecture (DNA)

DNA consists of two generator networks, $G_1$ and $G_2$. The input to $G_1$ is a batch of few-view sinograms. According to the Fourier slice theorem, low-frequency information is sampled more densely than high-frequency information. Therefore, if backprojection is performed directly, the reconstructed images become blurry. A ramp filter is usually applied to the projections to avoid this blurring effect. In DNA, filtration is performed on the sinogram in the Fourier domain through multiplication with a filter whose length is twice that of the sinogram (which can be shortened). Then, the filtered projections are fed into the first network. $G_1$ tries to learn a filtered backprojection algorithm and outputs an intermediate image. $G_2$ then further optimizes the output from $G_1$, generating the final image.
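
For reference, classical Fourier-domain ramp filtering of a sinogram can be sketched in a few lines of numpy; this is only the textbook operation that the filtration step is modeled after, shown here for context rather than the learned filtration itself.

```python
import numpy as np

def ramp_filter_sinogram(sinogram):
    """Filter each projection (row) of a sinogram with a ramp filter in the Fourier domain.

    sinogram: array of shape (n_views, n_detectors). Illustrative only; in DNA the
    filtration is learned with 1-D convolutional layers rather than fixed.
    """
    n_views, n_det = sinogram.shape
    pad = 2 * n_det                          # filter length twice the projection length
    ramp = np.abs(np.fft.fftfreq(pad))       # ideal ramp filter |w|
    filtered = np.zeros_like(sinogram, dtype=float)
    for i in range(n_views):
        spectrum = np.fft.fft(sinogram[i], n=pad)
        filtered[i] = np.real(np.fft.ifft(spectrum * ramp))[:n_det]
    return filtered
```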

$G_1$ can be divided into three components: filtration, backprojection, and refinement. In the filtration part, 1-D convolutional layers are used to produce filtered data. Theoretically, the filter for a continuous signal is infinitely long, which is not practical in reality. The filter length is here set as twice the length of a projection vector (which can be further shortened). Since the filtration is done through a multi-layer CNN, different layers can learn different parts of the filter. Therefore, the 1-D convolutional window is empirically set as the length of the projection vector to reduce the computational burden. Residual connections are used to preserve high-resolution information and to prevent gradients from vanishing.

Next, the learned sinogram from the filtration part is backprojected. The backprojection part of $G_1$ is inspired by the following intuition: every point in the filtered projection vector only relates to the pixel values on the X-ray path through the corresponding object image, and any other data points in this vector contribute nothing to the pixels on this X-ray path. There is no doubt that a single fully-connected layer could learn the mapping from the sinogram domain to the image domain, but its memory requirement becomes an issue due to the extremely large matrix multiplications in such a layer. To reduce the memory burden, the reconstruction process is learned in a point-wise manner using a point-wise fully-connected layer. By doing so, DNA can truly learn the backprojection process. The input to the point-wise fully-connected layer is a single point in the filtered projection vector, and the number of neurons is the width of the corresponding image. After this point-wise fully-connected layer, rotation and summation operations are applied to simulate the analytical FBP method. Bilinear interpolation [11] is used for rotating images. Moreover, $C$ is empirically set as 23, allowing the network to learn multiple mappings from the sinogram domain to the image domain; $C$ can be understood as the number of branches. Note that different view-angles use different parameters. Although the proposed filtration and backprojection parts together learn a refined FBP method, streak artifacts cannot be eliminated perfectly. The image reconstructed by the backprojection part is therefore fed into the last portion of $G_1$ for refinement.
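
The following numpy sketch illustrates the idea of the point-wise fully-connected backprojection for a single branch ($C = 1$). The shapes, names and the use of scipy's bilinear rotation are assumptions made for illustration; this is not the trained layer itself.

```python
import numpy as np
from scipy.ndimage import rotate  # order=1 gives bilinear interpolation

def pointwise_backprojection(filtered_sino, weights, angles_deg):
    """Toy version of the point-wise fully-connected backprojection (one branch).

    filtered_sino: (n_views, n_det) filtered projections, with n_det equal to the
                   image width (square image assumed).
    weights:       (n_views, n_det, n_det) learnable weights; each detector sample at
                   each view has its own small fully-connected layer producing one row.
    angles_deg:    projection angles in degrees.
    """
    n_views, n_det = filtered_sino.shape
    recon = np.zeros((n_det, n_det))
    for v in range(n_views):
        # Each sinogram point is spread along one image row by its own weights.
        smear = filtered_sino[v][:, None] * weights[v]             # (n_det, n_det)
        # Rotate the smeared image to the view angle and accumulate (summation step).
        recon += rotate(smear, angles_deg[v], reshape=False, order=1)
    return recon
```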

The refinement part is a typical U-net [12] with conveying paths and is built with the ResNeXt [13] structure. U-net was originally designed for biomedical image segmentation and has been utilized in various applications. For example, Refs. [14, 15, 16] use U-net with conveying paths for CT image denoising, Refs. [17, 18] for the few-view CT problem, and Ref. [19] for compressed sensing MRI, etc. The U-net in DNA contains 4 down-sampling and 4 up-sampling layers, each with a stride of 2 and followed by a rectified linear unit (ReLU). The same kernel size is used in both convolutional and transpose-convolutional layers. The number of kernels in each layer is 36. To maintain the tensor size, zero-padding is used.
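
A minimal Keras sketch of such a refinement module is given below; the 3×3 kernel size and the 256×256 input shape are assumptions, and the ResNeXt-style grouped convolutions are omitted for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers

def refinement_unet(input_shape=(256, 256, 1), filters=36, kernel_size=3):
    """U-net with 4 stride-2 down-sampling and 4 stride-2 up-sampling layers,
    ReLU activations, zero ("same") padding and conveying (skip) connections."""
    inputs = layers.Input(shape=input_shape)
    x, skips = inputs, [inputs]
    for _ in range(4):                                   # down-sampling path
        x = layers.Conv2D(filters, kernel_size, strides=2,
                          padding="same", activation="relu")(x)
        skips.append(x)
    for skip in reversed(skips[:-1]):                    # up-sampling path
        x = layers.Conv2DTranspose(filters, kernel_size, strides=2,
                                   padding="same", activation="relu")(x)
        x = layers.Concatenate()([x, skip])              # conveying path
    outputs = layers.Conv2D(1, 1, padding="same")(x)     # back to a single channel
    return tf.keras.Model(inputs, outputs)
```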

$G_2$ uses the same structure as the refinement part in $G_1$. The input to $G_2$ is a concatenation of the FBP result and the output from $G_1$. With the use of $G_2$, the overall network becomes deeper, so the benefits of deep learning can be better utilized in this direct mapping for CT image reconstruction.

2.2 Objective functions

As shown in Fig. 1, DNA is optimized in the generative adversarial network (GAN) framework [20], which is one of the most advanced frameworks in the field. In this study, the proposed framework contains three components: the two generator networks $G_1$ and $G_2$ introduced in Subsection 2.1, and a discriminator network $D$. $G_1$ and $G_2$ aim at reconstructing images directly from a batch of few-view sinograms. $D$ receives images from $G_1$, $G_2$ or the ground-truth dataset, and tries to distinguish whether an image is real (from the ground-truth dataset) or fake (from either $G_1$ or $G_2$). All networks optimize themselves during the training process. If an optimized discriminator can hardly distinguish fake images from real images, then the generators $G_1$ and $G_2$ can fool the discriminator $D$, which is the goal of GAN training. By design, the discriminator also helps to improve the texture of the final image and prevents over-smoothing.

Different from the vanilla generative adversarial network (GAN) [21], the Wasserstein GAN (WGAN) replaces the cross-entropy loss with the Wasserstein distance, improving stability during training. In the WGAN framework, the discriminator is assumed to be a 1-Lipschitz function, enforced with weight clipping. However, Ref. [22] points out that weight clipping may be problematic in WGAN and suggests replacing it with a gradient penalty term, which is implemented in our proposed framework. Hence, the objective function of the proposed WGAN framework is expressed as follows:

$$\min_{\theta_D}\; \mathbb{E}_{x}\big[D\big(G_1(x)\big)\big] + \mathbb{E}_{x}\big[D\big(G_2(x)\big)\big] - 2\,\mathbb{E}_{y}\big[D(y)\big] + \lambda\,\mathbb{E}_{\bar{x}}\Big[\big(\big\lVert \nabla_{\bar{x}} D(\bar{x}) \big\rVert_2 - 1\big)^{2}\Big] \qquad (1)$$

where $x$, $G(x)$ and $y$ represent a sparse-view sinogram, an image reconstructed by a generator from the sparse-view sinogram, and the ground-truth image reconstructed from the full-view projection data respectively. $\mathbb{E}_{a}[b]$ denotes the expectation of $b$ as a function of $a$. $\theta_{G_1}$, $\theta_{G_2}$ and $\theta_D$ represent the trainable parameters of $G_1$, $G_2$ and $D$ respectively. $\bar{x}$ represents images interpolated between fake (from either $G_1$ or $G_2$) and real (from the ground-truth dataset) images. $\nabla_{\bar{x}} D(\bar{x})$ denotes the gradient of $D$ with respect to $\bar{x}$. The parameter $\lambda$ balances the Wasserstein distance terms and the gradient penalty term. As suggested in Refs. [20, 23, 22], $G_1$, $G_2$ and $D$ are updated iteratively.
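
A minimal TensorFlow sketch of the gradient penalty term in Equation (1), assuming a Keras discriminator and NHWC image batches of identical shape, could look like this.

```python
import tensorflow as tf

def gradient_penalty(discriminator, real_images, fake_images):
    """WGAN-GP penalty: (||grad_x_bar D(x_bar)||_2 - 1)^2 averaged over the batch."""
    batch_size = tf.shape(real_images)[0]
    eps = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    x_bar = eps * real_images + (1.0 - eps) * fake_images   # interpolated images
    with tf.GradientTape() as tape:
        tape.watch(x_bar)
        scores = discriminator(x_bar, training=True)
    grads = tape.gradient(scores, x_bar)
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return tf.reduce_mean(tf.square(norms - 1.0))
```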

The objective function for optimizing the generator networks involves the mean squared error (MSE) [16, 24], the structural similarity index (SSIM) [25, 26] and an adversarial loss [27, 28]. MSE is a popular choice for denoising applications [29]; it effectively suppresses background noise but can result in over-smoothed images [30]. Moreover, MSE is insensitive to image texture since it assumes the background noise is white Gaussian noise, independent of local image features. The MSE loss is expressed as follows:

$$L_{\mathrm{MSE}} = \frac{1}{N_b W H}\sum_{i=1}^{N_b}\big\lVert Y_i - G(X_i)\big\rVert_2^2 \qquad (2)$$

where $N_b$, $W$ and $H$ denote the batch size, the image width and the image height respectively. $Y$ and $G(X)$ represent the ground-truth image and the image reconstructed by a generator network (either $G_1$ or $G_2$) respectively.

To compensate for the disadvantages of MSE and obtain visually better images, SSIM is introduced into the objective function. The SSIM formula is expressed as follows:

$$\mathrm{SSIM}\big(Y, G(X)\big) = \frac{\big(2\mu_Y \mu_{G(X)} + C_1\big)\big(2\sigma_{Y G(X)} + C_2\big)}{\big(\mu_Y^2 + \mu_{G(X)}^2 + C_1\big)\big(\sigma_Y^2 + \sigma_{G(X)}^2 + C_2\big)} \qquad (3)$$

where $C_1$ and $C_2$ are constants used to stabilize the formula when the denominator is small, $L$ stands for the dynamic range of pixel values, $C_1 = (k_1 L)^2$ and $C_2 = (k_2 L)^2$. $\mu_Y$, $\mu_{G(X)}$, $\sigma_Y^2$, $\sigma_{G(X)}^2$ and $\sigma_{Y G(X)}$ are the means of $Y$ and $G(X)$, the variances of $Y$ and $G(X)$, and the covariance between $Y$ and $G(X)$, respectively. Then, the structural loss is expressed as follows:

$$L_{\mathrm{SSIM}} = 1 - \mathrm{SSIM}\big(Y, G(X)\big) \qquad (4)$$
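
In TensorFlow, this structural loss can be written in one line with the built-in SSIM, assuming image tensors normalized to [0, 1].

```python
import tensorflow as tf

def structural_loss(y_true, y_pred):
    """1 - SSIM, as in Eq. (4); expects NHWC tensors with pixel values in [0, 1]."""
    return 1.0 - tf.reduce_mean(tf.image.ssim(y_true, y_pred, max_val=1.0))
```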

Furthermore, the adversarial loss aims to help the generators produce sharp images that are indistinguishable to the discriminator network. Referring to Equation (1), the adversarial loss for $G_1$ is expressed as follows:

$$L_{\mathrm{adv}}^{G_1} = -\,\mathbb{E}_{x}\big[D\big(G_1(x)\big)\big] \qquad (5)$$

and the adversarial loss for $G_2$ is expressed as follows:

$$L_{\mathrm{adv}}^{G_2} = -\,\mathbb{E}_{x}\big[D\big(G_2(x)\big)\big] \qquad (6)$$

As mentioned earlier, solving the few-view CT problem is similar to solving a set of linear equations when the number of equations is not sufficient to uniquely resolve all the unknowns. The intuition behind DNA is to estimate those unknowns as closely as possible by combining the information from the existing equations with the knowledge hidden in big data. The recovered unknowns should still satisfy the equations we have. Therefore, the MSE between the original sinogram and the sinogram synthesized from a reconstructed image (from either $G_1$ or $G_2$) is also included as part of the objective function, which is expressed as follows:

$$L_{\mathrm{sino}} = \frac{1}{N_b N_v H_s}\sum_{i=1}^{N_b}\big\lVert S_i - \hat{S}_i\big\rVert_2^2 \qquad (7)$$

where $N_b$, $N_v$ and $H_s$ denote the batch size, the number of views and the sinogram height respectively. $S$ represents the original sinogram and $\hat{S}$ represents the sinogram computed from a reconstructed image (from either $G_1$ or $G_2$).

Both generator networks are updated at the same time. The overall objective function for the two generators is expressed as follows:

$$L_{G} = \big(L_{\mathrm{MSE}}^{(1)} + L_{\mathrm{MSE}}^{(2)}\big) + \lambda_1\big(L_{\mathrm{SSIM}}^{(1)} + L_{\mathrm{SSIM}}^{(2)}\big) + \lambda_2\big(L_{\mathrm{sino}}^{(1)} + L_{\mathrm{sino}}^{(2)}\big) + \lambda_3\big(L_{\mathrm{adv}}^{G_1} + L_{\mathrm{adv}}^{G_2}\big) \qquad (8)$$

where the superscripts $(1)$ and $(2)$ indicate that the term is computed between ground-truth images and the results reconstructed by $G_1$ and $G_2$ respectively. $\lambda_1$, $\lambda_2$ and $\lambda_3$ are hyper-parameters used to balance the different loss terms.
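
A sketch of how such a composite generator objective might be assembled is shown below. The loss weights, the differentiable forward projector `radon_tf` and the discriminator handle `d_net` are placeholders, not the values or implementations used in the paper.

```python
import tensorflow as tf

def generator_loss(y_true, g1_out, g2_out, sino_true, radon_tf, d_net,
                   lam_ssim=0.5, lam_sino=1.0, lam_adv=1e-3):
    """Composite generator objective in the spirit of Eq. (8), summed over G1 and G2."""
    mse = lambda a, b: tf.reduce_mean(tf.square(a - b))
    ssim_loss = lambda a, b: 1.0 - tf.reduce_mean(tf.image.ssim(a, b, max_val=1.0))

    total = 0.0
    for out in (g1_out, g2_out):
        total += mse(y_true, out)                                        # image-domain MSE
        total += lam_ssim * ssim_loss(y_true, out)                       # structural loss
        total += lam_sino * mse(sino_true, radon_tf(out))                # sinogram-domain MSE
        total += lam_adv * (-tf.reduce_mean(d_net(out, training=True)))  # adversarial loss
    return total
```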

Figure 1: Workflow of the proposed method. Images are example outputs from a 49-view sinogram. The display window is [-160, 240] HU.

2.3 Discriminator network

The discriminator network $D$ of the proposed method takes inputs from $G_1$, $G_2$ and the ground-truth dataset, and tries to distinguish whether each input is real or fake. In DNA, the discriminator network has 6 convolutional layers with 64, 64, 128, 128, 256 and 256 filters, followed by 2 fully-connected layers with 1,024 and 1 neurons respectively. The leaky ReLU activation function is used after each layer with a slope of 0.2 in the negative part. The same kernel size and zero-padding are used for all the convolutional layers, with a stride of 1 for odd layers and 2 for even layers.
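
A Keras sketch of this discriminator is given below; the 3×3 kernel size and the 256×256 input shape are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator(input_shape=(256, 256, 1), kernel_size=3):
    """Six convolutional layers (64, 64, 128, 128, 256, 256 filters) with alternating
    strides of 1 and 2, leaky ReLU (slope 0.2), zero padding, then dense layers with
    1,024 and 1 neurons; no sigmoid, since it acts as a Wasserstein critic."""
    inputs = layers.Input(shape=input_shape)
    x = inputs
    for i, filters in enumerate([64, 64, 128, 128, 256, 256]):
        stride = 1 if i % 2 == 0 else 2          # stride 1 for odd layers, 2 for even
        x = layers.Conv2D(filters, kernel_size, strides=stride, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(1024)(x)
    x = layers.LeakyReLU(0.2)(x)
    outputs = layers.Dense(1)(x)
    return tf.keras.Model(inputs, outputs)
```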

3 Experiments and Results

3.1 Experimental design

The dataset is the clinical patient dataset generated and authorized by the Mayo Clinic for “the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge” [31]. The dataset contains a total of 5,936 abdominal CT images selected by the Mayo Clinic with 1 mm slice thickness. Pixel values of the patient images were normalized between 0 and 1. In this dataset, 9 patients (5,410 images) were selected for training/validation while 1 patient (526 images) was selected for testing. As mentioned earlier, DNA was first pre-trained using natural images from ImageNet. During the pre-training stage, a total of 120,000 images were randomly selected from ImageNet, among which 114,000 images were used for training/validation and the remaining 6,000 images were used for testing. Pixel values of the ImageNet images were also normalized between 0 and 1. All pixel values outside a prescribed circle were set to 0, and all images were resized to the same dimensions. The Radon transform was used to simulate few-view projection measurements: 39-view and 49-view sinograms were synthesized from angles equally distributed over a full scan range.
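
The few-view data simulation can be reproduced along these lines with scikit-image; the 256×256 matrix size and the 180-degree parallel-beam angular range are assumptions made for illustration.

```python
import numpy as np
from skimage.transform import radon, resize

def simulate_few_view_sinogram(image, n_views=49, size=256):
    """Normalize an image to [0, 1], mask pixels outside the inscribed circle and
    simulate an equally-spaced few-view sinogram with the Radon transform."""
    img = resize(image.astype(float), (size, size), anti_aliasing=True)
    img = (img - img.min()) / (img.max() - img.min() + 1e-12)
    yy, xx = np.mgrid[:size, :size]
    center = (size - 1) / 2.0
    img[(yy - center) ** 2 + (xx - center) ** 2 > center ** 2] = 0.0
    angles = np.linspace(0.0, 180.0, n_views, endpoint=False)   # equally distributed views
    return radon(img, theta=angles, circle=True)
```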

A batch size of 10 was selected for training. All experimental code was implemented in the TensorFlow framework [32] on an NVIDIA Titan Xp GPU. The Adam optimizer [33] was used to optimize the parameters. We compared DNA with 3 state-of-the-art deep learning methods: LEARN [34], the sinogram-synthesis U-net [18], and iCT-Net [9]. To the best of our knowledge, the network settings were kept the same as the default settings described in the original papers.

For the LEARN network, the number of iterations was set to 50, and the numbers of filters for the three layers were set to 48, 48 and 1 respectively, with the convolutional kernel size following the original configuration. The initial input to the network was the FBP result. The same preprocessed Mayo dataset described above was used to train the LEARN network. Please note that the amount of data we used to train the LEARN network was much larger than that in the original LEARN paper.

For the sinogram-synthesis U-net, 720-view sinograms were simulated using the Radon transform. Then, the simulated sinograms were cropped into patches with a stride of 10 for training. The FBP method was applied to the synthesized sinograms to generate the final images.

The training process of iCT-Net is divided into two stages. In the first stage, the first 9 layers were pre-trained with projection data. In the second stage, the pre-trained iCT-Net was trained end-to-end using the patient data. In the original iCT-Net paper, a total of 58 CT examinations acquired under the same conditions were used for the stage-2 training. However, since we do not have that dataset, the limited Mayo images might have caused overfitting in the stage-2 training when we attempted to replicate their results. Therefore, for a fair comparison, the testing images were included in the training stages. Please note that iCT-Net was trained on 2 NVIDIA Titan Xp GPUs.

3.2 Visual and quantitative assessment

To visualize the performance of different methods, a few representative slices were selected from the testing dataset. Figure 2 shows results reconstructed using different methods from 49-view real patient sinograms. Table 1 shows the number of parameters used for these methods.

| | LEARN | Sino-syn U-net | iCT-Net | DNA (49 views) | DNA (39 views) |
|---|---|---|---|---|---|
| No. of parameters | 3,004,900 | 47,118,017 | 16,933,929 | 1,962,101 | 1,844,341 |

Table 1: The number of parameters used in different methods.
Figure 2: Representative images reconstructed using different methods. The display window is [-160, 240] HU for patient images. (a) The ground-truth, (b) FBP, (c) LEARN, (d) sinogram-synthesis U-net, (e) iCT-Net, (f) DNA. (g)-(l) The zoomed regions of the first row marked by the blue boxes. More fine details were recovered by the proposed method. The red arrows point to some small details that are better reconstructed by DNA.

Three metrics, PSNR, SSIM and root-mean-square error (RMSE), were selected for quantitative assessment. Table 2 shows the quantitative measurements for the different methods, obtained by averaging each metric over the testing dataset, for both the 39-view and 49-view results.

Table 2: Quantitative measurements (SSIM, PSNR and RMSE) for the 49-view and 39-view cases, comparing LEARN, the sinogram-synthesis U-net, iCT-Net and DNA. We did not test iCT-Net for the 39-view case. Measurements are acquired by averaging the values over the testing dataset.
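
For reference, the three metrics can be computed per image pair as sketched below (scikit-image implementations assumed, images normalized to [0, 1]) and then averaged over the testing dataset.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(recon, reference, data_range=1.0):
    """Return (PSNR, SSIM, RMSE) for one reconstructed/ground-truth image pair."""
    psnr = peak_signal_noise_ratio(reference, recon, data_range=data_range)
    ssim = structural_similarity(reference, recon, data_range=data_range)
    rmse = np.sqrt(np.mean((reference - recon) ** 2))
    return psnr, ssim, rmse
```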

3.3 Examination of intermediate results

To demonstrate the effectiveness of the two generators in DNA, another network was trained using solely the FBP results as the input. Figure 3 shows typical results reconstructed from 49-view sinograms using the various configurations.

Figure 3: Representative outputs of different configurations. (a) The ground-truth, (b) FBP, (c) the network trained using only FBP results as the input, (d) $G_1$, and (e) DNA. The display window is [-160, 240] HU for patient images.

As shown in the first row of Figure 3, streak artifacts cannot be perfectly eliminated when the network is trained using only the FBP results. $G_1$, on the other hand, eliminates these artifacts through learning from the projection data (first row, red arrows). Moreover, as mentioned earlier, $G_2$ is designed to assist $G_1$ in producing better results. This effect can be observed in the second row of Figure 3. $G_1$ removes artifacts that cannot be removed from the FBP results (second row, red arrow), but it introduces new artifacts (second row, orange arrow). These artifacts can then be removed by $G_2$. In summary, $G_1$ helps remove artifacts that could not be removed by processing FBP results alone; even though it brings up new artifacts, the newly introduced artifacts can be removed by the second generator network. Put differently, the proposed method, DNA, combines the best of both worlds. Quantitative measurements of the outputs reconstructed by the various components of DNA are listed in Table 3.

Table 3: Quantitative measurements (SSIM, PSNR and RMSE, 49-view case) for different components of DNA: the network trained using only FBP results, $G_1$ alone, and the full DNA. Measurements are acquired by averaging the values over the testing dataset.

3.4 Generalizability analysis

To demonstrate that the proposed method truly learns the backprojection process and can be generalized to other datasets, DNA and LEARN (the second-best method) were tested directly on female breast CT datasets acquired on a breast CT scanner (Koning Corporation). 4,968 CT images, scanned at 49 kVp (peak kilovoltage), were acquired from 12 patients. There are a total of 3 sets of images per patient, reconstructed from 300, 150 and 100 projections respectively. The Koning images were reconstructed using commercial FBP with additional post-processing. Figure 4 shows representative images reconstructed using different methods, and Table 4 gives the corresponding quantitative measurements. Completely dark images in the dataset were excluded, resulting in a total of 4,635 CT images.

DNA demonstrates outstanding generalizability, as shown in Figure 4. Specifically, images reconstructed using LEARN appear over-smoothed and lose some details. On the other hand, DNA not only preserves more details than LEARN but also suppresses streak artifacts effectively, relative to the 150-view and 100-view results. Moreover, the images reconstructed by DNA from 49-view sinograms are better than the 100-view images in terms of SSIM and RMSE.

Figure 4: Representative outputs of different methods on the Koning breast CT dataset. (a) Koning scanner (300-view), (b) Koning scanner (150-view), (c) Koning scanner (100-view), (d) LEARN (49-view), and (e) DNA (49-view). The display window is [-300, 300] HU.
Table 4: Quantitative measurements (SSIM, PSNR and RMSE) for Koning breast images reconstructed using different methods: Koning commercial FBP from 150 and 100 views, LEARN from 49 views, and DNA from 49 views. Measurements were calculated with respect to the 300-view results and acquired by averaging the values over the testing dataset.

Also, we validated DNA and LEARN on the Massachusetts General Hospital (MGH) dataset [35]. The MGH dataset contains 40 cadaver scans acquired with representative protocols. Each cadaver was scanned on a GE Discovery HD 750 scanner at 4 different dose levels. Only the 10 NI (noise index) images were used for testing; NI refers to the standard deviation of CT numbers within a region of interest in a water phantom of a specific size [36]. Typical images are shown in Figure 5, and the corresponding quantitative measurements are in Table 5.

Figure 5: Representative outputs using different methods for the MGH dataset from 49-view sinograms. (a) and (d) Ground-truth, (b) and (e) LEARN, (c) and (f) DNA. The display window is [-300, 300] HU.
Table 5: Quantitative measurements (SSIM, PSNR and RMSE) for the MGH dataset, comparing LEARN (49-view) and DNA (49-view). Measurements were acquired by averaging the values over the testing dataset.

4 Conclusion

Although the field of deep imaging is still in its early stage, remarkable results have been achieved over the past several years. We envision that deep learning will play an important role in the field of tomographic imaging [37]. In this direction, we have developed the novel DNA network for reconstructing CT images directly from sinograms. Even though the proposed method has only been tested on the few-view CT problem in this paper, we believe that it can be applied/adapted to various other CT problems, including image denoising, limited-angle CT, and so on. This paper is not the first work on reconstructing images directly from raw data, but previously proposed methods require a significantly greater amount of GPU memory for training. We underline that our proposed method solves the memory issue by learning the reconstruction process with the point-wise fully-connected layer and other proper network ingredients. Also, by passing only a single point into the fully-connected layer, the proposed method can truly learn the backprojection process. In our study, the DNA network demonstrates superior performance and generalizability. In future work, we will validate the proposed method on images of larger dimensions.

References

  • [1] Wang, G., Ye, Y., and Yu, H., “Approximate and exact cone-beam reconstruction with standard and non-standard spiral scanning,” Phys Med Biol 52, R1–13 (Mar. 2007).
  • [2] Gordon, R., Bender, R., and Herman, G. T., “Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and x-ray photography.,” Journal of theoretical biology 29(3), 471–481 (1970).
  • [3] Andersen, A., “Simultaneous Algebraic Reconstruction Technique (SART): A superior implementation of the ART algorithm,” Ultrasonic Imaging 6, 81–94 (Jan. 1984).
  • [4] Dempster, A. P., Laird, N. M., and Rubin, D. B., “Maximum Likelihood from Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society. Series B (Methodological) 39(1), 1–38 (1977).
  • [5] Wang, G., “A Perspective on Deep Imaging,” IEEE Access 4, 8914–8924 (2016).
  • [6] Wang, G., Butler, A., Yu, H., and Campbell, M., “Guest Editorial Special Issue on Spectral CT,” IEEE Transactions on Medical Imaging 34, 693–696 (Mar. 2015).
  • [7] Wang, G., Ye, J. C., Mueller, K., and Fessler, J. A., “Image Reconstruction is a New Frontier of Machine Learning,” IEEE Trans Med Imaging 37(6), 1289–1296 (2018).
  • [8] Zhu, B., Liu, J. Z., Cauley, S. F., Rosen, B. R., and Rosen, M. S., “Image reconstruction by domain-transform manifold learning,” Nature 555, 487–492 (Mar. 2018).
  • [9] Li, Y., Li, K., Zhang, C., Montoya, J., and Chen, G., “Learning to Reconstruct Computed Tomography (CT) Images Directly from Sinogram Data under A Variety of Data Acquisition Conditions,” IEEE Transactions on Medical Imaging, 1–1 (2019).
  • [10] “ImageNet.”
  • [11] Gribbon, K. T. and Bailey, D. G., “A novel approach to real-time bilinear interpolation,” in [Proceedings. DELTA 2004. Second IEEE International Workshop on Electronic Design, Test and Applications ], 126–131 (Jan. 2004).
  • [12] Ronneberger, O., Fischer, P., and Brox, T., “U-Net: Convolutional Networks for Biomedical Image Segmentation,” arXiv:1505.04597 [cs] (May 2015). arXiv: 1505.04597.
  • [13] Xie, S., Girshick, R., Dollár, P., Tu, Z., and He, K., “Aggregated Residual Transformations for Deep Neural Networks,” arXiv:1611.05431 [cs] (Nov. 2016). arXiv: 1611.05431.
  • [14] Shan, H., Zhang, Y., Yang, Q., Kruger, U., Kalra, M. K., Sun, L., Cong, W., and Wang, G., “3-D Convolutional Encoder-Decoder Network for Low-Dose CT via Transfer Learning From a 2-D Trained Network,” IEEE Transactions on Medical Imaging 37, 1522–1534 (June 2018).
  • [15] Shan, H., Padole, A., Homayounieh, F., Kruger, U., Khera, R. D., Nitiwarangkul, C., Kalra, M. K., and Wang, G., “Competitive performance of a modularized deep neural network compared to commercial algorithms for low-dose CT image reconstruction,” Nature Machine Intelligence 1, 269–276 (June 2019).
  • [16] Chen, H., Zhang, Y., Zhang, W., Liao, P., Li, K., Zhou, J., and Wang, G., “Low-dose CT via convolutional neural network,” Biomed Opt Express 8, 679–694 (Jan. 2017).
  • [17] Jin, K. H., McCann, M. T., Froustey, E., and Unser, M., “Deep Convolutional Neural Network for Inverse Problems in Imaging,” IEEE Transactions on Image Processing 26, 4509–4522 (Sept. 2017).
  • [18] Lee, H., Lee, J., Kim, H., Cho, B., and Cho, S., “Deep-Neural-Network-Based Sinogram Synthesis for Sparse-View CT Image Reconstruction,” IEEE Transactions on Radiation and Plasma Medical Sciences 3, 109–119 (Mar. 2019).
  • [19] Quan, T. M., Nguyen-Duc, T., and Jeong, W.-K., “Compressed Sensing MRI Reconstruction using a Generative Adversarial Network with a Cyclic Loss,” IEEE Transactions on Medical Imaging 37, 1488–1497 (June 2018). arXiv: 1709.00753.
  • [20] Arjovsky, M., Chintala, S., and Bottou, L., “Wasserstein Generative Adversarial Networks,” in [International Conference on Machine Learning ], 214–223 (July 2017).
  • [21] Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y., “Generative Adversarial Networks,” arXiv:1406.2661 [cs, stat] (June 2014). arXiv: 1406.2661.
  • [22] Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. C., “Improved Training of Wasserstein GANs,” in [Advances in Neural Information Processing Systems 30 ], Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., eds., 5767–5777, Curran Associates, Inc. (2017).
  • [23] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y., “Generative Adversarial Nets,” in [Advances in Neural Information Processing Systems 27 ], Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. D., and Weinberger, K. Q., eds., 2672–2680, Curran Associates, Inc. (2014).
  • [24] Wolterink, J. M., Leiner, T., Viergever, M. A., and Išgum, I., “Generative Adversarial Networks for Noise Reduction in Low-Dose CT,” IEEE Transactions on Medical Imaging 36, 2536–2545 (Dec. 2017).
  • [25] Zhou Wang, Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P., “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing 13, 600–612 (Apr. 2004).
  • [26] You, C., Yang, Q., shan, H., Gjesteby, L., Li, G., Ju, S., Zhang, Z., Zhao, Z., Zhang, Y., Cong, W., and Wang, G., “Structurally-Sensitive Multi-Scale Deep Neural Network for Low-Dose CT Denoising,” IEEE Access 6, 41839–41855 (2018).
  • [27] Wu, D., Kim, K., Fakhri, G. E., and Li, Q., “A Cascaded Convolutional Neural Network for X-ray Low-dose CT Image Denoising,” arXiv:1705.04267 [cs, stat] (May 2017). arXiv: 1705.04267.
  • [28] Yang, Q., Yan, P., Zhang, Y., Yu, H., Shi, Y., Mou, X., Kalra, M. K., Zhang, Y., Sun, L., and Wang, G., “Low-Dose CT Image Denoising Using a Generative Adversarial Network With Wasserstein Distance and Perceptual Loss,” IEEE Transactions on Medical Imaging 37, 1348–1357 (June 2018).
  • [29] Wang, Z. and Bovik, A. C., “Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures,” IEEE Signal Processing Magazine 26, 98–117 (Jan. 2009).
  • [30] Zhao, H., Gallo, O., Frosio, I., and Kautz, J., “Loss Functions for Image Restoration With Neural Networks,” IEEE Transactions on Computational Imaging 3, 47–57 (Mar. 2017).
  • [31] “Low Dose CT Grand Challenge.”
  • [32] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X., “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems,” (2016).
  • [33] Kingma, D. P. and Ba, J., “Adam: A Method for Stochastic Optimization,” arXiv:1412.6980 [cs] (Dec. 2014). arXiv: 1412.6980.
  • [34] Chen, H., Zhang, Y., Chen, Y., Zhang, J., Zhang, W., Sun, H., Lv, Y., Liao, P., Zhou, J., and Wang, G., “LEARN: Learned Experts’ Assessment-Based Reconstruction Network for Sparse-Data CT,” IEEE Transactions on Medical Imaging 37, 1333–1347 (June 2018).
  • [35] Yang, Q., Kalra, M. K., Padole, A., Li, J., Hilliard, E., Lai, R., and Wang, G., “Big Data from CT Scanning,” (2015).
  • [36] McCollough, C. H., Bruesewitz, M. R., and Kofler, J. M., “CT dose reduction and dose management tools: overview of available options,” Radiographics 26, 503–512 (Apr. 2006).
  • [37] Wang, G., Kalra, M., and Orton, C. G., “Machine learning will transform radiology significantly within the next 5 years,” Med Phys 44(6), 2041–2044 (2017).