1 Introduction
Few-view CT is an important topic in tomographic image reconstruction. Because of the requirement imposed by the Nyquist sampling theorem, reconstructing high-quality CT images from undersampled data was long considered impossible. When sufficient projection data are acquired, analytical methods such as filtered backprojection (FBP) [1] are widely used for clinical CT image reconstruction. In the few-view CT setting, these analytically reconstructed images suffer severe streak artifacts due to the incompleteness of the projection data. To overcome this issue, various iterative techniques were proposed over the past decades, which can incorporate prior knowledge into the image reconstruction. Well-known methods include the algebraic reconstruction technique (ART) [2], the simultaneous algebraic reconstruction technique (SART) [3], expectation maximization (EM) [4], etc. Nevertheless, these iterative methods are time-consuming and still fail to produce satisfying results in many cases. Recently, the development of graphics processing unit (GPU) technology and the availability of big data have allowed researchers to train deep neural networks in an acceptable amount of time. Therefore, deep learning has become a new frontier of CT reconstruction research
[5, 6, 7]. In the literature, only a few deep learning methods have been proposed for reconstructing images directly from raw data. Zhu et al. [8] used fully-connected layers to learn the mapping from raw k-space data to a corresponding reconstructed MRI image. There is no doubt that fully-connected layers can be used to learn the mapping from the sinogram domain to the image domain. However, importing whole sinograms into the network requires a significant amount of memory and poses a major challenge for training the network for a full-size CT image/volume on a single consumer-level GPU such as an NVIDIA Titan Xp. A recently proposed method, iCTNet [9], reduces the computational complexity relative to [8], as a function of the medical image size and the number of projections. Even so, a single consumer-level GPU is still not able to handle iCTNet.
In this study, we propose a dual network architecture (DNA) for CT image reconstruction, which further reduces the number of required parameters compared with iCTNet, down to a level controlled by an adjustable hyperparameter that can be set as low as 1. Theoretically, the larger this hyperparameter, the better the performance. The proposed network is trainable on one consumer-level GPU such as an NVIDIA Titan Xp or NVIDIA 1080 Ti. DNA is inspired by the FBP formulation to learn a refined filtration and backprojection process for reconstructing images directly from sinograms. For X-ray CT, every single point in the sinogram domain only relates to the pixels/voxels on an X-ray path through the field of view. With this intuition, the reconstruction process of DNA is learned in a point-wise manner, which is the key ingredient that alleviates the memory burden. Also, an insufficient training dataset is another major issue in deep imaging. Inspired by [8], we first pretrain the network using natural images from ImageNet [10] and then fine-tune the model using real patient data. To the best of our knowledge, this is the first work using ImageNet images to pretrain a medical CT image reconstruction network. In the next section, we present a detailed explanation of the proposed DNA network. In the third section, we describe the experimental design, training data and reconstruction results. Finally, in the last section, we discuss relevant issues and conclude the paper.
2 Methods
This section presents the proposed dual network architecture and the objective function.
2.1 Dual network architecture (DNA)
DNA consists of two networks, G1 and G2. The input to G1 is a batch of few-view sinograms. According to the Fourier slice theorem, low-frequency information is sampled more densely than high-frequency information. Therefore, if we perform backprojection directly, the reconstructed images will be blurry. A ramp filter is usually applied to the projections to avoid this blurring effect. In DNA, filtration is performed on the sinogram in the Fourier domain through multiplication with a filter whose length is twice that of the sinogram (which can be shortened). Then, the filtered projections are fed into the first network. G1 tries to learn a filtered backprojection algorithm and outputs an intermediate image. G2 then further optimizes the output of G1, generating the final image.
G1 can be divided into three components: filtration, backprojection, and refinement. In the filtration part, 1D convolutional layers are used to produce filtered data. Theoretically, the filter for a continuous signal is infinitely long, which is not practical in reality. The filter length is here set as twice the length of a projection vector (and can be further shortened). Since the filtration is done through a multi-layer CNN, different layers can learn different parts of the filter. Therefore, the 1D convolutional window is empirically set to a fraction of the length of the projection vector to reduce the computational burden. Residual connections are used to preserve high-resolution information and to prevent gradients from vanishing.
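To make the filtration step concrete, the classical Fourier-domain ramp filtration that the learned convolutional layers approximate can be sketched as follows. This is a minimal NumPy sketch, not the paper's actual TensorFlow code; the zero-padding to twice the detector length mirrors the "filter length twice the projection length" choice described above.

```python
import numpy as np

def ramp_filter_projections(sinogram):
    """Filter each projection row with a ramp filter in the Fourier domain.

    sinogram: (n_views, n_det) array; returns the filtered sinogram with
    the same shape. Zero-padding to twice the detector length reduces
    wrap-around artifacts from the circular convolution.
    """
    n_views, n_det = sinogram.shape
    pad = 2 * n_det
    freqs = np.fft.fftfreq(pad)                 # frequencies in cycles/sample
    ramp = np.abs(freqs)                        # the |w| ramp filter
    spectrum = np.fft.fft(sinogram, n=pad, axis=1)
    filtered = np.real(np.fft.ifft(spectrum * ramp, axis=1))
    return filtered[:, :n_det]                  # crop back to detector length
```

In DNA this fixed filter is replaced by learnable 1D convolutions, so the network can refine the filter beyond the analytical ramp.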
Next, the learned sinogram from the filtration part is backprojected by G1. The backprojection part of G1 is inspired by the following intuition: every point in the filtered projection vector only relates to the pixel values on the X-ray path through the corresponding object image, and any other data point in this vector contributes nothing to the pixels on this X-ray path. There is no doubt that a single fully-connected layer could learn the mapping from the sinogram domain to the image domain, but its memory requirement becomes an issue due to the extremely large matrix multiplications involved. To reduce the memory burden, the reconstruction process is learned in a point-wise manner using a point-wise fully-connected layer. By doing so, DNA can truly learn the backprojection process. The input to the point-wise fully-connected layer is a single point in the filtered projection vector, and the number of neurons is the width of the corresponding image. After this point-wise fully-connected layer, rotation and summation operations are applied to simulate the analytical FBP method. Bilinear interpolation [11] is used for rotating images. Moreover, the number of learned mappings is empirically set as 23, allowing the network to learn multiple mappings from the sinogram domain to the image domain; this number can be understood as the number of branches. Note that different view-angles use different parameters. Although the proposed filtration and backprojection parts together learn a refined FBP method, streak artifacts cannot be eliminated perfectly. An image reconstructed by the backprojection part is therefore fed into the last portion of G1 for refinement. The refinement part is a typical U-Net [12] with conveying paths and is built with the ResNeXt [13] structure. U-Net was originally designed for biomedical image segmentation and has been utilized in various applications. For example, Refs [14, 15, 16] use U-Net with conveying paths for CT image denoising, Refs [17, 18] for the few-view CT problem, and Ref [19]
for compressed sensing MRI, etc. The U-Net in DNA contains 4 down-sampling and 4 up-sampling layers, each with a stride of 2 and followed by a rectified linear unit (ReLU). The same kernel size is used in both the convolutional and transposed-convolutional layers. The number of kernels in each layer is 36. To maintain the tensor size, zero-padding is used.
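The rotate-and-sum backprojection described above can be illustrated with a toy NumPy/SciPy sketch. The weight matrix here is a hypothetical placeholder for the learned point-wise fully-connected layer (in DNA those weights are trained, and each view has its own parameters); bilinear rotation uses `order=1`, matching the interpolation named in the text.

```python
import numpy as np
from scipy.ndimage import rotate

def pointwise_backproject(filtered_sino, img_size, weights=None):
    """Toy rotate-and-sum backprojection with a point-wise mapping.

    Each filtered detector reading is expanded along one image row by a
    weight vector (standing in for the point-wise fully-connected layer),
    the per-view image is rotated to its view angle with bilinear
    interpolation, and all views are summed, as in analytical FBP.
    filtered_sino: (n_views, n_det) array; requires n_det == img_size.
    """
    n_views, n_det = filtered_sino.shape
    assert n_det == img_size, "this sketch assumes square geometry"
    if weights is None:
        # hypothetical fixed weights; in DNA these are learned per view
        weights = np.ones((n_det, img_size)) / img_size
    angles = np.linspace(0.0, 180.0, n_views, endpoint=False)
    image = np.zeros((img_size, img_size))
    for v in range(n_views):
        # point-wise mapping: each detector point -> one image row
        view_img = filtered_sino[v][:, None] * weights
        # bilinear rotation (order=1) to the current view angle
        image += rotate(view_img, angles[v], reshape=False, order=1)
    return image / n_views
```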
G2 uses the same structure as the refinement part of G1. The input to G2 is a concatenation of the FBP result and the output of G1. With the use of G2, the network becomes deeper. As a result, the benefits of deep learning can be utilized in this direct mapping for CT image reconstruction.
2.2 Objective functions
As shown in Fig. 1, DNA is optimized using the generative adversarial network (GAN) framework [20], one of the most advanced frameworks in the field. In this study, the proposed framework contains three components: the two generator networks G1 and G2 introduced in Subsection 2.1, and a discriminator network D. G1 and G2 aim at reconstructing images directly from a batch of few-view sinograms. D receives images from G1, G2, or the ground-truth dataset, and intends to distinguish whether an image is real (from the ground-truth dataset) or fake (from either G1 or G2). Both sides are able to optimize themselves during training. If an optimized D can hardly distinguish fake images from real images, then the generators G1 and G2 can fool the discriminator, which is the goal of GAN. By design, the D network also helps to improve the texture of the final image and prevents over-smoothing.
Different from the vanilla generative adversarial network (GAN) [21], the Wasserstein GAN (WGAN) replaces the cross-entropy loss with the Wasserstein distance, improving the stability of the training process. In the WGAN framework, the 1-Lipschitz condition is enforced with weight clipping. However, Ref [22] points out that weight clipping may be problematic in WGAN and suggests replacing it with a gradient penalty term, which is implemented in our proposed framework. Hence, the objective function of the proposed WGAN framework is expressed as follows:

$$\min_{\theta_{G_1},\theta_{G_2}}\max_{\theta_D}\;\mathbb{E}_{y}\big[D(y)\big]-\mathbb{E}_{x}\big[D(x)\big]-\lambda\,\mathbb{E}_{\hat{x}}\Big[\big(\big\lVert\nabla_{\hat{x}}D(\hat{x})\big\rVert_{2}-1\big)^{2}\Big],\tag{1}$$

where $s$, $x$ and $y$ represent a sparse-view sinogram, an image reconstructed by $G_1$ or $G_2$ from a sparse-view sinogram, and the ground-truth image reconstructed from the full-view projection data, respectively. $\mathbb{E}_{u}[\cdot]$ denotes the expectation over $u$. $\theta_{G_1}$, $\theta_{G_2}$ and $\theta_{D}$ represent the trainable parameters of $G_1$, $G_2$ and $D$, respectively. $\hat{x}$ represents images interpolated between fake images (from either $G_1$ or $G_2$) and real images (from the ground-truth dataset). $\nabla_{\hat{x}}D(\hat{x})$ denotes the gradient of $D$ with respect to $\hat{x}$. The parameter $\lambda$ balances the Wasserstein distance terms and the gradient penalty term. As suggested in Refs [20, 23, 22], $G_1$, $G_2$ and $D$ are updated iteratively.
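The gradient penalty term can be illustrated numerically. The sketch below uses a toy linear critic $D(x) = x\cdot w$, for which the gradient with respect to the input is known analytically (it is simply $w$), so the penalty can be computed without an autodiff framework; in the actual training the gradient would be obtained by automatic differentiation through the real discriminator.

```python
import numpy as np

rng = np.random.default_rng(0)

def gradient_penalty(w, real, fake, lam=10.0):
    """WGAN-GP penalty for a toy linear critic D(x) = x @ w.

    x_hat is sampled uniformly on the segment between paired real and
    fake samples; for a linear critic grad_x D(x_hat) = w everywhere,
    so the penalty reduces to lam * (||w||_2 - 1)^2 for every sample.
    """
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake      # interpolated samples
    grad = np.tile(w, (x_hat.shape[0], 1))       # analytic gradient of D
    norms = np.linalg.norm(grad, axis=1)
    return lam * np.mean((norms - 1.0) ** 2)
```

A critic with unit-norm gradient incurs zero penalty, which is exactly the 1-Lipschitz behavior the penalty encourages.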
The objective function for optimizing the generator networks involves the mean squared error (MSE) [16, 24], the structural similarity index (SSIM) [25, 26] and an adversarial loss [27, 28]. MSE is a popular choice for denoising applications [29]; it effectively suppresses background noise but can result in over-smoothed images [30]. Moreover, MSE is insensitive to image texture since it assumes the background noise is white Gaussian noise independent of local image features. The MSE loss is expressed as follows:
$$\mathcal{L}_{\text{MSE}}=\frac{1}{N_b\,w\,h}\sum_{i=1}^{N_b}\big\lVert y_i-x_i\big\rVert_2^2,\tag{2}$$

where $N_b$, $w$ and $h$ denote the batch size, image width and image height, respectively, and $y_i$ and $x_i$ represent a ground-truth image and the corresponding image reconstructed by a generator network (either $G_1$ or $G_2$), respectively.
To compensate for the disadvantages of MSE and obtain visually better images, SSIM is introduced into the objective function. The SSIM formula is expressed as follows:
$$\text{SSIM}(x,y)=\frac{(2\mu_x\mu_y+c_1)(2\sigma_{xy}+c_2)}{(\mu_x^2+\mu_y^2+c_1)(\sigma_x^2+\sigma_y^2+c_2)},\tag{3}$$

where $c_1=(k_1 L)^2$ and $c_2=(k_2 L)^2$ are constants used to stabilize the formula when the denominator is small, $L$ stands for the dynamic range of pixel values, and $k_1=0.01$, $k_2=0.03$. $\mu_x$, $\mu_y$, $\sigma_x^2$, $\sigma_y^2$ and $\sigma_{xy}$ are the means of $x$ and $y$, the variances of $x$ and $y$, and the covariance between $x$ and $y$, respectively. The structural loss is then expressed as follows:

$$\mathcal{L}_{\text{SSIM}}=1-\text{SSIM}(x,y).\tag{4}$$
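Eq. (3) with global (single-window) statistics can be written directly in NumPy. This is a simplified sketch: practical SSIM implementations average the index over local sliding windows, but the global version already shows how the means, variances and covariance enter the formula.

```python
import numpy as np

def ssim_global(x, y, L=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM over whole images (Eq. (3) with global stats)."""
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    num = (2.0 * mu_x * mu_y + c1) * (2.0 * cov + c2)
    den = (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
    return num / den

def ssim_loss(x, y):
    """Structural loss of Eq. (4)."""
    return 1.0 - ssim_global(x, y)
```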
Furthermore, the adversarial loss aims to assist the generators in producing sharp images that are indistinguishable to the discriminator network. Referring to Eq. (1), the adversarial loss for $G_1$ is expressed as follows:

$$\mathcal{L}_{\text{adv}}^{G_1}=-\mathbb{E}_{x_1}\big[D(x_1)\big],\tag{5}$$

and the adversarial loss for $G_2$ is expressed as follows:

$$\mathcal{L}_{\text{adv}}^{G_2}=-\mathbb{E}_{x_2}\big[D(x_2)\big],\tag{6}$$

where $x_1$ and $x_2$ denote the images reconstructed by $G_1$ and $G_2$, respectively.
As mentioned earlier, solving the few-view CT problem is similar to solving a set of linear equations when the number of equations is not sufficient to resolve all the unknowns. The intuition behind DNA is to estimate those unknowns as closely as possible by combining the information in the existing equations with the knowledge hidden in big data. The recovered unknowns should still satisfy the equations we have. Therefore, the MSE between the original sinogram and the sinogram synthesized from a reconstructed image (from either $G_1$ or $G_2$) is also included in the objective function:

$$\mathcal{L}_{\text{sino}}=\frac{1}{N_b\,N_v\,h_s}\sum_{i=1}^{N_b}\big\lVert S_i-R(x_i)\big\rVert_2^2,\tag{7}$$

where $N_b$, $N_v$ and $h_s$ denote the batch size, the number of views and the sinogram height, respectively, $S_i$ represents an original sinogram, and $R(x_i)$ represents the sinogram computed from a reconstructed image (either from $G_1$ or $G_2$), with $R$ the projection operator.
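The sinogram-consistency idea in Eq. (7) can be sketched with a deliberately tiny projector. The two-view "Radon transform" below (row and column sums, i.e. projections at 0 and 90 degrees) is a toy stand-in for the full projection operator; the loss structure is the same as in the text.

```python
import numpy as np

def radon_two_views(image):
    """Toy parallel-beam projector: views at 0 and 90 degrees only."""
    return np.stack([image.sum(axis=0), image.sum(axis=1)])

def sinogram_loss(images, sinos):
    """MSE between measured sinograms and re-projected reconstructions.

    images: (N_b, h, w) reconstructed images; sinos: (N_b, N_v, h_s)
    measured sinograms, with N_v = 2 for this toy projector.
    """
    n_b, n_v, h_s = sinos.shape
    total = 0.0
    for img, s in zip(images, sinos):
        total += np.sum((s - radon_two_views(img)) ** 2)
    return total / (n_b * n_v * h_s)
```

A reconstruction that exactly reproduces the measured projections incurs zero loss, which is the data-consistency constraint this term enforces.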
Both generator networks are updated at the same time. The overall objective function of the two generators is expressed as follows:

$$\mathcal{L}_{G}=\sum_{j=1}^{2}\Big(\mathcal{L}_{\text{MSE}}^{(j)}+\lambda_1\mathcal{L}_{\text{SSIM}}^{(j)}+\lambda_2\mathcal{L}_{\text{adv}}^{(j)}+\lambda_3\mathcal{L}_{\text{sino}}^{(j)}\Big),\tag{8}$$

where the superscript $(j)$ indicates that the term is measured between the ground-truth images and the results reconstructed by $G_j$, and $\lambda_1$, $\lambda_2$ and $\lambda_3$ are hyperparameters used to balance the different loss terms.
2.3 Discriminator network
The discriminator network D of the proposed method takes inputs from G1, G2, and the ground-truth dataset, trying to distinguish whether each input is real or fake. In DNA, the discriminator network has 6 convolutional layers with 64, 64, 128, 128, 256 and 256 filters, followed by 2 fully-connected layers with 1,024 neurons and 1 neuron, respectively. The leaky ReLU activation function is used after each layer with a slope of 0.2 in the negative part. The same kernel size and zero-padding are used for all the convolutional layers, with a stride of 1 for odd layers and 2 for even layers.
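The stride pattern above determines the feature-map sizes through the discriminator. The helper below computes them under the stated "same" zero-padding (output size = ceil(input/stride)); the input size of 256 is an illustrative assumption, not a value taken from the paper.

```python
def discriminator_shapes(input_size=256):
    """Spatial size and filter count after each of the six conv layers.

    Uses 'same' padding, stride 1 on odd-numbered layers and stride 2
    on even-numbered layers, matching the text.
    """
    filters = [64, 64, 128, 128, 256, 256]
    size, shapes = input_size, []
    for i, f in enumerate(filters, start=1):
        stride = 1 if i % 2 == 1 else 2
        size = (size + stride - 1) // stride   # ceil division for 'same' padding
        shapes.append((size, f))
    return shapes
```

For a 256-pixel input this yields spatial sizes 256, 128, 128, 64, 64, 32, so the tensor entering the fully-connected layers is 32x32x256.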
3 Experiments and Results
3.1 Experimental design
The dataset is the clinical patient dataset generated and authorized by the Mayo Clinic for “the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge” [31]. The dataset contains a total of 5,936 abdominal CT images selected by the Mayo Clinic with 1 mm slice thickness. Pixel values of the patient images were normalized between 0 and 1. In this dataset, 9 patients (5,410 images) were selected for training/validation while 1 patient (526 images) was selected for testing. As mentioned earlier, DNA was first pretrained using natural images from ImageNet. During the pretraining stage, a total of 120,000 images were randomly selected from ImageNet, among which 114,000 images were used for training/validation and the remaining 6,000 images were used for testing. Pixel values of the ImageNet images were also normalized between 0 and 1. All pixel values outside a prescribed circle were set to 0, and all images were resized to the same dimensions. The Radon transform was used to simulate few-view projection measurements: 39-view and 49-view sinograms were synthesized from angles equally distributed over a full scan range.
A batch size of 10 was selected for training. All experimental code was implemented in the TensorFlow framework
[32] on an NVIDIA Titan Xp GPU. The Adam optimizer [33] was used to optimize the parameters. We compared DNA with 3 state-of-the-art deep learning methods: LEARN [34], the sinogram-synthesis U-Net [18], and iCTNet [9]. The network settings are the same as the default settings described in the original papers. For the LEARN network, the number of iterations was set to 50 and the numbers of filters in the three layers were set to 48, 48, and 1, respectively. The convolutional kernel size followed the original configuration. The initial input to the network was the FBP result. The same preprocessed Mayo dataset described above was used to train the LEARN network. Please note that the amount of data we used to train the LEARN network was much larger than that in the original LEARN paper.
For the sinogram-synthesis U-Net, 720-view sinograms were simulated using the Radon transform. The simulated sinograms were then cropped into patches with a stride of 10 for training. The FBP method was applied to the synthesized sinograms to generate the final images.
The training process of iCTNet is divided into two stages. In the first stage, the first 9 layers were pretrained with projection data. In the second stage, the pretrained iCTNet performed end-to-end training using the patient data. In the original iCTNet paper, iCTNet used a total of 58 CT examinations acquired under the same conditions for stage-2 training. Since we do not have that dataset, the limited number of Mayo images might have caused overfitting in stage-2 training when we made efforts to replicate their results. Therefore, for a fair comparison, the testing images were included in the training stages. Please note that iCTNet was trained on 2 NVIDIA Titan Xp GPUs.
3.2 Visual and quantitative assessment
To visualize the performance of different methods, a few representative slices were selected from the testing dataset. Figure 2 shows results reconstructed using different methods from 49view real patient sinograms. Table 1 shows the number of parameters used for these methods.
Method             No. of parameters
LEARN              3,004,900
Sino-syn U-Net     47,118,017
iCTNet             16,933,929
DNA (49 views)     1,962,101
DNA (39 views)     1,844,341
Three metrics, PSNR, SSIM, and root-mean-square error (RMSE), are selected for quantitative assessment. Table 2 shows the quantitative measurements for the different methods, acquired by averaging the metrics over the testing dataset, for both the 39-view and 49-view results.
Metric            LEARN    Sino-syn U-Net    iCTNet    DNA
SSIM (49 views)
PSNR (49 views)
RMSE (49 views)
SSIM (39 views)   N/A
PSNR (39 views)   N/A
RMSE (39 views)   N/A
3.3 Examination of intermediate results
To demonstrate the effectiveness of the two generators in DNA, another network with the G2 structure was trained using solely the FBP results as its input. Figure 3 shows typical results reconstructed from 49-view sinograms using the various methods.
As shown in the first row of Figure 3, streak artifacts cannot be perfectly eliminated when the network is trained using only the FBP results. G1, on the other hand, eliminates these artifacts through learning from the projection data (first row, red arrows). Moreover, as mentioned earlier, G1 is by design intended to assist G2 in producing better results. This effect can be observed in the second row of Figure 3. G1 removes artifacts that cannot be removed by post-processing the FBP results (second row, red arrow), but it introduces new artifacts (second row, orange arrow). These new artifacts can then be removed by G2. In summary, G1 helps remove artifacts that could not be removed by processing FBP results, and even though it brings up new artifacts, the newly introduced artifacts are removed by the second generator network. Put differently, the proposed DNA combines the best of both worlds. Quantitative measurements on the outputs reconstructed by the various components of DNA are listed in Table 3.
Metric            G2-style network (trained using only FBP results)    DNA
SSIM (49 views)
PSNR (49 views)
RMSE (49 views)
3.4 Generalizability analysis
To demonstrate that the proposed method truly learns the backprojection process and can be generalized to other datasets, DNA and LEARN (the second-best method) were tested directly on female breast CT datasets acquired on a breast CT scanner (Koning Corporation). 4,968 CT images, scanned at 49 peak kilovoltage (kVp), were acquired from 12 patients. There are 3 sets of images per patient, reconstructed from 300, 150 and 100 projections, respectively. The Koning images are reconstructed using a commercial FBP with additional post-processing. Completely dark images in the dataset were excluded, resulting in a total of 4,635 CT images. Figure 4 shows representative images reconstructed using the different methods, and Table 4 gives the corresponding quantitative measurements.
DNA demonstrates outstanding generalizability, as shown in Figure 4. Specifically, images reconstructed using LEARN appear over-smoothed and lose some details. On the other hand, DNA not only preserves more details than LEARN but also suppresses streak artifacts effectively relative to the 150-view and 100-view results. Moreover, the images reconstructed by DNA from 49-view sinograms are better than the 100-view images in terms of SSIM and RMSE.
Metric   Koning commercial FBP (150-view)   Koning commercial FBP (100-view)   LEARN (49-view)   DNA (49-view)
SSIM
PSNR
RMSE
Also, we validated DNA and LEARN on the Massachusetts General Hospital (MGH) dataset [35]. The MGH dataset contains 40 cadaver scans acquired with representative protocols. Each cadaver was scanned on a GE Discovery HD 750 scanner at 4 different dose levels. Only the 10-NI (Noise Index) images were used for testing. NI refers to the standard deviation of CT numbers within a region of interest in a water phantom of a specific size [36]. Typical images are shown in Figure 5, and the corresponding quantitative measurements are in Table 5.

Metric   LEARN (49-view)   DNA (49-view)
SSIM
PSNR
RMSE
4 Conclusion
Although the field of deep imaging is still in its early stage, remarkable results have been achieved over the past several years. We envision that deep learning will play an important role in the field of tomographic imaging [37]. In this direction, we have developed the novel DNA network for reconstructing CT images directly from sinograms. Even though the proposed method has only been tested on the few-view CT problem, we believe that it can be applied or adapted to various other CT problems, including image denoising, limited-angle CT, and so on. This paper is not the first work on reconstructing images directly from raw data, but previously proposed methods require a significantly greater amount of GPU memory for training. We underline that our proposed method solves the memory issue by learning the reconstruction process with the point-wise fully-connected layer and other proper network ingredients. Also, by passing only a single point into the fully-connected layer, the proposed method can truly learn the backprojection process. In our study, the DNA network demonstrates superior performance and generalizability. In future work, we will validate the proposed method on images of larger dimensions.
References
 [1] Wang, G., Ye, Y., and Yu, H., “Approximate and exact cone-beam reconstruction with standard and nonstandard spiral scanning,” Phys Med Biol 52, R1–13 (Mar. 2007).
 [2] Gordon, R., Bender, R., and Herman, G. T., “Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography,” Journal of Theoretical Biology 29(3), 471–481 (1970).
 [3] Andersen, A., “Simultaneous Algebraic Reconstruction Technique (SART): A superior implementation of the ART algorithm,” Ultrasonic Imaging 6, 81–94 (Jan. 1984).
 [4] Dempster, A. P., Laird, N. M., and Rubin, D. B., “Maximum Likelihood from Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society. Series B (Methodological) 39(1), 1–38 (1977).
 [5] Wang, G., “A Perspective on Deep Imaging,” IEEE Access 4, 8914–8924 (2016).
 [6] Wang, G., Butler, A., Yu, H., and Campbell, M., “Guest Editorial Special Issue on Spectral CT,” IEEE Transactions on Medical Imaging 34, 693–696 (Mar. 2015).
 [7] Wang, G., Ye, J. C., Mueller, K., and Fessler, J. A., “Image Reconstruction is a New Frontier of Machine Learning,” IEEE Trans Med Imaging 37(6), 1289–1296 (2018).
 [8] Zhu, B., Liu, J. Z., Cauley, S. F., Rosen, B. R., and Rosen, M. S., “Image reconstruction by domaintransform manifold learning,” Nature 555, 487–492 (Mar. 2018).
 [9] Li, Y., Li, K., Zhang, C., Montoya, J., and Chen, G., “Learning to Reconstruct Computed Tomography (CT) Images Directly from Sinogram Data under A Variety of Data Acquisition Conditions,” IEEE Transactions on Medical Imaging , 1–1 (2019).
 [10] “ImageNet.”
 [11] Gribbon, K. T. and Bailey, D. G., “A novel approach to real-time bilinear interpolation,” in [Proceedings. DELTA 2004. Second IEEE International Workshop on Electronic Design, Test and Applications], 126–131 (Jan. 2004).
 [12] Ronneberger, O., Fischer, P., and Brox, T., “U-Net: Convolutional Networks for Biomedical Image Segmentation,” arXiv:1505.04597 [cs] (May 2015).
 [13] Xie, S., Girshick, R., Dollár, P., Tu, Z., and He, K., “Aggregated Residual Transformations for Deep Neural Networks,” arXiv:1611.05431 [cs] (Nov. 2016).

 [14] Shan, H., Zhang, Y., Yang, Q., Kruger, U., Kalra, M. K., Sun, L., Cong, W., and Wang, G., “3-D Convolutional Encoder-Decoder Network for Low-Dose CT via Transfer Learning From a 2-D Trained Network,” IEEE Transactions on Medical Imaging 37, 1522–1534 (June 2018).
 [15] Shan, H., Padole, A., Homayounieh, F., Kruger, U., Khera, R. D., Nitiwarangkul, C., Kalra, M. K., and Wang, G., “Competitive performance of a modularized deep neural network compared to commercial algorithms for low-dose CT image reconstruction,” Nature Machine Intelligence 1, 269–276 (June 2019).
 [16] Chen, H., Zhang, Y., Zhang, W., Liao, P., Li, K., Zhou, J., and Wang, G., “Low-dose CT via convolutional neural network,” Biomed Opt Express 8, 679–694 (Jan. 2017).
 [17] Jin, K. H., McCann, M. T., Froustey, E., and Unser, M., “Deep Convolutional Neural Network for Inverse Problems in Imaging,” IEEE Transactions on Image Processing 26, 4509–4522 (Sept. 2017).
 [18] Lee, H., Lee, J., Kim, H., Cho, B., and Cho, S., “Deep-Neural-Network-Based Sinogram Synthesis for Sparse-View CT Image Reconstruction,” IEEE Transactions on Radiation and Plasma Medical Sciences 3, 109–119 (Mar. 2019).
 [19] Quan, T. M., NguyenDuc, T., and Jeong, W.K., “Compressed Sensing MRI Reconstruction using a Generative Adversarial Network with a Cyclic Loss,” IEEE Transactions on Medical Imaging 37, 1488–1497 (June 2018). arXiv: 1709.00753.
 [20] Arjovsky, M., Chintala, S., and Bottou, L., “Wasserstein Generative Adversarial Networks,” in [International Conference on Machine Learning ], 214–223 (July 2017).
 [21] Goodfellow, I. J., PougetAbadie, J., Mirza, M., Xu, B., WardeFarley, D., Ozair, S., Courville, A., and Bengio, Y., “Generative Adversarial Networks,” arXiv:1406.2661 [cs, stat] (June 2014). arXiv: 1406.2661.
 [22] Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. C., “Improved Training of Wasserstein GANs,” in [Advances in Neural Information Processing Systems 30 ], Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., eds., 5767–5777, Curran Associates, Inc. (2017).
 [23] Goodfellow, I., PougetAbadie, J., Mirza, M., Xu, B., WardeFarley, D., Ozair, S., Courville, A., and Bengio, Y., “Generative Adversarial Nets,” in [Advances in Neural Information Processing Systems 27 ], Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. D., and Weinberger, K. Q., eds., 2672–2680, Curran Associates, Inc. (2014).
 [24] Wolterink, J. M., Leiner, T., Viergever, M. A., and Išgum, I., “Generative Adversarial Networks for Noise Reduction in Low-Dose CT,” IEEE Transactions on Medical Imaging 36, 2536–2545 (Dec. 2017).
 [25] Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P., “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing 13, 600–612 (Apr. 2004).
 [26] You, C., Yang, Q., Shan, H., Gjesteby, L., Li, G., Ju, S., Zhang, Z., Zhao, Z., Zhang, Y., Cong, W., and Wang, G., “Structurally-Sensitive Multi-Scale Deep Neural Network for Low-Dose CT Denoising,” IEEE Access 6, 41839–41855 (2018).
 [27] Wu, D., Kim, K., Fakhri, G. E., and Li, Q., “A Cascaded Convolutional Neural Network for X-ray Low-dose CT Image Denoising,” arXiv:1705.04267 [cs, stat] (May 2017).
 [28] Yang, Q., Yan, P., Zhang, Y., Yu, H., Shi, Y., Mou, X., Kalra, M. K., Zhang, Y., Sun, L., and Wang, G., “Low-Dose CT Image Denoising Using a Generative Adversarial Network With Wasserstein Distance and Perceptual Loss,” IEEE Transactions on Medical Imaging 37, 1348–1357 (June 2018).
 [29] Wang, Z. and Bovik, A. C., “Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures,” IEEE Signal Processing Magazine 26, 98–117 (Jan. 2009).
 [30] Zhao, H., Gallo, O., Frosio, I., and Kautz, J., “Loss Functions for Image Restoration With Neural Networks,” IEEE Transactions on Computational Imaging 3, 47–57 (Mar. 2017).
 [31] “Low Dose CT Grand Challenge.”
 [32] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X., “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.”
 [33] Kingma, D. P. and Ba, J., “Adam: A Method for Stochastic Optimization,” arXiv:1412.6980 [cs] (Dec. 2014). arXiv: 1412.6980.
 [34] Chen, H., Zhang, Y., Chen, Y., Zhang, J., Zhang, W., Sun, H., Lv, Y., Liao, P., Zhou, J., and Wang, G., “LEARN: Learned Experts’ Assessment-Based Reconstruction Network for Sparse-Data CT,” IEEE Transactions on Medical Imaging 37, 1333–1347 (June 2018).
 [35] Yang, Q., Kalra, M. K., Padole, A., Li, J., Hilliard, E., Lai, R., and Wang, G., “Big Data from CT Scanning,” (2015).
 [36] McCollough, C. H., Bruesewitz, M. R., and Kofler, J. M., “CT dose reduction and dose management tools: overview of available options,” Radiographics 26, 503–512 (Apr. 2006).
 [37] Wang, G., Kalra, M., and Orton, C. G., “Machine learning will transform radiology significantly within the next 5 years,” Med Phys 44(6), 2041–2044 (2017).