Low-Dose CT Image Denoising Using Parallel-Clone Networks

05/14/2020 · by Siqi Li, et al. · University of California-Davis

Deep neural networks have great potential to improve image denoising in low-dose computed tomography (LDCT). Popular ways to increase network capacity include adding more layers or repeating a modularized clone model in a sequence. In such sequential architectures, the noisy input image and the end output image are commonly used only once in the training model, which limits the overall learning performance. In this paper, we propose a parallel-clone neural network method that utilizes a modularized network model and exploits the benefits of parallel input, parallel-output loss, and clone-to-clone feature transfer. The proposed model keeps a similar or smaller number of unknown network weights compared with conventional models but can accelerate the learning process significantly. The method was evaluated using the Mayo LDCT dataset and compared with existing deep learning models. The results show that the use of parallel input, parallel-output loss, and clone-to-clone feature transfer all contribute to an accelerated convergence of deep learning and lead to improved image quality in testing. The parallel-clone network has been demonstrated to be promising for LDCT image denoising.




I Introduction

X-ray computed tomography (CT) involves radiation exposure that may potentially increase the risk of genetic, cancerous, and other diseases in a patient [1]. To reduce these risks, low-dose CT (LDCT) imaging has become an attractive solution. However, LDCT with standard image reconstruction commonly results in high noise in the reconstructed images, which compromises the diagnostic performance. It is desirable to develop more advanced image processing methods to suppress the noise and improve the image quality in LDCT.

In general, there are two categories of image processing methods for improving LDCT. The first category is tomographic image reconstruction from projection data, including analytical reconstruction in combination with sinogram denoising (e.g., [2, 3, 4]), model-based iterative reconstruction (e.g., [5, 6, 7, 8, 9]), and deep learning based reconstruction (e.g., [10, 11, 12, 13, 14]). These methods have the advantage of exploiting the raw projection data more extensively but have the disadvantage that access to raw CT projection data remains a resource barrier for many research groups. In contrast, post-reconstruction image denoising (e.g., [15, 16, 17]), the other category, directly deals with the reconstructed images and is more widely accessible to the research community. A product, once developed, can also be more easily integrated into an existing clinical CT workflow.

Fig. 1: Conceptual illustration of (a) the sequential-clone network and (b) the proposed parallel-clone network. The sequential-clone model can be considered as a special example of the parallel-clone model if the LDCT input image is only fed to the first clone and only the output of the last clone is used in the training.

It has been demonstrated that deep-learning (DL) image denoising has a strong potential to improve LDCT [18, 19, 20, 21, 22, 23, 24]. The image quality of deep-learning image denoising can be equivalent to or even better than that of state-of-the-art iterative CT reconstruction [24, 25]. A deep-learning model directly learns the end-to-end relationship between a noisy image and its clean reference image using, for example, deep neural networks (e.g., [18, 19, 20, 21]). Most existing network architectures improve the capacity of the neural network by adding more network layers. However, an increased number of layers does not always improve the learning performance in practice [18]. A recent work of Shan et al. [24] demonstrated an alternative solution that uses a modularized adaptive processing neural network (MAP-NN). Instead of adding new layers, MAP-NN repeatedly adds the same network module with shared weights (hence like “clones”) to increase the network depth and has demonstrated improved image quality for LDCT denoising.

All the aforementioned deep learning methods for LDCT have a sequential-type layout, as illustrated in Figure 1(a). From this perspective, the MAP-NN model is a sequential-clone network which uses multiple clones of the basic network module in a sequence of depth. Other earlier network models for LDCT denoising, such as the residual encoder-decoder convolutional neural network (RED-CNN) [18], can be considered as a special case of the sequential-clone networks, in which only one clone of the basic network module (i.e., RED-CNN itself) is used. Increasing the number of clones from one deepens the network and has the potential to improve the training.

Nevertheless, such sequential-clone architectures have two major limitations. First, the ability to forward-propagate the raw image information is limited. The original noisy input image is used only once, i.e., in the first clone. As analyzed later in this paper, such a usage is very different from conventional model-based image denoising algorithms and can be less effective at propagating the useful information of the noisy input image into later clones. Second, the network structure is also inefficient for back-propagating the gradient of the loss function to earlier clones. The loss layer only utilizes the output image of the last clone and is far away from the earlier clones, which makes it difficult to back-propagate the gradient information of the loss to the earlier clones without causing a vanishing gradient. As a result of these two limitations, the overall learning performance of the sequential-clone model for deep-learning image denoising can be compromised.

In this paper, we propose a parallel-clone network method to overcome the limitations. The new model exploits a parallel use of the noisy input image for the clones and incorporates the output image of all clones into the training loss function, also in parallel. The use of parallel input ensures an efficient forward propagation of useful information of the noisy input image into all the clones. The use of parallel output leads to an efficient back-propagation of the gradient of the loss function to the earlier clones. In addition, the proposed method also explores high-level feature transfer for the communications between the clones. The proposed parallel-clone model is expected to bring substantial improvements over the existing sequential-clone framework for LDCT image denoising.

This paper is organized as follows. Section II introduces the background materials that led to the development of the proposed method. Section III describes the details of the proposed parallel-clone network model. Results of the training and testing on the Mayo CT dataset are given in Section IV. Finally, conclusions are drawn in Section V.

II Background

II-A Model-based Image Denoising

The forward model for traditional image denoising methods can be expressed as

y = Ax + n,   (1)

where y and x denote the noisy CT image and the corresponding clean image to be estimated, respectively. A represents a degradation matrix and is equal to the identity matrix in this image denoising work. n represents the additive noise.

The commonly used least-square image denoising problem is formulated as

x̂ = argmin_x ‖y − Ax‖² + βR(x),   (2)

where the model consists of two components. The first is a data fidelity term and the second is a regularization term R(x) for exploiting the prior information, with β the regularization parameter. Iterative algorithms are commonly used to solve the optimization problem.
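As a concrete illustration of this model-based formulation, the following NumPy sketch (our illustration, not the paper's code) solves a 1-D least-square denoising problem by gradient descent, with a quadratic smoothness penalty standing in for the regularization term R(x); the signal, noise level, and step size are all placeholder choices.

```python
import numpy as np

def denoise_gd(y, beta=0.5, alpha=0.2, n_iter=200):
    """Gradient descent for min_x 0.5*||y - x||^2 + 0.5*beta*||Dx||^2,
    where A = I and D is a 1-D finite-difference operator (quadratic prior)."""
    x = y.copy()
    for _ in range(n_iter):
        grad_fid = x - y                       # gradient of the data fidelity term
        lap = np.zeros_like(x)                 # gradient of the smoothness prior
        lap[1:-1] = 2 * x[1:-1] - x[:-2] - x[2:]
        x = x - alpha * (grad_fid + beta * lap)
    return x

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 3 * np.pi, 100))   # placeholder "clean image"
noisy = clean + 0.3 * rng.standard_normal(100)
denoised = denoise_gd(noisy)
```

With a quadratic prior the iteration converges toward the linear solution of (I + βDᵀD)x = y, so the trade-off between fidelity and smoothness is controlled entirely by β.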

II-B End-to-End Deep Learning

A learned model using neural networks predicts a denoised image x̂ from the noisy image y using

x̂ = f(y; θ),   (3)

where f(·; θ) denotes the end-to-end mapping from y to x̂ with θ the neural network weights to be trained from available data sets. A high-quality reference image x available from the training dataset may be equivalently expressed as

x = f(y; θ) + ε,   (4)

where ε accounts for the difference between the prediction x̂ and the truth x. The mean squared error (MSE) between them is then defined by

MSE = ‖x − f(y; θ)‖²,   (5)

In LDCT image denoising, x corresponds to the normal-dose image and y corresponds to the low-dose image. The training problem for deep-learning image denoising is then formulated as the following optimization if the MSE is used as the loss function:

θ̂ = argmin_θ Σ_{i=1}^{I} ‖x_i − f(y_i; θ)‖²,   (6)

where (y_i, x_i) denotes the ith image pair of low dose and normal dose in the training dataset and I is the total number of training pairs. Once the model parameter set θ̂ is trained, the final image estimate predicted from a noisy low-dose image y is obtained using x̂ = f(y; θ̂).
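The end-to-end training loop described above can be sketched in PyTorch as follows; the tiny three-layer network and the synthetic image pairs are placeholders for illustration, not the paper's architecture or data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A tiny stand-in for the denoiser f(y; theta): three conv layers.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
mse = nn.MSELoss()

# Synthetic training pair: x_ref (reference) and y_in (noisy input).
x_ref = torch.rand(4, 1, 16, 16)                  # stand-in "normal-dose" patches
y_in = x_ref + 0.1 * torch.randn_like(x_ref)      # stand-in "low-dose" patches

losses = []
for _ in range(50):
    optimizer.zero_grad()
    loss = mse(model(y_in), x_ref)   # MSE loss ||x_i - f(y_i; theta)||^2
    loss.backward()                  # back-propagate the gradient
    optimizer.step()                 # update theta
    losses.append(loss.item())
```

After training, the denoised prediction is simply `model(y)` with the learned weights, mirroring x̂ = f(y; θ̂).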

Examples of the neural network models for LDCT image denoising include the RED-CNN [18], the wavelet residual network [21], and so on.

II-C Sequential-Clone Neural Network

To increase the depth of a neural network model, more layers can be added, but at the cost of an increasing number of unknown model parameters. An alternative is to repeatedly use a network module. This concept has been explored in general deep-learning models such as the ResNet [26] and unrolled deep learning for image reconstruction (e.g., [11, 27]). A recent development of this concept for LDCT denoising is the modularized adaptive processing neural network (MAP-NN) [24] illustrated in Figure 1(a). Mathematically the MAP-NN is expressed as

x̂ = g^T(y; θ),   (7)

where g^T denotes the repeated use of the modularized network g for T times:

g^T = g ∘ g ∘ ⋯ ∘ g,   (8)

which is equivalent to a sequence of “clones”:

x_t = g(x_{t−1}; θ),  x_0 = y,  t = 1, …, T,   (9)

Each clone is an individual denoiser but shares the same model structure and same parameters θ with the other clones. The denoised image of a clone is the input of the next clone. If MSE is used, then the training loss for the sequential-clone network has the form of

L(θ) = Σ_{i=1}^{I} ‖x_i − g^T(y_i; θ)‖²,   (10)

If only one clone is used, i.e., T = 1, the MAP-NN is the same as traditional deep neural network models.
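Weight sharing across clones is the key property of this construction: the module is defined once and applied T times, so the parameter count does not grow with T. A minimal PyTorch sketch (the two-layer module is illustrative, not MAP-NN itself):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# One modularized denoiser; the same weights are reused for every clone.
g = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1),
)

def sequential_clone(y, module, n_clones):
    """x_t = g(x_{t-1}), x_0 = y: apply the shared module T times."""
    x = y
    for _ in range(n_clones):
        x = module(x)
    return x

y = torch.rand(1, 1, 16, 16)
out = sequential_clone(y, g, n_clones=4)

# The parameter count depends only on the module, not on the number of clones.
n_params = sum(p.numel() for p in g.parameters())
```

Changing `n_clones` deepens the unrolled network without adding a single new weight, which is exactly what distinguishes a clone model from simply stacking more layers.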

III Proposed Parallel-Clone Network

Below we first describe the critical components that can be applied to the sequential-clone model individually. We then assemble them to form the proposed parallel-clone model as shown in Figure 1(b).

III-A Use of Coupled Input

Residual mapping is popular in deep learning after the ResNet work [26]. The sequential model in Eq. (9) can be equivalently rewritten in the following residual mapping format,

x_t = x_{t−1} + r(x_{t−1}),   (11)

where r denotes the residual mapping between the two adjacent clones and is mathematically equivalent to r(x) = g(x) − x [24]. Note that the noisy input image y appears only once in the sequence, i.e., in the first clone (x_0 = y) but not in any subsequent clones.

In comparison, conventional model-based image denoising commonly employs an optimization algorithm with the following iterative update:

x^{k+1} = x^k + δ(x^k, y),   (12)

where x^k denotes the image estimate at iteration k and δ(x^k, y) represents the residual update determined by a specific algorithm. For example, the gradient descent algorithm for the least-square image denoising has the form of

x^{k+1} = x^k + α[(y − x^k) − β∇R(x^k)],   (13)

where α is the step size and ∇R is the gradient of the regularization term. Another example is the alternating direction method of multipliers (ADMM) [28], by which the iterative update can be described as

x^{k+1} = M_1 x^k + M_2 y,   (14)

where M_1 and M_2 represent the updating matrices.

One common feature of these model-based iterative denoising algorithms is that the previous iterate x^k and the noisy input image y are coupled together to update the image estimate x^{k+1} at the next iteration. This coupled input originates from the data fidelity term in the objective function Eq. (2). From the Bayesian perspective, the data fidelity carries useful information about the statistical distribution of the measurements. We hypothesize that the coupled use of x^k and y is more beneficial than using x^k alone.

Considering each clone in the sequential model as an unrolled “iteration”, we can modify the sequential-clone model by including the noisy input image y in each clone,

x_t = x_{t−1} + r̃(x_{t−1}, y),   (15)

Compared with Eq. (11), here we use a different notation r̃ to denote the residual mapping now taking two inputs (i.e., x_{t−1} and y) without significantly changing the structure of the modularized neural network. The inputs x_{t−1} and y can be coupled using a concatenation operation C,

z_t = C(x_{t−1}, y),   (16)

which is then passed into the subsequent convolutional layers in the model. We expect this modification, inspired by model-based image denoising, to improve the residual mapping of each clone.
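A minimal PyTorch sketch of the coupled input, assuming channel-wise concatenation as the coupling operation (the two-layer residual module is a placeholder): the concatenation doubles the input channels of the first layer but leaves the rest of the module unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Residual module taking the coupled input (x_{t-1}, y).
r_tilde = nn.Sequential(
    nn.Conv2d(2, 8, 3, padding=1), nn.ReLU(),   # 2 input channels after concat
    nn.Conv2d(8, 1, 3, padding=1),
)

def clone_step(x_prev, y):
    """x_t = x_{t-1} + r~(x_{t-1}, y) with coupled input via concatenation."""
    coupled = torch.cat([x_prev, y], dim=1)     # C(x_{t-1}, y) along channels
    return x_prev + r_tilde(coupled)

y = torch.rand(1, 1, 16, 16)
x1 = clone_step(y, y)    # first clone: x_0 = y
x2 = clone_step(x1, y)   # the second clone reuses the same noisy input y
```

Unlike the sequential model, every clone here sees the raw noisy image y, mirroring how the data fidelity term re-enters each iteration of a model-based algorithm.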

III-B Auxiliary Output Loss

Compared to conventional model-based image denoising, deep learning has the advantage of end-to-end training. In the sequential-clone model, this is reflected in the training loss by comparing the output image of the last clone to the reference image, i.e., the normal-dose image in LDCT. Taking the MSE as an example, the training loss function for the sequential-clone network model is equivalent to

L(θ) = Σ_{i=1}^{I} ‖x_i − x_{i,T}‖²,   (17)

which only takes into account the output of the last clone x_{i,T}.

A general challenge for deep learning is the vanishing gradient problem [29], which holds true for the sequential-clone model. In a gradient-based algorithm, efficient back propagation of the gradient to the earlier clones in the sequence is challenging because the earlier clones are farther away from the final loss layer than the later clones.

We note that in the sequential-clone network model, not only the last clone but also each of the earlier clones produces an auxiliary output image, which has not been utilized by the training process. In fact, auxiliary output has been utilized in previous works for the task of image recognition, e.g., GoogLeNet [30]. Hence we propose to incorporate all the auxiliary output images into the loss function for the sequential-clone model. The MSE training loss is then of the form

L(θ) = Σ_{i=1}^{I} Σ_{t=1}^{T} ‖x_i − x_{i,t}‖²,   (18)

in which the auxiliary output image x_{i,t} of each clone contributes to the training loss in a parallel way. It becomes more straightforward to back-propagate the gradient information of the loss function from the output layer to the earlier clones. We expect the use of the parallel auxiliary output loss to reduce the impact of the gradient vanishing problem.
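The parallel auxiliary output loss described above can be sketched as follows; `outputs` stands for the list of per-clone output images, and the synthetic tensors are placeholders.

```python
import torch

def parallel_output_loss(outputs, x_ref):
    """MSE loss summed over the auxiliary output of every clone,
    instead of using only the last clone's output."""
    return sum(torch.mean((x_ref - x_t) ** 2) for x_t in outputs)

torch.manual_seed(0)
x_ref = torch.rand(2, 1, 8, 8)                                   # reference images
outputs = [x_ref + 0.1 * torch.randn_like(x_ref) for _ in range(4)]  # 4 clone outputs

loss_parallel = parallel_output_loss(outputs, x_ref)
loss_last_only = torch.mean((x_ref - outputs[-1]) ** 2)          # sequential-style loss
```

Because every clone output appears directly in the loss, each clone receives a gradient contribution from its own term rather than one relayed through all later clones.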

III-C Brute-Force Residual Mapping

The residual mapping function r̃ used in the sequential models mainly accounts for the difference between two adjacent clones,

r̃(x_{t−1}, y) ≈ x_t − x_{t−1},   (19)

which leads to the following form for the last clone,

x_T = y + Σ_{t=1}^{T} r̃(x_{t−1}, y),   (20)

Substituting the above expression of x_T into the conventional loss function in Eq. (17), we have

L(θ) = Σ_{i=1}^{I} ‖(x_i − y_i) − Σ_{t=1}^{T} r̃(x_{i,t−1}, y_i)‖²,   (21)

which indicates that the total-residual image x_i − y_i is approximated by a sum of the residual images from all the clones. We call r̃ in this case an incremental residual mapping.

If the parallel auxiliary output loss in Eq. (18) is used, then the training loss becomes

L(θ) = Σ_{i=1}^{I} Σ_{t=1}^{T} ‖(x_i − y_i) − Σ_{s=1}^{t} r̃(x_{i,s−1}, y_i)‖²,   (22)

in which r̃ still represents an incremental residual mapping, though the accumulated residual differs from clone to clone.

Instead of using the mapping r̃ to represent the difference between two adjacent clones, i.e., x_t − x_{t−1}, we employ a different residual mapping model in this work,

x_t = y + r̃(x_{t−1}, y),   (23)

where the residual mapping r̃ is changed to reflect the difference between the output image x_t of clone t and the noisy input image y. Substituting the new expression into the parallel auxiliary output loss, we obtain

L(θ) = Σ_{i=1}^{I} Σ_{t=1}^{T} ‖(x_i − y_i) − r̃(x_{i,t−1}, y_i)‖²,   (24)

which indicates r̃ directly predicts the total-residual image x_i − y_i in each clone t. To be differentiated from the incremental residual mapping model, we call the new model brute-force residual mapping in this paper.

In this work, we combine the brute-force residual mapping with the coupled input model to explore the benefit of parallel input as a part of the parallel-clone network model.
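The contrast between the incremental and brute-force mappings can be sketched as follows; both variants share the same placeholder residual module, and only the anchor of the skip connection changes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Shared residual module (placeholder; takes the coupled input via concat).
r_tilde = nn.Sequential(
    nn.Conv2d(2, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1),
)

def incremental_clone(x_prev, y):
    # Incremental mapping: the module predicts the step between adjacent clones.
    return x_prev + r_tilde(torch.cat([x_prev, y], dim=1))

def brute_force_clone(x_prev, y):
    # Brute-force mapping: the module predicts the total residual x - y,
    # so every clone's output is anchored directly to the noisy input y.
    return y + r_tilde(torch.cat([x_prev, y], dim=1))

y = torch.rand(1, 1, 16, 16)
x = y
for _ in range(3):
    x = brute_force_clone(x, y)
```

The only code difference is whether the skip connection starts from `x_prev` or from `y`; in the brute-force case each clone must account for the whole noise-removal task rather than an increment of it.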

Fig. 2: The architecture of a clone in the proposed parallel-clone network. It consists of a low-level feature extraction layer, a multilayer CNN module, and an image recovery module with residual mapping. The high-level feature set from an earlier clone is transferred into the next clone and combined with the low-level feature to form the input of the CNN module.

III-D Clone-to-Clone Feature Transfer

In the sequential-clone model in Fig. 1(a), adjacent clones (e.g., t−1 and t) are connected using the intermediate denoised image. Other than the output image x_{t−1}, additional high-level features also exist in clone t−1 and can be transferred to clone t. We hypothesize that transferring intermediate high-level features can be more useful than just transferring the output image. Thus we use a more general expression for the model of clone t:

x_t = g(u_{t−1}, y),   (25)

where u_{t−1}, the transferred information from clone t−1, can be either the output image x_{t−1} or a high-level feature set.

In order to jointly use y and u_{t−1} if the latter represents high-level features, the clone model first extracts a feature set from y using

f_t = E(y),   (26)

where E denotes a low-level feature extraction operation and is composed of a convolutional layer and a rectified linear unit (ReLU) layer in this work. The extracted feature set f_t matches the dimension of u_{t−1} but is more focused on the low-level information of the input image, such as edges and corners [31]. f_t and u_{t−1} can then be concatenated,

z_t = C(f_t, u_{t−1}),   (27)

to form the input for the subsequent convolutional layers.
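A minimal sketch of the clone-to-clone feature transfer; the channel count and the zero-initialized feature set for the first clone are our assumptions for illustration, not choices stated in the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_feat = 8
extract = nn.Sequential(                         # low-level extraction E(y): conv + ReLU
    nn.Conv2d(1, n_feat, 3, padding=1), nn.ReLU(),
)
cnn = nn.Sequential(                             # high-level representation module
    nn.Conv2d(2 * n_feat, n_feat, 3, padding=1), nn.ReLU(),
)

def clone_features(h_prev, y):
    """Concatenate low-level features of y with the high-level feature
    set transferred from the previous clone, then refine with the CNN."""
    f = extract(y)                               # low-level features of the input
    z = torch.cat([f, h_prev], dim=1)            # clone-to-clone feature transfer
    return cnn(z)                                # next clone's high-level features

y = torch.rand(1, 1, 16, 16)
h = torch.zeros(1, n_feat, 16, 16)               # initial feature set (assumption)
for _ in range(3):
    h = clone_features(h, y)
```

Passing the feature tensor `h` between clones carries richer intermediate information than collapsing it to a single-channel image at every clone boundary.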

III-E Combined Parallel-Clone Network Model

Combining all the aforementioned components, we obtain a general expression for the proposed parallel-clone network model,

x_t = y + r̃(u_{t−1}, y),  t = 1, …, T,   (28)

The objective function for the corresponding optimization problem is then defined by

L(θ) = Σ_{i=1}^{I} Σ_{t=1}^{T} ‖x_i − x_{i,t}‖²,   (29)
A graphical illustration of the parallel-clone model is provided in Fig. 1(b).

This parallel-clone network model has three unique features: (1) parallel input, (2) parallel-output loss, and (3) clone-to-clone feature transfer. The parallel input feeds the noisy input image y to all the clones in parallel to enable the brute-force residual mapping and the use of coupled input, both improving the forward propagation of image information to the loss layer. The parallel-output loss incorporates all the auxiliary outputs into the training loss, which can improve the backpropagation of the gradient for the earlier clones. The clone-to-clone feature transfer connects adjacent clones to allow deeper learning.

The parallel-clone model is equivalent to the sequential-clone model if the input image is only fed to the first clone and only the output of the last clone is used in the training.

III-F Example Architecture and Implementation

An example of the specific architecture of the proposed parallel-clone network is shown in Fig. 2. The model consists of three modules in each clone: a low-level representation extraction module to obtain the low-level feature set f_t, a high-level representation learning module to obtain the high-level feature set h_t, and an image recovery module to produce the output image x_t. Different clones have the same model structure with shared weights.

The image recovery module is implemented using a deconvolution layer D, a residual connection, and a ReLU layer:

x_t = y + D(h_t),   (30)

where D outputs the residual image from the high-level feature set h_t.

The high-level representation learning for clone t is implemented by a multi-layer convolutional neural network (CNN),

h_t = CNN(C(f_t, h_{t−1})),   (31)

where its input is a concatenation of the low-level feature f_t and the high-level feature h_{t−1} transferred from clone t−1; see Eq. (27).

Fig. 3: Example of basic CNN module that can be used for the clone network models. (a) RED-CNN [18], (b) CPCE [24].

In theory, any deep-learning model can be used as the CNN module for the clone networks. Fig. 3 shows two examples adapted from the RED-CNN model [18] and the CPCE model [24], both following an encoder-decoder architecture. Details can be found in the original papers of these models. The encoders consist of a series of convolutional layers, each followed by a ReLU, to suppress image noise and artifacts from low level to high level step by step. In the decoders, a series of deconvolutional layers are used with residual mapping to recover the structural details. As the convolutional and deconvolutional layers are symmetric in the models, the number of convolutional layers is the same as the number of deconvolutional layers. The deconvolution layers in combination with symmetric shortcut connections are beneficial for detail preservation [18].
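The encoder-decoder clone module can be sketched as follows; the layer count, channel size, and the final ReLU are illustrative choices on our part, not the paper's exact RED-CNN configuration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class MiniREDCNN(nn.Module):
    """Sketch of a RED-CNN-style encoder-decoder: Conv+ReLU encoders,
    Deconv+ReLU decoders, and symmetric shortcut connections."""
    def __init__(self, ch=8):
        super().__init__()
        self.enc1 = nn.Conv2d(1, ch, 3)            # no padding, as in RED-CNN
        self.enc2 = nn.Conv2d(ch, ch, 3)
        self.dec2 = nn.ConvTranspose2d(ch, ch, 3)  # deconvolution restores size
        self.dec1 = nn.ConvTranspose2d(ch, 1, 3)
        self.relu = nn.ReLU()

    def forward(self, y):
        e1 = self.relu(self.enc1(y))
        e2 = self.relu(self.enc2(e1))
        d2 = self.relu(self.dec2(e2) + e1)         # symmetric shortcut connection
        return self.relu(self.dec1(d2) + y)        # residual mapping to the input

y = torch.rand(1, 1, 16, 16)
out = MiniREDCNN()(y)
```

Because each 3×3 convolution without padding shrinks the feature map by two pixels and each deconvolution grows it by two, the symmetric pairing restores the original image size while the shortcuts pass detail from encoder to decoder.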

Fig. 4: Plots of the RMSE convergence curve of the training data for different models based on (a) clone-to-clone image transfer and (b) clone-to-clone feature transfer. PO, CI, FT, and PI denote parallel output, coupled input, feature transfer, and parallel input, respectively. The metrics were evaluated based on the training image patches.
Fig. 5: Plots of the RMSE convergence of the testing image data for different models based on (a) clone-to-clone image transfer and (b) clone-to-clone feature transfer. The abbreviations are the same as in Fig. 4.

IV Experiments and Results

IV-A Clinical CT Dataset and Implementation

The 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge dataset [32] was used for evaluating the proposed parallel-clone network and other models. The dataset includes the normal-dose abdominal CT scans and synthetic quarter-dose CT scans of ten patients. Each scan consists of about 210 to 340 transverse image slices, each with a matrix size of 512 × 512 pixels. Nine of the ten patients were randomly selected for training and the remaining one was used for testing. The process was repeated three times.

For each training, ten image slices were randomly selected from each of the nine patients to generate image patches of size 55 × 55 with an interval of 4 pixels. The resulting total number of image patches used for training was about 1.6 million. For testing, the trained model was directly applied to the full image slices from the testing patient scan.

Fig. 6: Comparison of different models for denoising a testing image. (a) Normal-dose CT, (b) LDCT, (c) SCN, (d) PCN with parallel-output loss only, (e) PCN with parallel input only, and (f) PCN with all components integrated.

We used the Adam optimization method [33] to train the different network models with a mini-batch of 128 patches in each iteration. Sixty epochs were run, with the learning rate slowly decreased from its initial value over the course of training. The number of convolutional kernels in each layer was 48, except for the last layer, which has only one kernel. The kernel size of all layers was set to 3 × 3 with a convolutional stride of 1 and no padding. All the networks were implemented using PyTorch on a PC with an Intel i9-9920X CPU, 64 GB RAM, and an NVIDIA GeForce RTX 2080 Ti GPU.

IV-B Approach of Comparison

We first conducted an ablation study to demonstrate the improvement from the use of parallel input, parallel-output loss, and clone-to-clone feature transfer, using the sequential-clone network (SCN) model as the baseline. The RED-CNN model was used as the basic CNN module for both the SCN model and the parallel-clone network (PCN) model. The output image of the last clone was used as the final output of each model unless specified otherwise. We also investigated the effect of hyperparameters in the PCN model, such as the number of layers in the CNN module and the number of clones.

The PCN model was further compared with three popular DL-based denoising methods: the denoising convolutional neural network (DnCNN) [34], RED-CNN [18], and MAP-NN [24]. DnCNN is one of the representative DL models for general image denoising. RED-CNN is a typical example specifically developed for LDCT denoising. MAP-NN reflects the most recent example of a sequential-clone model for LDCT image denoising.

Note that the DnCNN, RED-CNN, and proposed PCN models were trained using the MSE-based loss. The original MAP-NN was trained using an advanced loss function that combines the basic MSE loss with perceptual losses; we also trained another MAP-NN using the MSE loss only (denoted MAP-NN-MSE).

IV-C Evaluation Metrics

Three common image quality metrics, including the root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) [14], were used to assess the training convergence and testing image quality:

RMSE = √( ‖x − x̂‖² / N ),  PSNR = 20 log₁₀( x_max / RMSE ),

SSIM = (2 μ_x μ_x̂ + c₁)(2 σ_xx̂ + c₂) / ((μ_x² + μ_x̂² + c₁)(σ_x² + σ_x̂² + c₂)),

where x and x̂ are the normal-dose image and the image predicted from the low-dose image, with N the total number of pixels in the region for quality evaluation. μ and σ denote the mean value and standard deviation inside a sliding window, σ_xx̂ represents the covariance between x and x̂, and x_max is the maximum pixel value. c₁ and c₂ are two SSIM parameters defined as (k₁ x_max)² and (k₂ x_max)², respectively.
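The RMSE and PSNR metrics follow directly from their definitions; a small NumPy sketch (assuming a maximum pixel value of 1.0 and synthetic test data):

```python
import numpy as np

def rmse(x, x_hat):
    """Root mean square error over the evaluation region."""
    return np.sqrt(np.mean((x - x_hat) ** 2))

def psnr(x, x_hat, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 20*log10(max_val / RMSE)."""
    return 20 * np.log10(max_val / rmse(x, x_hat))

rng = np.random.default_rng(0)
x = rng.random((64, 64))                                     # stand-in reference image
x_hat = np.clip(x + 0.01 * rng.standard_normal((64, 64)), 0, 1)  # stand-in prediction
```

SSIM additionally requires sliding-window statistics and is typically computed with a library implementation rather than by hand.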

IV-D Comparison Between Sequential and Parallel Models

Fig. 4(a) shows the plots of the training convergence of RMSE as a function of epoch number for different clone models based on the clone-to-clone image transfer. The RMSE was calculated based on the training image patches and averaged over the three repeated experiments. For each DL model, the number of clones was four and the number of layers in the basic RED-CNN module was ten. Compared with SCN, the use of parallel output (PO) and coupled input (CI) incrementally improved the RMSE. Further combination with the brute-force residual mapping model, which leads to the full PCN, achieved a significant acceleration of the training convergence as compared to the SCN model.

Fig. 4(b) further shows the comparison based on the clone-to-clone feature transfer (FT). Replacing the image transfer with FT can improve the RMSE, though not for earlier epochs. On top of that, the use of parallel-output loss further improved the convergence. An even larger improvement was obtained with the use of parallel input, which includes both the brute-force residual mapping model and coupled input. The most significant improvement came from the full PCN model which integrates all the three critical modifications (FT, PO, and PI) in the model. The convergence rate of the PCN was dramatically faster than the SCN. The PCN only took about 5 epochs to reach a similar RMSE value as the SCN at 60 epochs, suggesting a speed-up factor of about ten.

Fig. 5 shows the results from the evaluation on the testing data. Note that here the image quality was evaluated on the whole image slices. The relationships between different models are consistent with the results of training convergence shown in Fig. 4.

Fig. 7:  Clone-wise comparison of PSNR between the SCN and different PCN models with four clones for denoising a testing image.

Fig. 6 shows the denoised results of a specific testing image by the SCN model and PCN models with 60 epochs. All the PCN models were implemented with the clone-to-clone feature transfer. For better display, the region of a liver metastasis was magnified in each image. Compared to the normal-dose reference image, the SCN reduced the noise but suffered from artifacts. The use of parallel-output loss or parallel input alone improved the image denoising according to quantitative PSNR. The full PCN model which integrates the parallel input, parallel-output loss, and clone-to-clone feature transfer together achieved the best result in terms of quantitative PSNR and visual quality. These results are further confirmed in Fig. 7 which shows a clone-wise comparison of PSNR for the SCN and different PCN models that consist of four clones.

Table I summarizes the results of different quality metrics (PSNR, SSIM, and RMSE) from the testing dataset. The mean and standard deviation (SD) are shown for each metric. This comparison further confirms the improvement of the full PCN model and its individual components (parallel-output loss, parallel input, and clone-to-clone feature transfer) over the baseline SCN model.

Model         PSNR               SSIM               RMSE
SCN           43.5824 ± 1.1476   0.9624 ± 0.0054    0.0069 ± 0.0011
PCN with PO   44.0179 ± 1.2873   0.9705 ± 0.0052    0.0061 ± 0.0007
PCN with PI   44.0268 ± 1.3675   0.9718 ± 0.0056    0.0060 ± 0.0008
Full PCN      45.4235 ± 1.1394   0.9846 ± 0.0047    0.0054 ± 0.0006
TABLE I: Comparison of PSNR, SSIM, and RMSE (mean ± SD) for SCN and PCN with different options.

IV-E Effect of Clone Settings

Fig. 8(a) shows the effect of the number of CNN layers on the image PSNR of the PCN model applied to the testing dataset. The number of clones was fixed at 4. The result suggests a 10-layer RED-CNN can work well for the PCN model. Adding more layers did not improve the performance significantly. This result is consistent with [18].

Fig. 8(b) shows the effect of the number of clones on the PCN model performance. The PSNR of the output image of each clone was plotted versus the clone index in each PCN model. The total number of clones was varied from 1 to 5. The number of CNN layers was set to 10 based on the result from Fig. 8(a). The curves show that the PSNR of the earlier clones in a PCN model was improved as the number of clones increased. The PSNR of the last clone in each PCN reached its maximum when the number of clones was 3. The differences in peak PSNR were minor among the PCN models with 3 clones, 4 clones, and 5 clones.

Table II compares the choices of the basic CNN module for quantitative image quality evaluation of the PCN. Two basic models were compared: CPCE [24] and RED-CNN [18]. The number of CNN layers was set to 10 and the number of clones to 4 in each compared model. The result suggests the RED-CNN was better suited than CPCE to serve as the basic CNN module for the parallel-clone network.

Fig. 8: Effect of (a) number of CNN layers and (b) number of clones on the testing PSNR performance of the PCN model.

IV-F Comparison with Other DL Models

Table III summarizes the results of PSNR, SSIM, and RMSE for denoising the testing dataset using five different models: the proposed PCN, DnCNN, RED-CNN, MAP-NN, and the MAP-NN trained with the MSE loss only (denoted MAP-NN-MSE). The PCN was implemented with 3 clones. Among the different models, the proposed PCN achieved the best quality as assessed by all three metrics. Compared to DnCNN and RED-CNN, which are equivalent to a PCN with a single clone, the use of multiple clones in the proposed PCN led to a significant improvement. The comparison of PCN with the sequential-clone models (MAP-NN and MAP-NN-MSE) indicates the parallel structure of PCN can be superior to the sequential structure. Note that the improvement of MAP-NN over MAP-NN-MSE was mainly from the use of a more advanced loss function in the former. This suggests that a combination of advanced loss functions with PCN may further improve the performance of the PCN model, which will be explored in our future work.

Model         PSNR               SSIM               RMSE
CPCE          44.6365 ± 1.1332   0.9782 ± 0.0050    0.0056 ± 0.0007
RED-CNN       45.4235 ± 1.1394   0.9846 ± 0.0047    0.0054 ± 0.0006
TABLE II: Comparison of PCN with different types of the basic CNN module.

Model            PSNR               SSIM               RMSE
DnCNN [34]       44.1305 ± 1.2569   0.9739 ± 0.0057    0.0057 ± 0.0007
RED-CNN [18]     44.5238 ± 1.1924   0.9756 ± 0.0055    0.0055 ± 0.0006
MAP-NN [24]      45.4612 ± 1.3908   0.9829 ± 0.0054    0.0054 ± 0.0007
MAP-NN-MSE       43.3698 ± 1.1724   0.9612 ± 0.0055    0.0070 ± 0.0011
Proposed (PCN)   45.7775 ± 1.1057   0.9855 ± 0.0045    0.0053 ± 0.0006
TABLE III: Comparison of the proposed PCN model with other DL models for LDCT image denoising.

Different DL models have different model complexities. Fig. 9 shows the PSNR achieved by each DL model versus the number of trainable parameters in the model. In addition to the use of 48 convolutional kernels, we also include results for 64, 80, and 96 kernels in the PCN. The result indicates that an increased number of kernels has a minimal effect on PSNR once it exceeds 48. The baseline SCN was composed of four RED-CNN clones, but its performance was worse than the original RED-CNN [18], mainly because the former used a smaller kernel size and far fewer kernels. Compared to MAP-NN-MSE, the use of advanced loss functions in MAP-NN improved PSNR but also largely increased the model complexity. In comparison, the PCN achieved better PSNR performance with fewer parameters.

Fig. 10 shows the denoised images obtained by different models. The DnCNN and RED-CNN generally oversmoothed the liver background. The MAP-NN had a closer image appearance to the normal-dose CT reference image, but some details were lost or shown with lower contrast, as indicated by the arrows. In comparison, the PCN provided generally higher image quality and better visual quality.

These results together indicate the proposed PCN model can outperform existing DL models for LDCT image denoising.

V Conclusion

In this paper, we have developed a simple yet efficient parallel-clone network architecture for LDCT image denoising. The model uses modularized clones with shared weights and exploits the benefits of parallel input, parallel-output loss, and clone-to-clone feature transfer. It has a similar or smaller number of unknown parameters compared with conventional deep learning models but can significantly accelerate the learning process. Experimental results using the Mayo LDCT dataset demonstrated the improvement of the proposed parallel-clone network model over conventional sequential models.


The authors thank Dr. Cynthia H. McCollough and the Mayo team for providing the LDCT dataset used in this study.

Fig. 9:  Comparison of the PSNR and model complexity of different models.
Fig. 10: Comparison of the images denoised by different deep-learning models. (a) Normal-dose CT, (b) LDCT, (c) DnCNN, (d) RED-CNN, (e) MAP-NN, (f) proposed PCN.


  • [1] D. J. Brenner and E. J. Hall, “Computed tomography - an increasing source of radiation exposure,” New England Journal of Medicine, vol. 357, no. 22, pp. 2277-2284, Nov. 2007.
  • [2] P. J. La Rivière, “Penalized-likelihood sinogram smoothing for low-dose CT,” Medical Physics, vol. 32, no. 6, pp. 1676-1683, 2005.
  • [3] J. Wang, T. Li, H. Lu, and Z. Liang, “Penalized weighted least-squares approach to sinogram noise reduction and image reconstruction for low-dose X-ray computed tomography,” IEEE Trans. Med. Imag., vol. 25, no. 10, pp. 1272-1283, Oct. 2006.
  • [4] A. Manduca, L. Yu, J. D. Trzasko, N. Khaylova, J. M. Kofler, C. M. McCollough, and J. G. Fletcher, “Projection space denoising with bilateral filtering and CT noise modeling for dose reduction in CT,” Medical Physics, vol. 36, no. 11, pp. 4911-4919, Oct. 2009.
  • [5] J. B. Thibault, K. D. Sauer, C. A. Bouman, and J. Hsieh, “A three-dimensional statistical approach to improved image quality for multislice helical CT,” Medical Physics, vol. 34, no. 11, pp. 4526-4544, 2007.
  • [6] I. A. Elbakri and J. A. Fessler, “Statistical image reconstruction for polyenergetic X-ray computed tomography,” IEEE Transactions on Medical Imaging, vol. 21, no. 2, pp. 89-99, 2002.
  • [7] E. Y. Sidky and X. Pan, “Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization,” Phys. Med. Biol., vol. 53, no. 17, pp. 4777-4807, 2008.
  • [8] Q. Xu, H. Yu, X. Mou, L. Zhang, J. Hsieh, and G. Wang, “Low-dose X-ray CT reconstruction via dictionary learning,” IEEE Trans. Med. Imaging, vol. 31, no. 9, pp. 1682-1697, Sep. 2012.
  • [9] S. Ye, S. Ravishankar, Y. Long, and J. A. Fessler, “SPULTRA: Low-dose CT image reconstruction with joint statistical and learned image models,” IEEE Transactions on Medical Imaging, vol. 39, no. 3, pp. 729-741, Mar. 2020.
  • [10] D. Wu, K. Kim, G. El Fakhri, and Q. Li, “Iterative low-dose CT reconstruction with priors trained by artificial neural network,” IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2479-2486, Dec. 2017.
  • [11] H. Chen, Y. Zhang, Y. Chen, J. Zhang, W. Zhang, H. Sun, Y. Lv, P. Liao, J. Zhou, and G. Wang, “LEARN: Learned experts’ assessment-based reconstruction network for sparse-data CT,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1333-1347, Jun. 2018.
  • [12] H. Gupta, K. H. Jin, H. Q. Nguyen, M. T. McCann, and M. Unser, “CNN-based projected gradient descent for consistent CT image reconstruction,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1440-1453, Jun. 2018.
  • [13] P. Bao, H. Sun, Z. Wang, Y. Zhang, W. Xia, K. Yang, W. Chen, M. Chen, Y. Xi, S. Niu, J. Zhou, and H. Zhang, “Convolutional sparse coding for compressed sensing CT reconstruction,” IEEE Transactions on Medical Imaging, vol. 38, no. 11, pp. 2607-2619, Nov. 2019.
  • [14] Y. Li, K. Li, C. Zhang, J. Montoya, and G. H. Chen, “Learning to reconstruct computed tomography images directly from sinogram data under a variety of data acquisition conditions,” IEEE Transactions on Medical Imaging, vol. 38, no. 10, pp. 2469-2481, Oct. 2019.
  • [15] P. F. Feruglio, C. Vinegoni, J. Gros, A. Sbarbati, and R. Weissleder, “Block matching 3D random noise filtering for absorption optical projection tomography,” Phys. Med. Biol., vol. 55, no. 18, pp. 5401-5415, 2010.
  • [16] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311-4322, Nov. 2006.
  • [17] Z. Li, L. Yu, J. D. Trzasko, D. S. Lake, D. J. Blezek, J. G. Fletcher, C. H. McCollough, and A. Manduca, “Adaptive nonlocal means filtering based on local noise level for CT denoising,” Medical Physics, vol. 41, no. 1, pp. 011908-1-011908-16, Dec. 2013.
  • [18] H. Chen, Y. Zhang, M. K. Kalra, F. Lin, Y. Chen, P. Liao, J. Zhou, and G. Wang, “Low-Dose CT with a residual encoder-decoder convolutional neural network,” IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2524-2535, Dec. 2017.
  • [19] J. M. Wolterink, T. Leiner, M. A. Viergever, and I. Isgum, “Generative adversarial networks for noise reduction in low-dose CT,” IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2536-2545, Dec. 2017.
  • [20] Q. Yang, P. Yan, Y. Zhang, H. Yu, Y. Shi, X. Mou, M. K. Kalra, Y. Zhang, L. Sun, and G. Wang, “Low-dose CT image denoising using a generative adversarial network with wasserstein distance and perceptual loss,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1348-1357, Jun. 2018.
  • [21] E. Kang, W. Chang, J. Yoo, and J. C. Ye, “Deep convolutional framelet denoising for low-dose CT via wavelet residual network,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1358-1369, Jun. 2018.
  • [22] H. Shan, Y. Zhang, Q. Yang, U. Kruger, M. K. Kalra, L. Sun, W. Cong, and G. Wang, “3-D convolutional encoder-decoder network for low-dose CT via transfer learning from a 2-D trained network,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1522-1534, Jun. 2018.
  • [23] L. Huang, H. Jiang, S. Li, Z. Bai, and J. Zhang, “Two stage residual CNN for texture denoising and structure enhancement on low dose CT image,” Computer Methods and Programs in Biomedicine, vol. 184, pp. 105-115, Feb. 2020.
  • [24] H. Shan, A. Padole, F. Homayounieh, U. Kruger, R. D. Khera, C. Nitiwarangkul, M. K. Kalra, and G. Wang, “Competitive performance of a modularized deep neural network compared to commercial algorithms for low-dose CT image reconstruction,” Nature Machine Intelligence, vol. 1, no. 6, pp. 269-276, Jun. 2019.
  • [25] Y. Nakamura, T. Higaki, F. Tatsugami, J. Zhou, Z. Yu, N. Akino, Y. Ito, M. Iida, and K. Awai, “Deep learning-based CT image reconstruction: Initial evaluation targeting hypovascular hepatic metastases,” Radiology: Artificial Intelligence, vol. 1, no. 6, p. e180011, Oct. 2019.
  • [26] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016.
  • [27] H. K. Aggarwal, M. P. Mani, and M. Jacob, “MoDL: Model-based deep learning architecture for inverse problems,” IEEE Transactions on Medical Imaging, vol. 38, no. 2, pp. 394- 405, Feb. 2019.
  • [28] S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 84-98, Mar. 2017.
  • [29] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157-166, Mar. 1994.
  • [30] C. Szegedy, Wei Liu, Yangqing Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2015.
  • [31] M. D. Zeiler and R. Fergus, “Visualizing and Understanding Convolutional Networks,” Lecture Notes in Computer Science, pp. 818-833, 2014.
  • [32] C. H. McCollough, A. C. Bartley, R. E. Carter, B. Chen, T. A. Drees, P. Edwards, D. R. Holmes, A. E. Huang, F. Khan, S. Leng, K. L. McMillan, G. J. Michalak, K. M. Nunez, L. Yu, and J. G. Fletcher, “Low-dose CT for the detection and classification of metastatic liver lesions: Results of the 2016 Low Dose CT Grand Challenge,” Medical Physics, vol. 44, no. 10, pp. e339-e352, Oct. 2017.
  • [33] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learn. Represent., 2015. [Online]. Available: https://arxiv.org/abs/1412.6980.
  • [34] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising,” IEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142-3155, Jul. 2017.