Computed tomography (CT) is one of the most widely used medical imaging modalities for showing anatomical structures [11, 5, 9, 8]. The foremost concern of CT examination is the associated exposure to radiation, which is known to increase the lifetime risk of death from cancer [3]. The radiation dose can be lowered at the cost of image quality, and the resulting images are then denoised to improve perceptual quality and radiologists' diagnostic confidence.
Various deep neural network (DNN) based methods exist for CT image denoising [6, 7, 2, 10], which require paired clean and noisy images for training. However, such paired data are usually generated by simulation, where the synthetic noise patterns can differ from the real ones, leading to biased training results. To address this issue, the cycle-consistent adversarial denoising network (CCADN) was recently proposed in [4], which formulates CT image denoising as an image-to-image translation problem without paired training data. CCADN consists of two generators: one transforms noisy CT images (domain X) to clear ones (domain Y), and the other transforms clear CT images (domain Y) to noisy ones (domain X). Both generators are trained with an adversarial loss. In addition, a cycle-consistency loss and an identity loss are used to improve performance, as discussed in detail in Section 2. However, since CCADN contains only the two domains X and Y, its efficacy degrades as the noise becomes stronger, because the differences between X and Y grow larger and become harder to learn.
To tackle this issue, we propose to establish an intermediate domain between the original noisy image domain X and the clear image domain Y, and to decompose the denoising task into multiple coupled steps such that each step is easier for DNN-based models to learn. Specifically, we construct an additional domain Z with images of intermediate noise level between X and Y. These images can be considered a stepping stone in the denoising process and provide additional information for the training of the denoising network. The multi-step framework particularly suits the denoising problem: while it is difficult to either find or define a good collection of images in the “half-cat, half-dog” domain in “cat-to-dog” type image translation problems, a domain of images with an intermediate level of noise exists naturally.
With the new domain Z, we further propose a multi-cycle-consistent adversarial network (MCCAN) to perform the multi-step denoising, which builds multiple cycles of different scales (global cycles and local cycles) between the domains while enforcing the corresponding cycle-consistencies. In the experiments, we find that both global cycles and local cycles are necessary for the success of MCCAN, and that with both combined it outperforms the state-of-the-art competitor CCADN.
Given training images that are labelled as either noisy (domain X) or clear (domain Y), we first construct a new domain Z which contains images with an intermediate noise level between X and Y. How to obtain Z is flexible in practice. In our experiments, it is obtained from X and Y by separating out those images with an intermediate noise level.
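As an illustration only, the split into three domains could be sketched as follows. The sketch assumes that each image comes with a scalar noise estimate (for example, the standard deviation in a homogeneous region) and that two thresholds are chosen by hand. The function name `split_domains`, the thresholds, and the noise estimate are hypothetical and are not the procedure used in the experiments.

```python
def split_domains(noise_levels, low, high):
    """Assign image indices to X (noisy), Z (intermediate) or Y (clear).

    noise_levels: one scalar noise estimate per image (assumed to be given).
    low, high:    hypothetical thresholds bounding the intermediate range.
    """
    domains = {"X": [], "Z": [], "Y": []}
    for index, level in enumerate(noise_levels):
        if level < low:
            domains["Y"].append(index)   # little noise: clear domain
        elif level > high:
            domains["X"].append(index)   # strong noise: noisy domain
        else:
            domains["Z"].append(index)   # in between: intermediate domain
    return domains


# Three images with increasing noise estimates.
print(split_domains([5.0, 20.0, 40.0], low=10.0, high=30.0))
```

In practice the thresholds would be tuned so that the three domains contain a comparable number of images.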
With CT images from three domains, the multi-step denoising architecture of MCCAN is shown in Fig. 1(a). We train four convolutional neural networks as generators and three as discriminators. Arrows in Fig. 1(a) define how images are transformed in the training stage. Specifically, each generator transforms images from one domain to an adjacent one: from X to Z, from Z to Y, from Y to Z, and from Z to X. The three discriminators, one per domain, aim to distinguish the “real” images originally belonging to X, Y, and Z from the “fake” images transformed from other domains.
As the MCCAN structure in Fig. 1(a) contains three domains, there are multiple ways in which we can construct cycles for the cycle-consistency loss. A cycle is a path along which an image from a source domain is transformed through one other domain (as in [12]) or several other domains (as in this paper) and back to the source domain. In particular, we introduce two types of cycles as shown in Fig. 1(b). In this figure, each dot represents an image, which is color-coded based on the domain. The solid dots represent the images originally in the domain (“real” ones), and the hollow dots represent those transformed from another domain (“fake” ones). The dashed arrows form the local cycles, each of which goes across only two adjacent domains. The solid arrows, on the other hand, constitute a global cycle that starts from X and passes through Z, Y, and Z back to X sequentially. Note that for clarity the figure shows only half of the cycles (from left to right); the other half, which run from right to left and are symmetric to the ones shown, also exist. We then enforce a cycle-consistency loss, which measures the difference between the original images and the final images produced at the end of the cycle, as represented by the small arrows within each domain in Fig. 1(b). Ideally, the images transformed back to the source domain should be identical to the original ones. The cycle-consistency loss is applied to every cycle, whether local or global.
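The two types of cycles can be enumerated with a short sketch. It assumes the three domains are labelled X (noisy), Z (intermediate), and Y (clear) and are arranged as the chain X, Z, Y; the helper names are hypothetical. This is one reading of the construction described above, not the released implementation.

```python
def there_and_back(order):
    """Walk the domains in `order`, then return along the same route."""
    return order + order[-2::-1]


def local_cycles(chain):
    """Cycles that cross only two adjacent domains, in both directions."""
    cycles = []
    for a, b in zip(chain, chain[1:]):
        cycles.append([a, b, a])
        cycles.append([b, a, b])
    return cycles


def global_cycles(chain):
    """Cycles that traverse the whole chain, starting from either end."""
    return [there_and_back(chain), there_and_back(chain[::-1])]


def generators_on(cycle):
    """Each (source, target) pair stands for one generator used on the cycle."""
    return list(zip(cycle, cycle[1:]))


chain = ["X", "Z", "Y"]
for cycle in local_cycles(chain) + global_cycles(chain):
    print(" -> ".join(cycle))
```

With one intermediate domain this yields four local cycles and two global cycles, and each global cycle passes through four distinct generators.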
The global cycles are important for the denoising performance for the following reason. In the inference stage, an input noisy CT image in domain X is transformed by the X-to-Z generator and then by the Z-to-Y generator, which means the two generators are coupled by data dependency. Without global cycles, these two generators would be trained independently, so prediction errors at the intermediate step may accumulate as processing progresses. The global cycles enable joint training of the generators along the same denoising path that is used in the inference stage, for better consistency.
The local cycles are also important, as they address two issues in the training. First, the global cycles go through all four generators and have long paths for the gradient to back-propagate, which makes end-to-end optimization difficult. The local cycles are shallow and have shorter paths for the gradient to back-propagate. Second, adversarial training only forces the generators to output “fake” images that are distributed identically to the original “real” images in the intermediate domain Z. However, it does not necessarily preserve the meaningful content of the inputs, which is critical for the denoising task. The local cycle-consistency supervises each generator so that it more easily learns to transform images while preserving the meaningful content of its inputs.
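To make the role of the cycles concrete, the sketch below composes generators along a cycle and measures how far the reconstructed image is from the original. The generators here are toy scaling functions standing in for trained networks, images are flat lists of pixel values, and the mean absolute difference is an assumed choice of distance; all names are hypothetical.

```python
def compose(generators):
    """Return a function that applies the generators in the given order."""
    def run(image):
        for generator in generators:
            image = generator(image)
        return image
    return run


def cycle_loss(cycle_generators, image):
    """Mean absolute difference between an image and its reconstruction."""
    reconstructed = compose(cycle_generators)(image)
    return sum(abs(a - b) for a, b in zip(image, reconstructed)) / len(image)


# Toy stand-ins for two generators of a local cycle (not real networks).
def toy_forward(image):
    return [0.5 * value for value in image]


def toy_backward(image):
    return [2.0 * value for value in image]


def toy_backward_imperfect(image):
    return [1.5 * value for value in image]


image = [1.0, 2.0, 4.0]
print(cycle_loss([toy_forward, toy_backward], image))            # exact inverse
print(cycle_loss([toy_forward, toy_backward_imperfect], image))  # lossy inverse
```

A local cycle passes two generators to `cycle_loss`, while a global cycle passes all four, which is why the gradient path of a global cycle is longer.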
In summary, our MCCAN has two major advantages over CCADN. First, it decomposes the one-step transform into multiple steps using images in a constructed intermediate domain as a stepping stone. Second, it not only incorporates global cycles that model the denoising path in the inference stage for consistency, but also uses local cycles that provide strong supervision to facilitate the more challenging training process. In the experiments we find that MCCAN outperforms CCADN.
Note that in the discussion so far, only one intermediate domain was assumed. It is also possible to include more than one intermediate domain, with more global and local cycles. However, our study suggests that additional domains beyond one do not introduce further performance gain on the dataset we explored.
Finally, we state the training objective used in our framework. Denote and as the set of generators and discriminators respectively. Denote as one domain and as the discriminator associated with domain . We let be a cycle and be a path of half that has the same source domain, where are used to distinguish different cycles and paths merely. For example, is a cycle, saying , thus we can have , and , which are both half cycles of . represents the set of all the paths that end at domain . We denote as the source domain of and as the ordered function composition of the generators in . Thus, the total adversarial loss is
where is the adversarial loss associated with domain and the transform path . is obtained by
where is the distribution of “real” images in the domain and
represents the probability determined bythat is a “real” image from domain rather than a “fake” one transformed by generators from another domain.
The cycle-consistency loss is associated with each , defined as
The final optimization problem we solve in the training stage is:
where is set to 10 in our experiments.
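For concreteness, the objective described above can be written in the following form. This is a sketch under assumed notation rather than a verbatim restatement: the symbols $\mathcal{G}$, $\mathcal{D}$, $d$, $D_d$, $c$, $p$, $P_d$, $s_p$, $s_c$, $G_p$, $G_c$, and $\lambda$, the log-likelihood form of the adversarial term, and the $\ell_1$ norm in the cycle term are assumptions borrowed from the standard cycle-consistent formulation [12].

```latex
% Assumed notation: \mathcal{G}, \mathcal{D} = sets of generators, discriminators;
% P_d = set of paths ending at domain d; s_p, s_c = source domain of path p, cycle c;
% G_p, G_c = ordered composition of the generators along p, c.
\begin{align}
\mathcal{L}_{adv}(\mathcal{G}, \mathcal{D})
  &= \sum_{d} \sum_{p \in P_d} \mathcal{L}_{adv}(D_d, p), \\
\mathcal{L}_{adv}(D_d, p)
  &= \mathbb{E}_{y \sim p_{data}(d)} \left[ \log D_d(y) \right]
   + \mathbb{E}_{x \sim p_{data}(s_p)} \left[ \log \left( 1 - D_d(G_p(x)) \right) \right], \\
\mathcal{L}_{cyc}(c)
  &= \mathbb{E}_{x \sim p_{data}(s_c)} \left[ \left\| G_c(x) - x \right\|_1 \right], \\
\mathcal{G}^{*}
  &= \arg \min_{\mathcal{G}} \max_{\mathcal{D}} \;
     \mathcal{L}_{adv}(\mathcal{G}, \mathcal{D}) + \lambda \sum_{c} \mathcal{L}_{cyc}(c),
     \qquad \lambda = 10.
\end{align}
```

Under this reading, every path that ends at a domain contributes one adversarial term for that domain's discriminator, and every cycle, local or global, contributes one cycle-consistency term.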
3 Experiments and Results
3.1 Experimental Setup
The original dataset contains 200 normal-dose 3D CT images and 200 low-dose ones from various patients for training, and a separate set of 11 images for testing. All examinations were performed with a wide-detector 256-slice MDCT scanner (Brilliance iCT; Philips Healthcare) providing 8 cm of coverage. Each 2D CT image is of size 512×512 and is randomly cropped to 256×256 for data augmentation. We construct the additional domain Z with images of intermediate noise level from these clear and noisy scans, such that the number of scans in each domain is comparable. Such images exist because the noise level cannot be controlled quantitatively: some clear scans acquired with high-dose radiation contain more noise than usual, and vice versa.
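The random cropping used for augmentation can be sketched as below. The sketch represents an image as a list of pixel rows and uses only the Python standard library; the function name `random_crop` and the optional `rng` argument are hypothetical, and an actual training pipeline would typically use an array library instead.

```python
import random


def random_crop(image, size, rng=random):
    """Return a random size-by-size window of a 2D image (list of rows)."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)    # randint includes both endpoints
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]


# A blank 512x512 slice cropped to 256x256.
slice_2d = [[0.0] * 512 for _ in range(512)]
patch = random_crop(slice_2d, 256, rng=random.Random(0))
print(len(patch), len(patch[0]))
```

Passing a seeded `random.Random` instance keeps the crops reproducible across runs.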
We compare MCCAN with a state-of-the-art CT denoising framework, CCADN [4]. To see how the local cycles and global cycles contribute to the final performance, we also implement and compare MCCAN without local cycles and MCCAN without global cycles as an ablation study. The various structures are shown in Fig. 2. We train all the networks following the setting in [12]. Our implementation will be available online. We ensure that the network sizes and the number of training epochs are the same across all methods for a fair comparison.
3.2 Qualitative Evaluation
We choose three representative low-dose CT images in the test dataset, shown in Fig. 3(a), for qualitative evaluation. The corresponding denoised images by CCADN, MCCAN without local cycles, MCCAN without global cycles, and MCCAN are shown in Fig. 3(b)–3(e) respectively. Numbered areas are homogeneous regions, while areas with edges between heterogeneous regions are zoomed for visibility in Fig. 3. From the figures we can see that CCADN successfully reduces noise in the original images. MCCAN without local cycles completely fails to produce reasonable results. A closer examination of the images reveals that, interestingly, the background and the substances are approximately swapped compared with the original images. This is because the high-level features of the content distribution are still kept even with such a swap, and the discriminator cannot identify the generated image as “fake” because of the structural diversity in the training dataset. This aligns with our discussion on the importance of local cycles in Section 2. On the other hand, MCCAN without global cycles successfully denoises the image and achieves quality similar to that of CCADN. This is expected, as MCCAN without global cycles is essentially formed by two cascaded CCADNs. Finally, with both local and global cycles, the complete MCCAN visually yields the least noise.
|Method|Area #1 Mean|Area #1 SD|Area #2 Mean|Area #2 SD|Area #3 Mean|Area #3 SD|Area #4 Mean|Area #4 SD|Area #5 Mean|Area #5 SD|
|MCCAN w/o local cycles|0.02|0.38|0.24|1.02|0.11|0.71|0.02|0.42|0.28|1.31|
|MCCAN w/o global cycles|0.89|0.78|1.03|0.77|1.03|0.80|0.906|0.73|1.03|0.81|
To further illustrate the efficacy of the MCCAN structure, Fig. 4 shows how an image is transformed along a global cycle (the path X→Z→Y→Z→X). From the figure we can see that the first half (X→Z→Y) is an effective two-step denoising process, while the second half (Y→Z→X) incrementally adds noise back.
3.3 Quantitative Evaluation
Following [1, 7, 10], we use the mean and standard deviation (SD) of pixels in homogeneous regions of interest chosen by radiologists to quantitatively judge the quality of CT images. The mean value reflects substance information. Although it should ideally be close to that of the original image, the mean value can fluctuate within a range. The standard deviation, on the other hand, reflects the noise level. It should be as low as possible and is a more sensitive indicator than the mean value in the denoising task.
Five homogeneous areas chosen by radiologists are used for the quantitative evaluation; they are annotated by red rectangles in Fig. 3 and numbered from 1 to 5. The normalized quantitative results are shown in Table 1. CCADN reduces the standard deviation in the five areas by 15%, 21%, 21%, 22%, and 22% respectively, with resulting mean values close to those of the original images. Although MCCAN without local cycles achieves the smallest standard deviation in Areas 1, 3, and 4, it leads to meaningless output with a large deviation of the mean from the original images, which corresponds to the structure loss in Fig. 3(c). MCCAN without global cycles performs similarly to CCADN, with mean values close to the original and standard deviation reductions of 22%, 23%, 20%, 27%, and 19% respectively. Finally, the complete MCCAN performs best among all the methods: with mean values within a reasonable range, the standard deviations are decreased the most, by 24%, 32%, 29%, 29%, and 32% from the original CT images respectively.
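The region statistics and the percentage reductions quoted above can be reproduced with a short sketch. It assumes a rectangular region of interest given by its top-left corner and size, uses the population standard deviation, and treats a normalized SD as the ratio of the denoised SD to the original SD; the function names are hypothetical. The normalized SD values in the usage example are those of the "MCCAN w/o global cycles" row of Table 1.

```python
import statistics


def roi_stats(image, top, left, height, width):
    """Mean and population SD of the pixels inside a rectangular region."""
    pixels = [value
              for row in image[top:top + height]
              for value in row[left:left + width]]
    return statistics.mean(pixels), statistics.pstdev(pixels)


def sd_reduction_percent(normalized_sd):
    """Percentage by which the SD dropped relative to the original image."""
    return round((1 - normalized_sd) * 100)


# Normalized SDs of MCCAN without global cycles in Areas 1 to 5 (Table 1).
normalized_sds = [0.78, 0.77, 0.80, 0.73, 0.81]
print([sd_reduction_percent(sd) for sd in normalized_sds])
```

The printed reductions match the 22%, 23%, 20%, 27%, and 19% reported for MCCAN without global cycles.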
In this paper, we propose a multi-cycle-consistent adversarial network (MCCAN) for CT image denoising. MCCAN builds intermediate domains and enforces both local and global cycle-consistency. The global cycle-consistency couples all generators together to model the whole denoising process, while the local cycle-consistency imposes effective supervision on the denoising process between adjacent domains. Experiments show that both local and global cycle-consistency are important for the success of MCCAN, and that it outperforms the state-of-the-art competitor.
- [1] (2014) Using “iDose4” iterative reconstruction algorithm in adults’ chest–abdomen–pelvis CT examinations: effect on image quality in relation to patient radiation exposure. The British Journal of Radiology 87 (1036).
- [2] (2017) Low-dose CT denoising with convolutional neural network. In Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on, pp. 143–146.
- [3] (2018) Physician knowledge of radiation exposure and risk in medical imaging. Journal of the American College of Radiology 15 (1), pp. 34–43.
- [4] (2018) Cycle consistent adversarial denoising network for multiphase coronary CT angiography. arXiv preprint arXiv:1806.09748.
- [5] (2019) Machine vision guided 3D medical image compression for efficient transmission and accurate segmentation in the clouds. In , pp. 12687–12696.
- [6] (2018) 3-D convolutional encoder-decoder network for low-dose CT via transfer learning from a 2-D trained network. IEEE Transactions on Medical Imaging 37 (6), pp. 1522–1534.
- [7] (2017) Generative adversarial networks for noise reduction in low-dose CT. IEEE Transactions on Medical Imaging 36 (12), pp. 2536–2545.
- [8] (2019) Whole heart and great vessel segmentation in congenital heart disease using deep neural networks and graph matching. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 477–485.
- [9] (2019) Accurate congenital heart disease model generation for 3D printing. arXiv preprint arXiv:1907.05273.
- [10] (2018) Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss. IEEE Transactions on Medical Imaging 37 (6), pp. 1348–1357.
- [11] (2018) Structurally-sensitive multi-scale deep neural network for low-dose CT denoising. IEEE Access 6, pp. 41839–41855.
- [12] (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv preprint.