According to the Centers for Disease Control and Prevention, 50,000 people die from pneumonia in the United States every year (https://www.cdc.gov/dotw/pneumonia/index.html). Previous studies [8, 7, 10, 1] suggested that dual-energy chest radiographs (chest X-rays or CXRs) can improve diagnostic accuracy for finding abnormalities, especially focal pneumonia, over standard chest radiography. Dual-energy (DE) chest X-rays separate images of bones and soft tissues by making use of the differential attenuation of low-energy X-ray photons by calcium. However, the acquisition of dual-energy chest X-rays increases the radiation dose to patients and requires special, expensive equipment. As a result, researchers have been exploring methods to obtain bone suppressed chest X-rays from standard chest X-rays, and substantial progress has been made.
Bone suppression techniques can generally be categorized into deep learning and non-deep learning approaches. Non-deep learning approaches usually first locate the borders of the lungs and ribs and then use vertical intensity profiles to refine the final bone shadows. As deep learning is further developed for chest radiograph analysis [19, 18, 15, 16], deep learning based bone suppression [9, 21, 3] has also gradually gained popularity, owing to its greater power and flexibility in representing the characteristics of different structures in chest X-rays. One of the earlier deep learning approaches [2] uses multiple massive-training artificial neural networks to first obtain a bone image from a single-energy chest X-ray. The bone image is then subtracted from the single-energy chest X-ray to obtain the virtual soft-tissue image.
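The subtraction step described above can be illustrated with a minimal sketch. It assumes both images are floating-point arrays normalized to [0, 1]; the function name `virtual_soft_tissue` and the toy intensities are hypothetical and not taken from the cited method, which also applies further smoothing and consistency processing.

```python
import numpy as np


def virtual_soft_tissue(cxr: np.ndarray, bone: np.ndarray) -> np.ndarray:
    """Subtract a predicted bone image from a standard CXR.

    Assumes both arrays share the same shape and are scaled to [0, 1].
    The result is clipped so that it stays a valid image.
    """
    if cxr.shape != bone.shape:
        raise ValueError("cxr and bone must have the same shape")
    return np.clip(cxr - bone, 0.0, 1.0)


# Toy 2x2 example with made-up intensities (not real radiograph data).
cxr = np.array([[0.8, 0.5], [0.6, 0.2]])
bone = np.array([[0.3, 0.0], [0.1, 0.4]])
soft = virtual_soft_tissue(cxr, bone)
```

Clipping is a design choice for this sketch only: an overestimated bone image would otherwise produce negative intensities.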
We propose to use generative adversarial networks (GANs) [4] to learn bone suppression from dual-energy chest radiographs. GANs have gained much attention for their ability to generate realistic-looking synthetic images [14, 17]. A GAN is composed of two networks, namely a generator and a discriminator. The generator creates images similar to the training set, and the discriminator tries to differentiate the true images in the training set from the fake images produced by the generator. Once a GAN is trained, the generator is able to generate images that are indistinguishable from the original training set. For our specific problem of bone suppression in standard chest X-rays, the generator learns a mapping from standard chest X-rays to virtual soft-tissue images by making use of dual-energy chest radiographs. In this work, we exploit two variations of GANs, namely Pix2Pix [6], trained with radiographs paired per patient, and Cycle-GAN [22], trained with unpaired radiographs. Quantitative and qualitative experimental analysis verifies that image-to-image translation using adversarial learning is a feasible means to suppress bone structures while introducing minimal motion artifacts in standard radiographs. We also find that unpaired training using only posteroanterior (PA) chest radiographs yields better generalization ability on unseen anteroposterior (AP) radiographs.
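The interplay between generator and discriminator can be made concrete with a small numerical sketch of the two adversarial losses. It works on discriminator output probabilities only, with no networks involved. The function names are hypothetical, and the non-saturating generator loss is a common choice assumed here, not something this text specifies.

```python
import numpy as np


def bce(prob: np.ndarray, target: np.ndarray, eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between probabilities and 0/1 targets."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(prob)
                          + (1.0 - target) * np.log(1.0 - prob)))


def discriminator_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    """The discriminator wants real images scored 1 and generated ones 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))


def generator_loss(d_fake: np.ndarray) -> float:
    """Non-saturating form: the generator wants its images scored as real."""
    return bce(d_fake, np.ones_like(d_fake))


# Made-up discriminator outputs: confident on real, unconvinced by fakes.
d_real = np.full(4, 0.9)
d_fake = np.full(4, 0.1)
loss_d = discriminator_loss(d_real, d_fake)
loss_g = generator_loss(d_fake)
```

With these toy values the discriminator loss is small and the generator loss is large, which is the situation early in training before the generator produces convincing images.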
The purpose of this work is to determine the feasibility, and to compare the effectiveness, of using variations of generative adversarial networks to suppress bones (e.g., ribs and clavicles) in standard frontal-view chest radiographs by learning from paired or unpaired dual-energy chest radiographs.
Dual-energy subtraction imaging captures two or three radiographs of the same patient at different X-ray energy levels. One of the captured images highlights only the bones, based on a specific energy level. Thus, the bone suppressed image can be estimated by combining the acquired standard chest X-ray image, which includes both soft tissue and bones, with the bone-only image. Therefore, we are motivated to use image-to-image translation techniques to translate a standard radiograph into a soft-tissue-only radiograph, thereby suppressing the bone structures. In this work, we adopt paired and unpaired training to accomplish this task (see Figure 1).
2.1 Framework for Paired Image-to-Image Translation
We first adopt a variation of Pix2Pix [6] for the paired training of generative adversarial networks (GANs). The Pix2Pix model is trained on pairs of images, in this work standard CXRs and their bone suppressed counterparts, and then attempts to generate a corresponding bone-free output CXR from a standard CXR. The Pix2Pix model is a type of conditional GAN, in which the generation of the output CXR is conditional on an input CXR, in this case the source image. The discriminator sees both the source CXR and the target CXR and decides whether the target is a ground-truth CXR or an output of the generator. The generator tries to minimize the pairwise distance to the ground truth and to generate plausible bone suppressed CXRs that fool the discriminator. The adversarial loss function follows the conditional GAN objective of Pix2Pix [6]; please refer to that work for more details about the model training.
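For reference, the conditional GAN objective as published for Pix2Pix can be written as below. The symbols are assumptions borrowed from that work and not from this text: $x$ is the source CXR, $y$ the ground-truth bone suppressed CXR, $z$ a noise input, $G$ the generator, $D$ the discriminator, and $\lambda$ the weight of the pairwise L1 distance. Whether the variation used here matches this form term by term is not stated.

```latex
\begin{align}
\mathcal{L}_{cGAN}(G, D) &= \mathbb{E}_{x,y}\left[\log D(x, y)\right]
  + \mathbb{E}_{x,z}\left[\log\left(1 - D(x, G(x, z))\right)\right] \\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y,z}\left[\lVert y - G(x, z) \rVert_{1}\right] \\
G^{*} &= \arg\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
\end{align}
```

The first term rewards the discriminator for accepting true pairs, the second for rejecting generated ones, and the L1 term ties each generated CXR to its paired ground truth.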
2.2 Framework for Unpaired Image-to-Image Translation
Pix2Pix works only when the two image spaces are pre-formatted into a single image that holds both tightly correlated images. However, in clinical scenarios, there are far more standard, conventional radiographs than paired DE radiographs. In this case, we propose to use unpaired CXRs from the source domain and the target domain to train an unsupervised image-to-image translation model for bone suppression, and we adopt Cycle-GAN [22] for this task. A Cycle-GAN consists of two generators and two discriminators. Each discriminator classifies an input CXR as real or fake. The discriminator for the target domain encourages its generator to learn the source-to-target mapping and to translate source CXRs into outputs indistinguishable from the target domain, and vice versa for the other generator and discriminator. In addition to the adversarial losses, two cycle-consistency losses, namely a forward cycle-consistency loss and a backward cycle-consistency loss, regularize the model so that a CXR translated from one domain to the other and back again returns to the original. The advantage of unpaired training is that direct correspondence between individual CXRs in the two domains is not required. Thus, unpaired training might be more robust to unseen CXRs not closely aligned with the source distribution, e.g., anteroposterior CXRs unseen in training. Please refer to Cycle-GAN [22] for more details about the loss functions and unpaired training.
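The forward and backward cycle-consistency terms can be sketched numerically as follows. The generators here are toy scalar functions standing in for trained networks, and all names (`to_soft`, `to_standard`, `lossy_to_standard`) are hypothetical. The L1 reconstruction distance follows the published Cycle-GAN formulation, which is assumed here to carry over unchanged.

```python
from typing import Callable

import numpy as np

Generator = Callable[[np.ndarray], np.ndarray]


def l1(a: np.ndarray, b: np.ndarray) -> float:
    """Mean absolute difference between two images."""
    return float(np.mean(np.abs(a - b)))


def cycle_consistency_loss(g_xy: Generator, g_yx: Generator,
                           x: np.ndarray, y: np.ndarray) -> float:
    """Forward (x -> y -> x) plus backward (y -> x -> y) reconstruction error."""
    forward = l1(g_yx(g_xy(x)), x)
    backward = l1(g_xy(g_yx(y)), y)
    return forward + backward


# Toy stand-ins for the two generators (not real networks).
def to_soft(img: np.ndarray) -> np.ndarray:
    return img * 0.5


def to_standard(img: np.ndarray) -> np.ndarray:
    return img * 2.0  # exact inverse of to_soft


def lossy_to_standard(img: np.ndarray) -> np.ndarray:
    return img * 1.5  # imperfect inverse of to_soft


x = np.full((4, 4), 0.8)  # unpaired sample from the standard CXR domain
y = np.full((4, 4), 0.4)  # unpaired sample from the soft-tissue domain
consistent = cycle_consistency_loss(to_soft, to_standard, x, y)
inconsistent = cycle_consistency_loss(to_soft, lossy_to_standard, x, y)
```

Note that `x` and `y` never need to depict the same patient: each term compares an image only with its own round-trip reconstruction, which is what makes unpaired training possible.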
3 Experimental Results
In this study, we experiment with two different datasets. First, we train and evaluate the image-to-image translation models for bone suppression on a dataset of 1,867 anonymized dual-energy PA chest radiographs collected from the picture archiving and communication system (PACS) of our institute. This dataset is randomly split into training, validation, and test sets at a ratio of 7:1:2. We evaluate the models with two objective image quality metrics, namely the Structural Similarity Index (SSIM) [20] and the Peak Signal-to-Noise Ratio (PSNR) [11]. Then, we use the models trained on this dataset to generate bone suppressed radiographs on a subset (https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data) of the NIH chest X-ray dataset [19] with both PA and AP radiographs, containing 8,525 normal radiographs and 17,159 radiographs with abnormalities. Among the radiographs in this second dataset [12], 1,532 normal and 3,000 abnormal images are used to evaluate the binary classification performance of the bone suppression models in terms of AUC (the area under the receiver operating characteristic curve). The two datasets are denoted as the “dual-energy dataset” and the “standard dataset”, respectively. We set the input and output image size to 512 pixels as a trade-off between better image quality and affordable computational load for both paired and unpaired training and testing. The workflow is shown in Figure 2.
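The two image quality metrics can be sketched as below, assuming images scaled to [0, 1]. PSNR follows its usual definition. The SSIM function is a simplified single-window (global) variant for illustration only: standard SSIM averages the same statistic over local windows, and the text does not state which implementation or window settings were used.

```python
import numpy as np


def psnr(ref: np.ndarray, test: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))


def global_ssim(ref: np.ndarray, test: np.ndarray,
                data_range: float = 1.0) -> float:
    """SSIM computed once over the whole image (simplified, no local windows)."""
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the SSIM paper
    c2 = (0.03 * data_range) ** 2
    mu_r, mu_t = ref.mean(), test.mean()
    var_r, var_t = ref.var(), test.var()
    cov = np.mean((ref - mu_r) * (test - mu_t))
    numerator = (2 * mu_r * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_r ** 2 + mu_t ** 2 + c1) * (var_r + var_t + c2)
    return float(numerator / denominator)


# Toy example: a smooth gradient and a uniformly brightened copy.
ref = np.linspace(0.0, 0.8, 64).reshape(8, 8)
shifted = ref + 0.1
psnr_value = psnr(ref, shifted)
ssim_value = global_ssim(ref, shifted)
```

Both metrics need a ground-truth soft-tissue image as `ref`, which is why they apply only where dual-energy acquisitions are available.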
3.2.1 Quantitative Results
We quantitatively evaluate the quality of the generated bone suppressed radiographs by comparing them with the soft-tissue-only images of the “dual-energy” test set, and then evaluate normal versus abnormal binary classification (using VGG-19 [13] as the classifier) on the “standard” test set using the generated images. The results are shown in Table 1. The framework trained with paired images slightly outperforms its unpaired counterpart on the dual-energy dataset. However, when the framework is extended to the standard dataset, which contains both PA and AP radiographs, unpaired training shows better generalization ability, as indicated by its higher classification AUC compared with paired training. We could not evaluate SSIM and PSNR on the standard dataset because ground-truth soft-tissue images are unavailable for it.
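The AUC used for the classification comparison can be computed directly from classifier scores. The sketch below uses the pairwise (Mann-Whitney) formulation: the fraction of abnormal/normal pairs in which the abnormal image receives the higher score, with ties counted as half. The function name and the toy scores are hypothetical and do not reproduce any result from Table 1.

```python
import numpy as np


def roc_auc(labels, scores) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for abnormal, 0 for normal. scores: classifier outputs,
    where higher means more likely abnormal.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present")
    # Compare every abnormal score with every normal score.
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


# Made-up scores for two normal and two abnormal radiographs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in memory, which is acceptable for a test set of a few thousand images; a rank-based implementation would be preferable for much larger sets.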
3.2.2 Qualitative Results
We show some examples of bone suppressed CXRs generated by the paired and unpaired training frameworks on the two datasets in Figure 3. As can be seen from the figure, both paired and unpaired training of GANs are able to generate high-quality bone suppressed CXRs for PA CXRs. We find that the GAN models introduce minimal motion artifacts compared with the dual-energy subtraction technique. The reason is that only a small portion of the training data in the DE dataset contains motion artifacts, and the image-to-image translation models tend to learn the dominant characteristics of the entire data distribution. This property of adversarial learning models can be considered another main advantage of automatic bone suppression, in addition to lower radiation exposure. The visualized results on AP CXRs also show the superiority of Cycle-GAN trained with unpaired CXRs over Pix2Pix trained with paired data. A possible reason is that Cycle-GAN is not strictly constrained by paired CXRs during training, leading to better generalization to unseen AP radiographs than Pix2Pix.
We proposed to use generative adversarial networks to learn to suppress bone structures on chest radiographs. Experimental evaluations on two different NIH chest X-ray datasets validate the effectiveness of the framework in suppressing bones. The framework trained with unpaired posteroanterior radiographs generalized better to unseen anteroposterior radiographs, showing great potential to facilitate image interpretation in clinical scenarios where both PA and AP radiographs exist. As a proof of concept, we focused our evaluations on bone suppression, but the framework can be readily extended to a wider range of applications, such as bone fracture or lesion detection on bone images generated using the adversarial learning method.
This research was supported by the Intramural Research Program of the National Institutes of Health Clinical Center and by the Ping An Technology Co., Ltd. through a Cooperative Research and Development Agreement. The authors thank NVIDIA for GPU donation.
[1] (2019) When does bone suppression and lung field segmentation improve chest x-ray disease classification? In 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pp. 1362–1366.
[2] (2014) Bone suppression in chest radiographs by means of anatomically specific multiple massive-training ANNs combined with total variation minimization smoothing and consistency processing. In Computational Intelligence in Biomedical Imaging, pp. 211–235.
[3] (2019) Image to images translation for multi-task organ segmentation and bone suppression in chest x-ray radiography. arXiv preprint arXiv:1906.10089.
[4] (2014) Generative adversarial nets. In Advances in Neural Information Processing Systems.
[5] (2014) Bone suppression technique for chest radiographs. In Medical Imaging 2014: Image Perception, Observer Performance, and Technology Assessment, Vol. 9037, pp. 90370D.
[6] Image-to-image translation with conditional adversarial networks. pp. 1125–1134.
[7] (2012) Improved detection of focal pneumonia by chest radiography with bone suppression imaging. European Radiology 22 (12), pp. 2729–2735.
[8] (2008) Dual energy subtraction and temporal subtraction chest radiography. Journal of Thoracic Imaging 23 (2), pp. 77–85.
[9] (2018) Learning bone suppression from dual energy chest x-rays using adversarial networks. arXiv preprint arXiv:1811.02628.
[10] (2014) Bone suppression increases the visibility of invasive pulmonary aspergillosis in chest radiographs. PLoS ONE 9 (10), pp. e108551.
[11] (2010) Study of subjective and objective quality assessment of video. IEEE Transactions on Image Processing 19 (6), pp. 1427–1441.
[12] Augmenting the national institutes of health chest radiograph dataset with expert annotations of possible pneumonia. Radiology: Artificial Intelligence 1 (1), pp. e180041.
[13] (2015) Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations.
[14] (2019) CT-realistic data augmentation using generative adversarial network for robust lymph node segmentation. In Medical Imaging 2019: Computer-Aided Diagnosis, Vol. 10950, pp. 109503V.
[15] (2019) XLSor: a robust and accurate lung segmentor on chest x-rays using criss-cross attention and customized radiorealistic abnormalities generation. In Proceedings of The 2nd International Conference on Medical Imaging with Deep Learning, Vol. 102, pp. 457–467.
[16] (2019) Deep adversarial one-class learning for normal and abnormal chest radiograph classification. In Medical Imaging 2019: Computer-Aided Diagnosis, Vol. 10950, pp. 1095018.
[17] (2019) TUNA-Net: task-oriented unsupervised adversarial network for disease recognition in cross-domain chest x-rays. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 431–440.
[18] (2018) Attention-guided curriculum learning for weakly supervised classification and localization of thoracic diseases on chest radiographs. In Machine Learning in Medical Imaging, pp. 249–258.
[19] (2017) ChestX-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2097–2106.
[20] (2004) Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 13 (4), pp. 600–612.
[21] Cascade of multi-scale convolutional neural networks for bone suppression of chest radiographs in gradient domain. Medical Image Analysis 35, pp. 421–433.
[22] (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2223–2232.