Medical image analysis via deep learning is heavily reliant on large-scale, labeled datasets Shen et al. (2017). Semi-supervised learning (SSL) has gained attention as an alternative to fully-supervised learning: SSL targets tasks whose datasets contain unlabeled samples, and many works have pursued this direction Anwar et al. (2018); Liu et al. (2020); Haque et al. (2021); Gu et al. (2019). However, in medical imaging, datasets that are fully labeled but extremely small are more common. For example, X-rays are most often taken for diagnostic purposes, so they are generally accompanied by labels; the difficulty lies in aggregating these X-rays.
Restricted, fully-supervised datasets can be aided by deep generative models, which can produce artificial samples to supplement fully-supervised training. Generative Adversarial Networks Goodfellow et al. (2014) are a prominent family of generative models in which a generator and a discriminator are trained adversarially against each other: the generator learns to replicate the data distribution and produce realistic images, while the discriminator learns to distinguish real from fake samples. Although GANs are typically structured as a 2-player game, only very few papers add a third neural classifier to the adversarial framework Li et al. (2017); Shen et al. (2018). To our best knowledge, none utilize classifier predictions on unlabeled generations to update the generator, especially using pseudo-labels Lee and others (2013). Also to our best knowledge, no methods in medical imaging use GANs to generate artificial samples for semi-supervised classification.
We introduce 3N-GAN, or 3 Network Generative Adversarial Network, for semi-supervised classification of X-ray images using generated samples as supplemental data. Our results confirm that our 3-player adversarial framework with innovative adversarial loss functions improves classifier and generator performance over baseline models at varying levels of supervision.
Figure 1 displays the adversarial training procedure for 3N-GAN. This method can be viewed as an extension of Haque (2021), which is only published as a 2-page abstract. Our method, 3N-GAN, introduces multiple significant improvements, such as truly incorporating the classifier into the adversarial framework, which the previous method did not attempt.
All 3 networks are trained simultaneously. The discriminator is trained conventionally. The generator is given a random latent vector as input and outputs fake images. Contrary to many semi-supervised GAN classification methods Salimans et al. (2016), we separate the discriminator and classifier. Achieving two loosely related tasks, such as discrimination and classification, with a single network may not be optimal because the network must approximate two distributions Haque (2021).
The classifier is simultaneously trained on real images and artificial images produced by the generator. Generated images act as supplemental data, increasing the amount of data available to the classifier. GAN-generated samples do not have labels, requiring the use of semi-supervised algorithms. For semi-supervised classification, we use both pseudo-labeling Lee and others (2013) and a KL divergence loss, which has not previously been used with GAN-based semi-supervised classification. The KL divergence loss enforces consistency by penalizing divergence between predictions on real and generated samples.
The classification loss objective

$$\mathcal{L}_C = \mathrm{CE}\big(y, C(x)\big) + \lambda_1\,\mathbb{1}\!\left[\max C(G(z)) > \tau\right]\mathrm{CE}\big(\hat{y}, C(G(z))\big) + \lambda_2\,D_{\mathrm{KL}}\big(C(x)\,\|\,C(G(z))\big)$$

has a supervised component for the labels ($y$) and predictions ($C(x)$) of real samples, an unsupervised component for the predictions on generated samples ($C(G(z))$) and their pseudo-labels ($\hat{y}$), and a second unsupervised component computing the KL divergence between the predictions on real samples and GAN-generated samples. $\lambda_1$ and $\lambda_2$ are unsupervised loss weights and $\tau$ is the pseudo-labeling threshold.
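As an illustrative sketch (not the authors' code), the two unsupervised terms can be written in NumPy, with `probs_real` and `probs_fake` denoting softmax outputs on real and generated batches; all function and variable names here are assumptions:

```python
import numpy as np

def pseudo_label_term(probs_fake, tau=0.9):
    """Cross-entropy against hard pseudo-labels (the argmax class),
    keeping only predictions whose confidence exceeds tau."""
    conf = probs_fake.max(axis=1)
    mask = conf > tau
    if not mask.any():
        return 0.0
    # CE against the argmax pseudo-label reduces to -log(max prob)
    return float(-np.log(conf[mask]).mean())

def kl_term(probs_real, probs_fake, eps=1e-8):
    """KL divergence between predictions on real and generated
    samples, averaged over the batch (consistency penalty)."""
    ratio = np.log(probs_real + eps) - np.log(probs_fake + eps)
    return float((probs_real * ratio).sum(axis=1).mean())
```

Combined with the supervised cross-entropy on labeled real samples, the full objective would then be of the form `ce_real + lam1 * pseudo_label_term(...) + lam2 * kl_term(...)`, with the weights set by the tuning experiments reported below.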
To incorporate the classifier into the adversarial framework, we update the generator with the unsupervised classification loss. The generator thus has a discriminator adversarial loss and a classifier adversarial loss. The generator objective

$$\mathcal{L}_G = \mathrm{BCE}\big(1, D(G(z))\big) + \lambda_1\,\mathbb{1}\!\left[\max C(G(z)) > \tau\right]\mathrm{CE}\big(\hat{y}, C(G(z))\big)$$

has the discriminator adversarial loss based on the discriminator predictions ($D(G(z))$) on generated images and the classifier adversarial loss, which is identical to the unsupervised classification loss from $\mathcal{L}_C$.
The classifier adversarial loss trains the generator to generate images whose class can be discerned accurately. Certain features that distinguish between classes, such as lung inflammation, may therefore be produced more accurately in the generator's samples. The classifier and generator provide feedback to one another, completing the 3-player adversarial framework. To our best knowledge, updating the generator from classification predictions on pseudo-labeled generations is entirely novel and does not appear in the literature.
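A minimal NumPy sketch of this combined generator objective, assuming sigmoid discriminator outputs `d_fake` and softmax classifier outputs `probs_fake` on a generated batch (names and reductions are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def generator_objective(d_fake, probs_fake, lam=0.3, tau=0.9, eps=1e-8):
    """Non-saturating adversarial loss against the discriminator plus
    the confidence-masked pseudo-label loss from the classifier."""
    adv = float(-np.log(d_fake + eps).mean())   # fool the discriminator
    conf = probs_fake.max(axis=1)
    mask = conf > tau
    # CE against the argmax pseudo-label reduces to -log(max prob)
    cls = float(-np.log(conf[mask] + eps).mean()) if mask.any() else 0.0
    return adv + lam * cls
```

Note that the classifier term is active only for confident generations, so the generator receives class-level feedback exactly where pseudo-labels are trusted.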
3 Results and Conclusion
We experiment with a binary pneumonia classification dataset (CheX) Kermany et al. (2018). The dataset contains a total of 5,863 X-rays, with 624 held out as an external validation set. We perform experiments at various quantities of training data: 200, 500, 750, 1000, and 2000 X-rays (evenly split between classes). We use the DCGAN Radford et al. (2015) implementation for our generator and discriminator; our classifier architecture is the DCGAN discriminator. All inputs were normalized, gray-scaled, and resized to 64 × 64 × 1 before training. Each experiment was trained for 100 epochs with mini-batch size 10 and repeated 5 times. We compare against a vanilla classifier, a multi-tasking discriminator, and EC-GAN. We set the pseudo-labeling threshold $\tau = 0.9$ and the unsupervised loss weights $\lambda_1 = 0.3$ and $\lambda_2 = 0.01$ through tuning experiments.
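The preprocessing step (normalize, grayscale, resize to 64 × 64) can be sketched in pure NumPy; the nearest-neighbour resize and the [-1, 1] scaling below are stand-ins for whatever interpolation and normalization the original pipeline used, so treat the details as assumptions:

```python
import numpy as np

def preprocess(img):
    """Grayscale, resize to 64x64 (nearest-neighbour), and scale
    pixel values from [0, 255] to [-1, 1]."""
    img = np.asarray(img, dtype=np.float32)
    if img.ndim == 3:                # HxWxC -> single grayscale channel
        img = img.mean(axis=2)
    h, w = img.shape
    ys = np.arange(64) * h // 64     # nearest-neighbour row indices
    xs = np.arange(64) * w // 64     # nearest-neighbour column indices
    img = img[np.ix_(ys, xs)]
    return img / 127.5 - 1.0         # normalize to [-1, 1]
```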
Table 1: Pneumonia classification accuracy (%) at each training-set size.

| Model | 200 | 500 | 750 | 1000 | 2000 |
|---|---|---|---|---|---|
| Multi-Tasking Discriminator Salimans et al. (2016) | 89.06 | 92.12 | 91.85 | 94.12 | 94.85 |
| EC-GAN Haque (2021) | 90.30 | 92.44 | 93.10 | 94.95 | 95.35 |
Table 1 displays the pneumonia-diagnosis accuracy of each model at the 5 different data settings. The results confirm the superiority of 3N-GAN over the other methods: our model trained with just 200 images almost reaches the performance of a vanilla classifier trained with 2000 images. Compared to EC-GAN, we achieve slightly higher accuracy (about 1% in raw score), demonstrating the effectiveness of the adversarial classifier loss. Figure 2 compares fake X-rays from DCGAN and 3N-GAN to real X-rays. Visually, 3N-GAN generations are less blurry and more defined, underscoring the importance of our contributions. Moreover, key structures such as ribs and organs are sharp and present in 3N-GAN generations, which resemble real X-rays more closely than DCGAN's.
Ultimately, our preliminary results are quite promising, as 3N-GAN outperforms related methods in both diagnostic classification and X-ray image generation. In 3N-GAN, classifier performance increases because the supplemental images supplied by the generator are of higher quality. Our future work will include evaluations against more methods and additional changes to improve image generation.
4 Potential Negative Impacts
Potential negative impacts of our work include the ethical violations associated with deepfakes. GANs have been used to create deepfakes Korshunov and Marcel (2018), AI-generated media that replicate and falsify a particular individual's likeness. Our GAN could be exploited to generate fake X-ray scans for patients, and synthetic X-rays could be used to fabricate diagnostic results by patients or medical professionals. Fairness issues would also arise if our work were used in a clinical setting, such as determining which patients receive priority access to deep learning-based diagnostics. Regarding privacy, collecting patient data for deep learning-based methods poses patient privacy concerns. Another major concern is false-positive or false-negative diagnoses on X-ray scans. While recent advancements suggest deep learning-based medical imaging analysis is reaching performance levels on par with professionals, if our classification algorithm predicted incorrect diagnoses, major liability and ethical issues would arise. This is a common challenge for deep learning medical imaging diagnostic models Aggarwal et al. (2021). By our interpretation, our work does not introduce any unique negative impacts beyond the long-existing challenges of deep learning-based medical imaging analysis. However, these challenges must be addressed before algorithms such as ours can be applied in clinical practice.
- Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ digital medicine 4 (1), pp. 1–23. Cited by: §4.
- Medical image analysis using convolutional neural networks: a review. Journal of Medical Systems 42 (11), pp. 226. Cited by: §1.
- Generative Adversarial Networks. arXiv. Cited by: §1.
- CE-Net: context encoder network for 2D medical image segmentation. IEEE Transactions on Medical Imaging 38 (10), pp. 2281–2292. Cited by: §1.
- Multimix: sparingly-supervised, extreme multitask learning from medical images. In 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), pp. 693–696. Cited by: §1.
- EC-GAN: low-sample classification using semi-supervised algorithms and GANs (student abstract). Proceedings of the AAAI Conference on Artificial Intelligence 35 (18), pp. 15797–15798. Cited by: §2, Table 1.
- Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172 (5), pp. 1122–1131. Cited by: §3.
- Deepfakes: a new threat to face recognition? Assessment and detection. arXiv preprint arXiv:1812.08685. Cited by: §4.
- Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks. In Workshop on Challenges in Representation Learning, ICML, Vol. 3, pp. 896. Cited by: §1, §2.
- Triple generative adversarial nets. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 4091–4101. Cited by: §1.
- Semi-supervised medical image classification with relation-driven self-ensembling model. IEEE transactions on medical imaging 39 (11), pp. 3429–3440. Cited by: §1.
- Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434. Cited by: §3.
- Improved techniques for training GANs. In Advances in Neural Information Processing Systems, pp. 2234–2242. Cited by: §2, Table 1.
- Deep learning in medical image analysis. Annual review of biomedical engineering 19, pp. 221–248. Cited by: §1.
- FaceID-GAN: learning a symmetry three-player GAN for identity-preserving face synthesis. In , pp. 821–830. Cited by: §1.