Automatic classification between COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy on chest X-ray image: combination of data augmentation methods in a small dataset
Purpose: To develop and validate a computer-aided diagnosis (CADx) system for classification between COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy on chest X-ray (CXR) images. Because CXR datasets related to COVID-19 were small, transfer learning with pretrained models and a combination of data augmentation methods were used to improve the accuracy and robustness of the CADx system. Materials and Methods: From two public datasets, 1248 CXR images were obtained, which included 215, 533, and 500 CXR images of COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy, respectively. The proposed CADx system utilized VGG16 as a pretrained model and a combination of the conventional method and mixup as data augmentation methods. Other types of pretrained models were used for comparison with the VGG16-based model. In addition, single type or no data augmentation methods were also evaluated. The data were split into training, validation, and test sets for building and evaluating the CADx system. Three-category accuracy was evaluated on a test set of 125 CXR images. Results: The three-category accuracy of the CADx system was 83.6% for COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy. In addition, the sensitivity for COVID-19 pneumonia was more than 90%. The combination of the conventional method and mixup was more useful than single type or no data augmentation methods. Conclusions: It was possible to build an accurate CADx system for the 3-category classification of COVID-19 pneumonia, non-COVID-19 pneumonia, and the healthy.
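For readers unfamiliar with mixup, the following is a minimal NumPy sketch of the general technique, not the authors' implementation. Mixup blends pairs of training images and their one-hot labels using a weight drawn from a Beta distribution. The function name `mixup_batch`, the value `alpha=0.2`, and the toy array shapes are illustrative assumptions; the abstract does not state the settings used in the study.

```python
import numpy as np


def mixup_batch(images, labels, alpha=0.2, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch.

    images: float array of shape (N, H, W, C)
    labels: one-hot float array of shape (N, num_classes)
    alpha:  Beta distribution parameter (assumed value, not from the paper)
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = float(rng.beta(alpha, alpha))   # mixing weight in [0, 1]
    perm = rng.permutation(len(images))   # partner index for each sample
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_images, mixed_labels, lam


# Toy batch: 4 random "images" and 3 classes
# (hypothetical order: COVID-19 pneumonia, non-COVID-19 pneumonia, healthy).
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))
labels = np.eye(3)[[0, 1, 2, 1]]
mixed_images, mixed_labels, lam = mixup_batch(images, labels, alpha=0.2, rng=rng)
```

Because the labels are blended with the same weight as the images, each mixed label remains a valid probability vector, so it can be used directly with a cross-entropy loss. In a pipeline like the one described, such a step would typically be applied to each training batch after conventional augmentation, but that ordering is also an assumption here.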