COVIDLite: A depth-wise separable deep neural network with white balance and CLAHE for detection of COVID-19

06/19/2020, by Manu Siddhartha, et al.

Background and Objective: Currently, the whole world is facing the pandemic of a novel coronavirus disease, known as COVID-19, which has spread to more than 200 countries, with around 3.3 million active cases and approximately 440,000 deaths. Due to the rapid increase in the number of cases and the limited supply of testing kits, an alternative diagnostic method is necessary for containing the spread of COVID-19 at an early stage and reducing the death count. To this end, we propose a deep neural network based diagnostic method, easily integrable with mobile devices, for the detection of COVID-19 and viral pneumonia from chest X-ray (CXR) images. Methods: In this study, we propose a method named COVIDLite, which combines white balance followed by Contrast Limited Adaptive Histogram Equalization (CLAHE) with a depth-wise separable convolutional neural network (DSCNN). In this method, white balance followed by CLAHE is used as an image preprocessing step to enhance the visibility of CXR images, and a DSCNN trained with sparse categorical cross entropy is used for image classification, with fewer parameters and a significantly lighter size, i.e., 8.4 MB without quantization. Results: The proposed COVIDLite method improved performance over a vanilla DSCNN with no preprocessing, achieving an accuracy of 99.58% for binary classification and 96.43% for multi-class classification, outperforming other state-of-the-art methods. Conclusion: Our proposed method, COVIDLite, achieved exceptional results on various performance metrics. With detailed model interpretations, COVIDLite can assist radiologists in detecting COVID-19 patients from CXR images and can reduce the diagnosis time significantly.


1 Introduction

COVID-19 started to spread in December 2019, when a large number of pneumonia cases of unknown cause occurred in Wuhan, Hubei, China, whose clinical characteristics closely resembled viral pneumonia. Detailed analysis of patient samples identified a novel coronavirus, and the resulting disease was named COVID-19. Since then, the number of cases has increased extremely fast, with the death count rising at an alarming rate. A critical observation in fatal COVID-19 cases was that all the patients had developed severe pneumonia. The primary strategy for containing the spread and reducing the death count is early diagnosis of COVID-19 cases, so that patients can be quarantined in time and prevented from spreading the virus to others  Adhikari et al. (2020).

The first and most common diagnostic test for COVID-19 is the real-time reverse transcription-polymerase chain reaction (RT-PCR). However, the RT-PCR test has three significant issues. First, it is a time-consuming process  Yi et al. (2020); second, it has a relatively low sensitivity of 60%–70%  Udugama et al. (2020), meaning that around 30% of actual positive cases may be diagnosed as negative, which is a critical issue; third, it requires the availability of commercial kits  Udugama et al. (2020). In many of those misdiagnosed cases, however, symptoms can be detected when examined through radiological images  Kanne et al. (2020); Xie et al. (2020). Alternative diagnostic tests that are relatively accurate, more sensitive, and faster in returning results are therefore significant for the early diagnosis of COVID-19 patients. According to some researchers, combining features of radiological images with other laboratory tests may help in early-stage identification of COVID-19  Kong and Agarwal (2020); Lee et al. (2020); Shi et al. (2020); Zhao et al. (2020); Li and Xia (2020). In several studies, researchers found useful patterns in chest CT and CXR images that were crucial in the identification of COVID-19  Kong and Agarwal (2020); Zhao et al. (2020); Yoon et al. (2020). Chest CT is more sensitive to COVID-19, but we employ CXR in this research due to its easier availability, lower cost, and shorter waiting time for results.

Convolutional Neural Networks (CNNs) have recently been shown to exceed medical practitioners' performance in diagnosing pneumonia from CXR images, owing to advancements in processing power and the availability of large image datasets. In  Stephen et al. (2019), researchers used a seven-layer deep convolutional neural network (DCNN) for detecting pneumonia from CXR images. They employed the ReLU activation function between hidden layers and a sigmoid activation function for the classification task, achieving higher validation accuracy for an image size of 200 x 200. In  Chhikara et al. (2020), a DCNN with transfer learning was used to detect pneumonia from CXR images. The researchers applied advanced image processing techniques such as gamma correction, equalization, filtering, and compression to enhance image quality, which improved the overall performance of their model.

Recently, a significant amount of research has been conducted on diagnosing COVID-19 from CXR images using DCNNs, with exceptional results.  Sethy and Behera (2020) used a transfer learning approach for detecting COVID-19 from CXR images, employing the pre-trained ResNet-50 network for extracting deep features and a Support Vector Machine (SVM) for classification, resulting in a higher mean accuracy of 95.33%. In another study,  Loey et al. (2020), researchers used a generative adversarial network (GAN) to augment the training images of four class types (COVID-19, normal, viral pneumonia, and bacterial pneumonia), while pre-trained networks such as AlexNet, GoogLeNet, and ResNet18 were used for the classification task. They increased the size of their existing dataset by 30 times, which resulted in higher accuracy and sensitivity of the models.  Hemdan et al. (2020) proposed a new framework named COVIDX-Net, capable of detecting COVID-19 from CXR images using seven pre-trained networks, namely VGG19, DenseNet121, ResNetV2, InceptionV3, InceptionResNetV2, Xception, and MobileNetV2.

In this paper, we develop a hybrid method comprising a deep CNN model based on depth-wise separable convolutions, with white balance followed by CLAHE for image enhancement as a data preprocessing step. In previous studies, novel data preprocessing techniques have proved useful in many applications, such as hand gesture recognition using radar systems  Hazra and Santra (2018), hand gesture recognition using ultrasound imaging, and speech recognition  Picone (1993), among others. DSCNNs, first used in the Xception network, have shown exceptional results in the classification of natural images. In the case of medical images, however, considerable adjustment is required, as medical images may come in three dimensions (RGB), four dimensions, or two dimensions (grayscale), whereas natural images are mostly available in two-dimensional RGB format. Another major difference is in terms of intensity variations: natural images are generally recognized by edges, basic shapes, correlations among neighboring pixels, etc., and individual pixel intensities or brightness levels are not relevant features for their recognition. In the case of medical images, specifically CXR images, each pixel intensity is essential in locating affected regions within the image, which is crucial for detecting abnormalities. Thus, proper image preprocessing is essential for building a robust and accurate model for detecting abnormalities in medical images.

1.1 Contributions and Outline

The contribution of this study can be summarized as follows:

  1. A novel approach comprising white balance with CLAHE followed by a novel DSCNN model is proposed for the detection of COVID-19 from CXR images,

  2. The proposed method is shown to outperform a traditional DSCNN without preprocessing and other recent state-of-the-art methods in terms of accuracy in detecting COVID-19 in both 2-class and 3-class classification tasks,

  3. The proposed method is significantly lighter in terms of model size, making it favorable for deployment as a web app for generating real-time diagnoses of COVID-19 in inaccessible areas,

  4. We analyze the interpretability of the proposed model using Grad-CAM, LIME, and saliency maps, which in turn can provide key insights to radiologists for accurate diagnosis of COVID-19.

The paper is organized as follows. In Section 2, we detail depth-wise separable CNNs and the method employed in this research, and present the architecture of the proposed method. In Section 3, we describe the dataset utilized in this research, along with the performance measures used for evaluating our method. In Section 4, we present our results with a visual interpretation of the model's predictions. Section 5 discusses previous methods used for the detection of COVID-19 and compares them with the proposed method, and Section 6 concludes the paper.

2 Methodology

2.1 Depth-wise separable CNN (DSCNN)

Convolutional Neural Networks (CNNs) are inspired by the workings of the human brain's visual cortex. Key reasons for the success of CNNs are the availability of large training datasets and their ability to learn distinctive features implicitly from the input image. However, deep CNNs come with the drawback of large memory and compute requirements, which can be a deterrent for real-time applications such as on embedded platforms or in web applications. Further, DCNNs with many parameters require large datasets to avoid overfitting, which is a problem in medical imaging, where datasets are limited. One solution to this problem is the depth-wise separable convolution (DSC), first proposed by Chollet  Chollet (2017), which divides the regular convolution operation into two separate operations: a depth-wise (spatial) convolution and a sequential point-wise convolution, as shown in Fig. 1. The depth-wise convolution applies a single 2D filter to each input channel. Since one filter is applied per input channel, the number of computations is reduced significantly, as only the input channel corresponding to each filter is processed. A 1 x 1 (point-wise) convolution is then used to aggregate the results of the depth-wise convolution into a new feature map, as shown in Fig. 1(b). This results in a significant reduction in parameters and computation cost, making DSCs favorable candidates for real-time inference on mobile and embedded platforms.

The standard convolution is defined as

$\mathrm{Conv}(W, x)_{(i,j)} = \sum_{h,w,c}^{H,W,C} W_{(h,w,c)} \cdot x_{(i+h,\ j+w,\ c)}$ (1)

where $W$ is the weight kernel, $x$ is the input feature map, $H$, $W$, $C$ are the height, width, and channel dimensions, and $\mathrm{Conv}(\cdot,\cdot)$ denotes the standard convolution operation.

The point-wise, depth-wise, and depth-wise separable convolutions are defined as follows:

$\mathrm{PWConv}(W, x)_{(i,j)} = \sum_{c}^{C} W_{c} \cdot x_{(i,j,c)}$ (2)

$\mathrm{DWConv}(W, x)_{(i,j)} = \sum_{h,w}^{H,W} W_{(h,w)} \odot x_{(i+h,\ j+w)}$ (3)

$\mathrm{SepConv}(W_{p}, W_{d}, x)_{(i,j)} = \mathrm{PWConv}_{(i,j)}\big(W_{p}, \mathrm{DWConv}_{(i,j)}(W_{d}, x)\big)$ (4)

where $W_{p}$ and $W_{d}$ represent the different weights for the point-wise and depth-wise convolutions, respectively, and $\odot$ represents element-wise multiplication.

The depth-wise separable convolution drastically reduces the number of parameters compared to regular convolution, in the ratio

$\dfrac{K^{2} \cdot M + M \cdot N}{K^{2} \cdot M \cdot N} = \dfrac{1}{N} + \dfrac{1}{K^{2}}$ (5)

where $M$ denotes the number of input channels, $N$ denotes the number of output channels, and $K$ represents the kernel size. For example, with 48 input channels ($M$), 96 output channels ($N$), and a kernel of size 3 ($K = 3$), a regular convolution requires 41,472 parameters compared to only 5,040 parameters for a depth-wise separable convolution, i.e., an 87.84% reduction in parameters.
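To make the arithmetic concrete, the following minimal Python sketch (names are illustrative) reproduces the parameter counts of the example above, ignoring bias terms:

    # Parameter counts for regular vs. depth-wise separable convolution.
    M, N, K = 48, 96, 3  # input channels, output channels, kernel size

    regular = K * K * M * N        # 9 * 48 * 96 = 41,472 parameters
    separable = K * K * M + M * N  # 432 + 4,608 = 5,040 parameters

    print(regular, separable)
    print(f"{100 * (1 - separable / regular):.1f}% reduction")  # ~87.8%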

Figure 1: Comparison of (left) standard convolution and (right) depth-wise separable convolution operations.

2.2 Proposed Method

The methodology proposed in this study for detection of COVID-19 pneumonia from CXR images is depicted in Fig. 2.

Figure 2: Training methodology of the proposed neural network solution (COVIDLite)

As shown in Fig. 2, the overall methodology for training and evaluating the proposed method (COVIDLite) is described in the following steps:

Step 1: The labeled image dataset is collected from different sources and imported for further pre-processing.

Step 2: Image pre-processing is performed using white balance followed by the Contrast Limited Adaptive Histogram Equalization (CLAHE) technique. The white balance and CLAHE methods are described below.

White Balance: White balance is an image processing operation applied to adjust the color fidelity of a digital image. Due to low lighting conditions, some parts of medical images appear dark, and image-capturing equipment does not detect light as precisely as the human eye does. Image correction therefore helps ensure that the final image represents the colors of the natural scene. The objective of this operation is to enhance the visibility of the image so that DCNNs can extract useful features from it. The white balance algorithm adjusts the colors of the active layers of the image by stretching the red, green, and blue channels independently. To do this, pixel colors at the ends of the three channels that are used by only 0.05% of the pixels in the image are discarded, while the remaining color range is stretched. As a result, pixel colors that appear infrequently at the ends of a channel cannot negatively influence the upper and lower bounds used for stretching  [41]  [42]. In this solution, we implemented the white balance algorithm in Python using the NumPy and OpenCV libraries.

The white balance operation can be summarized, per channel, as

$x'_{c} = \mathrm{Clip}\!\left(\dfrac{x_{c} - P_{0.05}(x_{c})}{P_{99.95}(x_{c}) - P_{0.05}(x_{c})} \times 255,\ 0,\ 255\right)$ (6)

where $P_{q}(x_{c})$ represents taking the $q$-th percentile of channel $x_{c}$, the $\mathrm{Clip}(\cdot, \min, \max)$ operation performs saturation within the min and max values, and $x_{c}$, $x'_{c}$ denote the input and updated channel pixel values, respectively.
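A minimal NumPy sketch of this percentile stretch is given below; the 0.05% cut-off follows the description above, while the exact implementation details the authors used may differ:

    import numpy as np

    def white_balance(img, discard=0.05):
        """Stretch each channel after discarding the darkest/brightest
        `discard` percent of its pixels (cf. Eq. (6))."""
        out = np.zeros_like(img, dtype=np.uint8)
        for c in range(img.shape[2]):
            channel = img[:, :, c].astype(np.float32)
            lo = np.percentile(channel, discard)          # lower cut-off
            hi = np.percentile(channel, 100.0 - discard)  # upper cut-off
            stretched = (channel - lo) * 255.0 / max(hi - lo, 1e-6)
            out[:, :, c] = np.clip(stretched, 0, 255).astype(np.uint8)
        return out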

CLAHE: CLAHE is an effective method for increasing the contrast of an image. It is an improved version of adaptive histogram equalization (AHE). Histogram equalization is a simple method of enhancing image contrast by spreading out the intensity range of the image, i.e., stretching out its most frequent intensity values. Stretching the intensity values changes the natural brightness of the input image and introduces undesirable noise  Pizer et al. (1987). In AHE, the input image is split into several small images, known as tiles. The histogram of each tile, corresponding to a different section of the image, is computed and used to derive an intensity remapping function for that tile. This method introduces noise into the image due to over-amplification. CLAHE works in the same way as AHE, but it clips the histogram at a specific value to limit the amplification before computing the cumulative distribution function. The clipped part of the histogram is then redistributed over the histogram, as shown in Fig. 3  Pizer et al. (1987). In a previous study  Pizer (1990), CLAHE showed exceptional results in enhancing chest CT images and was considered useful for examining a wide variety of medical images. The computation of CLAHE  Wong et al. (2014) is performed as

$g = g_{\min} + (g_{\max} - g_{\min}) \cdot P(f)$ (7)

where $g$ represents the pixel value after applying CLAHE, $g_{\max}$ and $g_{\min}$ represent the maximum and minimum pixel values of the image, respectively, and $P(f)$ represents the cumulative probability distribution function of the input pixel value $f$.
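Since the pipeline is implemented with OpenCV, CLAHE can be applied with cv2.createCLAHE; the clip limit and tile grid size below are assumed values, as the paper does not report them:

    import cv2

    def apply_clahe(img_bgr, clip_limit=2.0, tile_grid_size=(8, 8)):
        """Apply CLAHE to the lightness channel of a BGR image."""
        lab = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2LAB)
        l, a, b = cv2.split(lab)
        clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)
        return cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)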

Figure 3: Working principle of CLAHE showing original and clipped histogram

The preprocessed images are presented in Fig. 4, which shows that white balance followed by CLAHE significantly enhances the clarity of the CXR images, revealing details that were not well represented in the originals.

Figure 4: Pre-processed images after applying white balance followed by CLAHE

Step 3: This step performs image normalization followed by resizing. Normalization is an important step that brings the pixel values of each image onto a uniform scale and helps in faster convergence while training a deep neural network. Here, the image pixels are divided by 255, which scales the pixel values to the range (0,1). Further, the image is resized to 224 x 224 pixels before being fed to the input layer of the DCNN.
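A one-function sketch of this step, assuming the 224 x 224 input size from Table 1:

    import cv2
    import numpy as np

    def normalize_and_resize(img, size=(224, 224)):
        """Resize to the network input size and scale pixels to [0, 1]."""
        resized = cv2.resize(img, size, interpolation=cv2.INTER_AREA)
        return resized.astype(np.float32) / 255.0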

Step 4: The dataset is split in the ratio of 80%-20% for training and testing, respectively.

Step 5: The proposed DSC network, comprising 2 standard convolution layers and 12 depth-wise separable convolution layers, is trained. The standard convolution layers are followed by a max-pooling layer, whereas the depth-wise separable convolution layers are followed by batch normalization and a max-pooling layer. At the end, fully connected dense layers are followed by a softmax output layer.

Step 6: K-fold cross-validation of the DCNN is performed with 100 epochs per fold. The training set is divided into k folds; the network is trained on k-1 folds while validation is performed on the remaining fold. This step is repeated k times (5 in our case), and the network performance is reported as the average over the k folds.
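An illustrative 5-fold cross-validation loop in the spirit of this step; `build_model`, `X_train`, and `y_train` are hypothetical placeholders for the network factory and the preprocessed training data:

    import numpy as np
    from sklearn.model_selection import KFold

    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    fold_scores = []
    for train_idx, val_idx in kf.split(X_train):
        model = build_model()  # freshly initialized network per fold
        model.fit(X_train[train_idx], y_train[train_idx],
                  validation_data=(X_train[val_idx], y_train[val_idx]),
                  epochs=100, batch_size=8, verbose=0)
        _, acc = model.evaluate(X_train[val_idx], y_train[val_idx], verbose=0)
        fold_scores.append(acc)
    print(f"mean CV accuracy: {np.mean(fold_scores):.4f} "
          f"(+/- {np.std(fold_scores):.4f})")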

Step 7: The network is tested on the test data and evaluated using different performance metrics: the confusion matrix, accuracy, precision, sensitivity, specificity, F1-score, and the average area under the curve (AUC) for each class type. Furthermore, a t-SNE plot of the images and saliency maps are computed for interpretability of the neural network.

2.3 Architecture of proposed COVIDLite

The proposed COVIDLite network comprises a combination of two types of convolution blocks. The input to the network is a 224 x 224 x 3 CXR image, followed by 18 convolution layers. The kernel size used in the network is chosen to be small, i.e., 3 x 3. A 3 x 3 kernel was also used in the XceptionNet architecture  Chollet (2017), which achieved higher accuracy than the InceptionV3 network  Szegedy et al. (2016) when tested on a large image classification dataset comprising 350 million images and 17,000 classes, in addition to the ImageNet dataset  Deng et al. (2009). In the first block, two convolution layers are followed by a max-pooling layer of size 2 x 2. In the second type of block, two depth-wise separable convolution layers are followed by batch normalization and a max-pooling layer of size 2 x 2. In traditional deep neural networks, a high learning rate can cause gradients to explode or vanish, or leave the network stuck in sub-optimal local minima; batch normalization mitigates this issue  Ioffe and Szegedy (2015). Another significant problem with deep neural networks is overfitting, which we address by employing dropout layers  Srivastava et al. (2014). Dropout is applied after the sixth separable convolution layer with a ratio of 0.2, and the ratio then ranges from 0.7 down to 0.2 across the fully connected dense layers. The detailed architecture of the deep CNN model used in the proposed method is shown in Tab. 1.

Layer Name O/P shape Parameters Kernel Size Dropout Filters
Input (224,224,3) 0 - 0 -
Conv2D x 2 (224,224,16) 2768 3 x 3 0 4
Maxpool2D (112,112,16) 0 - 0 -
Separable Conv2D x 2 (112,112,32) 2032 3 x 3 0 32
Batch Norm. (112,112,32) 128 - 0 -
Maxpool2D (56,56,32) 0 - 0 -
Separable Conv2D x 2 (56,56,64) 7136 3 x 3 0 64
Batch Norm. (56,56,64) 256 - 0 -
Maxpool2D (28,28,64) 0 - 0.2 -
Separable Conv2D x 2 (28,28,128) 26560 3 x 3 0 128
Batch Norm. (28,28,128) 512 - 0 -
Maxpool2D (14,14,128) 0 - 0.2 -
Separable Conv2D x 2 (14,14,256) 102272 3 x 3 0 256
Batch Norm. (14,14,256) 1024 - 0 -
Maxpool2D (7,7,256) 0 - 0.2 -
Separable Conv2D x 2 (7,7,256) 136192 3 x 3 0 256
Batch Norm. (7,7,256) 1024 - 0 -
Maxpool2D (3,3,256) 0 - 0.2 -
Separable Conv2D x 2 (3,3,512) 401152 3 x 3 0 512
Batch Norm. (3,3,512) 2048 - 0 -
Maxpool2D (1,1,512) 0 - 0.2 -
FC1 (ReLU) (512) 262656 - 0.7 512
FC2 (ReLU) (128) 65664 - 0.5 128
FC3 (ReLU) (64) 8256 - 0.3 64
FC4 (ReLU) (32) 2080 - 0.2 32
FC5 (Softmax) (3) 33 - 0 3
Table 1: Detailed neural network architecture of the proposed method (COVIDLite)
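The following condensed Keras sketch is consistent with the layer ordering, filter counts, and dropout placement in Table 1; padding, activation placement, and pooling details are assumptions where the table leaves them unstated:

    from tensorflow.keras import layers, models

    def build_covidlite(input_shape=(224, 224, 3), n_classes=3):
        inp = layers.Input(shape=input_shape)

        # Block 1: two standard convolutions followed by max-pooling.
        x = layers.Conv2D(16, 3, padding="same", activation="relu")(inp)
        x = layers.Conv2D(16, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)

        # Six depth-wise separable blocks with batch normalization;
        # dropout (0.2) starts after the third block, as in Table 1.
        for filters, drop in [(32, 0.0), (64, 0.0), (128, 0.2),
                              (256, 0.2), (256, 0.2), (512, 0.2)]:
            x = layers.SeparableConv2D(filters, 3, padding="same",
                                       activation="relu")(x)
            x = layers.SeparableConv2D(filters, 3, padding="same",
                                       activation="relu")(x)
            x = layers.BatchNormalization()(x)
            x = layers.MaxPooling2D(2)(x)
            if drop:
                x = layers.Dropout(drop)(x)

        # Classifier head with decreasing dropout (0.7 -> 0.2).
        x = layers.Flatten()(x)
        for units, drop in [(512, 0.7), (128, 0.5), (64, 0.3), (32, 0.2)]:
            x = layers.Dense(units, activation="relu")(x)
            x = layers.Dropout(drop)(x)
        out = layers.Dense(n_classes, activation="softmax")(x)
        return models.Model(inp, out)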

2.3.1 Activation function

The selection of the activation function plays a key role in the convergence of a deep neural network. In our proposed solution, we use the ReLU activation function in the hidden layers of the network. ReLU outputs 0 when the input x < 0 and outputs x (a linear function) when x >= 0. In the last layer of the deep CNN model, the softmax function is used for classification. Softmax maps the output of the last layer of the network to a normalized probability distribution over the predicted output classes. The softmax function is defined as

$\sigma(z)_{i} = \dfrac{e^{z_{i}}}{\sum_{j=1}^{K} e^{z_{j}}}, \qquad i = 1, \dots, K$ (8)

where $z$ is the output vector of the last layer and $K$ defines the number of classes to be predicted.

2.3.2 Loss function

As we have a multi-class classification problem and the labels are mutually exclusive, i.e., each sample belongs to only one class at a time, we use the sparse categorical cross entropy loss function, a variant of the cross entropy loss. The basic difference between the two loss functions lies in their practical implementation: sparse categorical cross entropy uses integer labels for computing the loss against the ground truth, unlike the one-hot encoded vectors used by categorical cross entropy. This makes it memory efficient, saves computation in terms of log operations, and results in faster execution. The sparse categorical cross entropy loss function is defined as

$L(\theta) = -\dfrac{1}{N} \sum_{i=1}^{N} \log \hat{y}^{(i)}_{y^{(i)}}$ (9)

where $\theta$ represents the model parameters, $y^{(i)} \in \{0, 1, \dots, K-1\}$ denotes the true label of sample $i$ as an integer, $\hat{y}^{(i)}_{k}$ denotes the probability predicted by the model for class $k$ of sample $i$, $N$ is the number of training samples, and $K$ denotes the total number of classes.
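The snippet below illustrates the practical difference: with sparse categorical cross entropy the integer labels are used directly, while the dense variant needs one-hot vectors, yet both produce the same loss value (the numbers here are made-up examples):

    import numpy as np
    import tensorflow as tf

    y_true_int = np.array([0, 2, 1])             # integer class labels
    y_pred = np.array([[0.8, 0.1, 0.1],          # softmax outputs
                       [0.1, 0.2, 0.7],
                       [0.2, 0.6, 0.2]])

    sparse = tf.keras.losses.SparseCategoricalCrossentropy()
    dense = tf.keras.losses.CategoricalCrossentropy()

    print(sparse(y_true_int, y_pred).numpy())                # ~0.3635
    print(dense(tf.one_hot(y_true_int, 3), y_pred).numpy())  # identical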

3 Dataset and Performance Measures

3.1 Dataset

The dataset used in this study consists of 1823 annotated posterior-anterior (PA) view CXR images. The Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images dataset  Kermany et al. (2018) was used for viral pneumonia and non-pneumonia (normal) cases, whereas three different datasets  P et al. (2020); Chen (2020); et.al (2020) were used for COVID-19 cases. The dataset consists of 536 images of COVID-19, 619 images of viral pneumonia, and 668 images of normal cases. The age range of COVID-19 cases in the dataset is 18-75 years. The detailed specification of images in the dataset is depicted in Tab. 2. As shown in Tab. 2, the COVID-19 images have considerable variation in height and width, with the largest maximum height and width among the classes. Figure 5 shows sample images of normal, viral pneumonia, and COVID-19 cases. As per Fig. 5, the normal CXR image depicts clear lungs without any area of abnormal opacification or pattern, the viral pneumonia image (middle) manifests a more diffuse "interstitial" pattern in both lungs, and the COVID-19 image (extreme right) depicts ground-glass opacification and consolidation in the right upper lobe and left lower lobe. For training the deep CNN model, the dataset was divided in a ratio of 80-20, where 80% of the dataset was used for training the model and 20% for testing. The distribution of images in the training and test sets is shown in Tab. 3. Before analysis, all CXR images were screened for quality control by filtering out low-quality or unreadable scans, in order to build efficient deep learning models.

Figure 5: Sample chest X-ray images of Normal, Viral Pneumonia and COVID-19 cases
Image Class Min. width Max. width Min. height Max. height
Normal 1040 2628 650 2628
COVID-19 240 4095 237 4095
Viral Pneumonia 384 2304 127 2304
Table 2: Detailed specification of images in the dataset (dimensions in pixels)
Image Class Training Set Test Set
Normal 534 134
COVID-19 429 107
Viral pneumonia 495 124
Total 1458 365
Table 3: Distribution of images in training and test set for 3-class problem

Figure 6 demonstrates the separability among the different classes, i.e., Normal, COVID-19, and viral pneumonia, using a t-SNE plot of the normalized input image data.

Figure 6: t-SNE plot depicting separability among different classes of input image data in the dataset

3.2 Performance Measures

For evaluating the performance of the proposed solution, several performance metrics are chosen: sensitivity, specificity, precision, F1-score, accuracy, average area under the curve (AUC), and Cohen's kappa score. First, the confusion matrix is plotted based on the proposed model's predictions to assess the number of instances correctly classified and misclassified by the model. The confusion matrix consists of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). True positives (TP) are the disease cases correctly classified by the model; true negatives (TN) are the normal cases correctly classified as normal; false positives (FP) are the normal cases misclassified as disease; and false negatives (FN) are the disease cases misclassified as normal. Sensitivity, specificity, precision, F1-score, accuracy, and Cohen's kappa are defined as follows:

$\mathrm{Sensitivity} = \dfrac{TP}{TP + FN}, \quad \mathrm{Specificity} = \dfrac{TN}{TN + FP}, \quad \mathrm{Precision} = \dfrac{TP}{TP + FP},$

$\mathrm{F1\mbox{-}score} = \dfrac{2 \cdot \mathrm{Precision} \cdot \mathrm{Sensitivity}}{\mathrm{Precision} + \mathrm{Sensitivity}}, \quad \mathrm{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}, \quad \kappa = \dfrac{p_{o} - p_{e}}{1 - p_{e}}$ (10)

where $p_{o}$ represents the relative observed agreement between the ground truth and the model, and $p_{e}$ represents the expected agreement between the ground truth and the model by chance, which determines the superiority of the model over a simple random guess based on the distribution of each class.
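All of these metrics are available in scikit-learn (which this work already uses for its classification report); a short sketch on hypothetical integer-label predictions:

    import numpy as np
    from sklearn.metrics import (classification_report, cohen_kappa_score,
                                 confusion_matrix)

    # Hypothetical labels: 0 = Normal, 1 = COVID-19, 2 = Viral pneumonia.
    y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
    y_pred = np.array([0, 0, 1, 1, 2, 2, 0, 1])

    print(confusion_matrix(y_true, y_pred))
    print(classification_report(
        y_true, y_pred,
        target_names=["Normal", "COVID-19", "Viral pneumonia"]))
    print("Cohen's kappa:", cohen_kappa_score(y_true, y_pred))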

The receiver operating characteristic (ROC) curve is the graph of the True Positive Rate (TPR) against the False Positive Rate (FPR), which are defined as

$\mathrm{TPR} = \dfrac{TP}{TP + FN}, \qquad \mathrm{FPR} = \dfrac{FP}{FP + TN}$ (11)

The average Area Under the ROC Curve (AUC) measures the two-dimensional area under the ROC curve from point (0,0) to (1,1). For the multi-class problem, the AUC is computed using the multi-class extension proposed in Tang et al. (2011).
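In scikit-learn, a one-vs-rest macro average is a common way to approximate a multi-class AUC (not necessarily the exact extension of Tang et al. cited above); `y_score` holds hypothetical softmax probabilities:

    import numpy as np
    from sklearn.metrics import roc_auc_score

    y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
    y_score = np.array([[0.9, 0.05, 0.05], [0.8, 0.1, 0.1],
                        [0.1, 0.8, 0.1],   [0.2, 0.7, 0.1],
                        [0.1, 0.1, 0.8],   [0.05, 0.15, 0.8],
                        [0.5, 0.1, 0.4],   [0.1, 0.6, 0.3]])

    print(roc_auc_score(y_true, y_score, multi_class="ovr", average="macro"))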

Furthermore, Grad-CAM maps and Local Interpretable Model-agnostic Explanations (LIME) visualization methods are used for interpretability of the trained neural network, to gain insight into the features on which the network bases its decisions.

4 Experimental results

4.1 Experimental settings

In this paper, the Keras library with TensorFlow backend is used for building the deep CNN model. Experiments were run on a Linux operating system with an Intel Xeon E3-1225 CPU at 3.3 GHz and 8 GB of RAM. For evaluating the proposed method, a train-cross-validation-test scheme is used: the deep CNN model is trained on the training set, 5-fold cross-validation is used for tuning the hyper-parameters of the model, and the test set is used for the final evaluation. The model is trained with a mini-batch size of 8, using the Adam optimizer with a weight decay equal to the initial learning rate divided by the total number of epochs. The initial learning rate is 0.001, and the maximum number of epochs for training the network is 100.
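These settings translate into the following Keras sketch, reusing the `build_covidlite` function sketched in Section 2.3; `X_train` and `y_train` are hypothetical placeholders for the preprocessed images and integer labels, and the legacy `decay` argument of the Keras Adam optimizer implements the time-based weight decay described above:

    from tensorflow.keras.optimizers import Adam

    EPOCHS = 100     # maximum training epochs
    INIT_LR = 1e-3   # initial learning rate
    BATCH_SIZE = 8   # mini-batch size

    model = build_covidlite()
    model.compile(optimizer=Adam(learning_rate=INIT_LR, decay=INIT_LR / EPOCHS),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(X_train, y_train, batch_size=BATCH_SIZE, epochs=EPOCHS,
              validation_split=0.2)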

4.2 Results

In this section, we demonstrate the performance of the proposed method using different performance metrics. Table 4 shows the summary of the 5-fold cross-validation accuracy of the proposed COVIDLite method. As shown in Table 4, the average 5-fold cross-validation accuracy for binary classification is 98.02%, slightly higher than the multi-class classification accuracy of 97.12%. Table 5 demonstrates the class-wise performance of the proposed method for the multi-class classification problem. The macro and weighted average results were calculated using the classification_report function of the sklearn library in Python; the macro and weighted average F1-scores of the model are both 96%. As shown in Table 5, the model is highly specific and highly sensitive in predicting COVID-19 cases.

Number of Folds Accuracy (2-class) Accuracy (3-class)
Fold 1 98.26% 94.52%
Fold 2 99.48% 96.23%
Fold 3 98.96% 96.58%
Fold 4 100.00% 98.97%
Fold 5 92.71% 99.31%
Average 98.02% (± 2.69%) 97.12% (± 1.79%)
Table 4: Cross-validation performance of the proposed method for the 2-class vs 3-class problem
Parameters Normal COVID-19 Viral Pneumonia
Precision 96% 98% 95%
Sensitivity 97% 97% 95%
Specificity 99.11% 99.18% 94.58%
F1-Score 97% 98% 95%
Class error (95% CI) (0.00103, 0.05866) (0.00324, 0.05930) (0.01061, 0.08614)
AUC 0.99 1.0 1.0
Table 5: Class-wise performance of the proposed method for 3-class problem

In Table 6, we present the performance summary of the proposed COVIDLite model with and without the pre-processing step, i.e., white balance followed by CLAHE, for both the 2-class and 3-class scenarios. As per the table, the proposed method with pre-processing yields an improvement of approximately 1% in accuracy, precision, and F1-score, and approximately 2% in Cohen's kappa score for the 3-class problem. For binary classification, an improvement of approximately 1% is observed in accuracy, precision, and specificity, and an increase of approximately 2% in Cohen's kappa score. For both the binary and multi-class problems, a significant increase in the performance measures is observed after applying white balance followed by CLAHE, and the 95% confidence intervals are narrower in comparison to the method without the pre-processing step.

Parameters | 2-class (no WB+CLAHE) | 2-class (proposed) | 3-class (no WB+CLAHE) | 3-class (proposed)
Accuracy | 98.75 | 99.58 | 95.34 | 96.43
Precision | 99.00 | 100.00 | 96.00 | 97.00
Sensitivity | 99.00 | 99.58 | 96.00 | 96.00
Specificity | 98.25 | 99.34 | 97.79 | 97.89
F1-Score | 99.00 | 99.79 | 95.00 | 96.00
Class error (95% CI) | (0.00155, 0.02643) | (0.0067, 0.02103) | (0.02709, 0.07152) | (0.0165, 0.0546)
Kappa | 0.9748 | 0.9916 | 0.9299 | 0.9463
AUC | 1.0 | 1.0 | 0.99 | 0.99
Table 6: Performance summary of proposed method with and without WB+CLAHE for 2-class vs 3-class problem

Figure 7 shows the confusion matrices based on the model's predictions for the binary and multi-class problems. In the multi-class problem, the model produced 4 false positives, of which one belongs to COVID-19 and three to viral pneumonia. Due to the overlapping patterns between COVID-19 and viral pneumonia, the model predicted 3 COVID-19 cases as viral pneumonia, whereas one viral pneumonia case was predicted as COVID-19. Further, in the case of viral pneumonia, five cases were predicted as normal while one case was predicted as COVID-19. In the case of COVID-19, the model produced three false negatives, all of which were predicted as viral pneumonia. In binary classification, the model produced only 1 false negative, which shows that the model is highly sensitive, precise, and specific in predicting COVID-19 cases.

Figure 7: Confusion matrix of proposed method for a) 2-class b) 3-class

In Fig. 8, we present the ROC curves for both the binary and multi-class classification problems. In the multi-class problem, the average area under the curve (AUC) for class 0 (normal) is 0.99, for class 1 (COVID-19) is 1.0, and for class 2 (viral pneumonia) is 0.99. In binary classification, the AUC for both the COVID-19 and normal classes is 1.0.

Figure 8: ROC curve of proposed method COVIDLite a) 2-class b) 3-class problem

Further, we show the LIME  Ribeiro et al. (2016) maps, saliency maps, and Grad-CAM  Selvaraju et al. (2017) visualizations of misclassified instances of COVIDLite without pre-processing for viral pneumonia and COVID-19 patients in Figure 9, and of correctly classified instances of COVIDLite with pre-processing (white balance + CLAHE) in Figure 10. The most common CXR findings for COVID-19 include lung consolidation and ground-glass opacities  Jacobi et al. (2020). In addition, COVID-19 and other viral pneumonias typically cause lung opacities in more than one lobe  Jacobi et al. (2020). Earlier researchers noted that COVID-19 patients present air-space disease with a bilateral lower lung distribution  Wong et al. (2020).

As shown in Figure 9, without pre-processing the model relied on noisy features and wrongly predicted both the viral pneumonia and COVID-19 cases as normal. Figures 9 1a) and 9 2a) show the LIME maps of the model's predictions, outlining localized regions in the CXR images, while Figs. 9 1c) and 9 2c) show the Grad-CAM heatmaps highlighting the upper region of the CXR. The saliency maps highlight the lower and upper right corners of the CXR in Fig. 9 1b) and the upper right corner in Fig. 9 2b), instead of focusing on the ground-glass opacity and bilateral lower-zone pneumonia on the CXR in the COVID-19 case, or on the diffuse interstitial pattern indicating right-lung inflammation in the viral pneumonia case. Thus, the model without pre-processing was unable to detect the core features for classifying viral and COVID-19 pneumonia, resulting in incorrect predictions. In contrast, the model with pre-processing, i.e., white balance followed by CLAHE, correctly highlighted the region affected by pneumonia, i.e., left lung inflammation, as shown in Figs. 10 1a), 10 1b), and 10 1c), and accurately highlighted the peripheral left mid-to-lower lung opacities and ground-glass opacity (generally visible on chest CT) in the COVID-19 case, as shown in Figs. 10 2a), 10 2b), and 10 2c), respectively.

Figure 9: Visualization of incorrect predictions of the proposed method without pre-processing (white balance + CLAHE). Images 1 and 2 show the original CXRs of viral pneumonia and COVID-19, images 1.a) and 2.a) show the LIME maps, images 1.b) and 2.b) show the saliency maps, and images 1.c) and 2.c) show the Grad-CAM heatmaps of the viral pneumonia and COVID-19 patients, respectively.
Figure 10: Visualization of correct predictions of the proposed method with pre-processing (white balance + CLAHE). Images 1 and 2 show the original CXRs of viral pneumonia and COVID-19, images 1.a) and 2.a) show the LIME maps, images 1.b) and 2.b) show the saliency maps, and images 1.c) and 2.c) show the Grad-CAM heatmaps of the viral pneumonia and COVID-19 patients, respectively.

5 Discussion

This section compares the proposed method with recent state-of-the-art methods applied for the detection of COVID-19 using radiology images, i.e., CXR and chest CT. In most studies, researchers used Dr. Cohen's image collection dataset  P et al. (2020), the first open-source dataset available for researchers to extract useful patterns from CXR and chest CT images for the detection of COVID-19.

Apostolopoulos and Mpesiana (2020) proposed the transfer-learning-based pre-trained network VGG19 for detecting COVID-19 from CXR images. They used 224 CXR images of COVID-19, 700 images of viral pneumonia, and 504 images of normal or healthy cases, and their model achieved an accuracy of 93.48% for the 3-class problem.  Wang and Wong (2020) used a light-weight deep learning architecture named COVID-Net, trained on 13,975 CXR images across 13,870 patients, and achieved a test accuracy of 93.3%. They employed a projection-expansion-projection-extension (PEPX) design pattern comprising 1 x 1 convolutions at the beginning and end of the architecture, with depth-wise convolutions in the middle.  Xu et al. (2020) proposed a combination of the pre-trained ResNet network and a location attention method, achieving an accuracy of 86.7% with 219 COVID-19, 224 viral pneumonia, and 175 normal chest CT images.  Ozturk et al. (2020) developed a deep CNN model named DarkCovidNet for the detection of COVID-19 from CXR images, building on the DarkNet-19  Redmon and Farhadi (2016) architecture and achieving an accuracy of 87.02% for the multi-class and 98.08% for the 2-class problem.  Abbas et al. proposed a class decomposition mechanism named DeTraC for investigating class boundaries with the pre-trained ResNet18 network for the detection of COVID-19 from 196 CXR images; they further used PCA for feature-space dimensionality reduction, achieving a higher accuracy of 95.12%.  Luz et al. (2020) proposed two variants of the EfficientNet deep learning model, hierarchical and flat. In the hierarchical model, one classifier at the root node of the tree classifies normal and viral pneumonia cases, and a second classifier at a higher level further classifies COVID-19 and viral pneumonia cases. This approach resulted in a lower accuracy of 93.50% compared to the flat model, which attained an overall test accuracy of 93.9%. Both the flat EfficientNet B0 and the hierarchical EfficientNet B3 are larger in size and parameters, with 5,330,564 parameters (21 MB) and 12,320,528 parameters (48 MB), respectively.

For the binary classification problem, researchers have used different combinations of transfer learning approaches.  Narin et al. (2020) employed three pre-trained networks (ResNet50, InceptionV3, and InceptionResNetV2), of which ResNet50 outperformed the other variants with a higher accuracy of 98%; they used 50 CXR images of COVID-19 positive patients and 50 COVID-19 negative images.  Sethy and Behera (2020) used ResNet50 for extracting deep features in combination with a support vector machine (SVM) classifier; however, they used a very small dataset of 50 CXR images comprising 25 COVID-19 positive and 25 COVID-19 negative images.  Song et al. (2020) employed a combination of attention and a Feature Pyramid Network (FPN) to extract the top-K features of chest CT images using the deep pre-trained ResNet network, achieving an accuracy of 86% with 777 COVID-19 images and 708 healthy images.  Wang et al. (2020) proposed a modified Inception network (M-Inception) to detect COVID-19 from chest CT images, with an accuracy of 82.90%.  Zheng et al. (2020) developed a 3-dimensional CNN model with the pre-trained U-Net using 313 chest CT images of COVID-19 and 229 images of healthy cases, reporting an accuracy of 90.8%.  Rahimzadeh and Attar (2020) concatenated the pre-trained Xception and ResNet50V2 networks for the detection of COVID-19 and viral pneumonia using 15,085 CXR images; as they combined two complex architectures, the resulting model was computationally intensive, with a model size of 560 MB, which makes it unfavorable for practical real-time predictions. Finally,  Khan et al. (2020) proposed a deep CNN model named CoroNet, which uses local receptive fields and weight sharing for better performance, attaining an overall test accuracy of 95% for the 3-class classification task. Their CNN model is significantly deeper, with a large number of trainable parameters (33,915,436), which increases the model size and makes their method unfavorable for practical deployment as a mobile web app for real-time predictions.

In this research, we have developed a deep CNN model for the detection of COVID-19 induced pneumonia using 1823 CXR images. The proposed method, named COVIDLite, is significantly lighter, with 1,019,330 trainable parameters and a size of 8.4 MB, making it favorable for integration with mobile devices for real-time predictions. Our method attained an accuracy of 99.58% for binary classification and 96.43% for multi-class classification. We found that the proposed COVIDLite method outperformed recent state-of-the-art methods employed for the detection of COVID-19 from radiology images in terms of accuracy, as shown in Table 7.

Method | Image Type | Image distribution | Accuracy (%)
VGG19 (Apostolopoulos and Mpesiana, 2020) | CXR | 224 COVID-19, 700 Pneumonia, 504 Healthy | 93.48
COVID-Net (Wang and Wong, 2020) | CXR | 53 COVID-19(+), 5526 COVID-19(-), 8066 Healthy | 92.40
ResNet + Local Attention (Xu et al., 2020) | CT | 219 COVID-19, 224 Pneumonia, 75 Healthy | 86.70
DarkCovidNet (Ozturk et al., 2020) | CXR | 125 COVID-19, 500 Pneumonia, 500 No-Findings | 87.02
DeTraC ResNet18 (Abbas et al.) | CXR | 105 COVID-19, 11 SARS, 80 Normal | 95.12
Flat EfficientNet (Luz et al., 2020) | CXR | 8066 Normal, 5521 Pneumonia, 183 COVID-19 | 93.93
Hierarchical EfficientNet B3 (Luz et al., 2020) | CXR | 8066 Normal, 5521 Pneumonia, 183 COVID-19 | 93.50
Deep CNN ResNet-50 (Narin et al., 2020) | CXR | 50 COVID-19(+), 50 COVID-19(-) | 98.00
DRE-Net (Song et al., 2020) | CT | 777 COVID-19(+), 708 Healthy | 86.00
M-Inception (Wang et al., 2020) | CT | 195 COVID-19(+), 258 COVID-19(-) | 82.90
U-Net + 3D Deep Network (Zheng et al., 2020) | CT | 313 COVID-19(+), 229 COVID-19(-) | 90.80
Xception + ResNet50V2 (Rahimzadeh and Attar, 2020) | CXR | 180 COVID-19(+), 6054 Pneumonia, 8851 Normal | 91.40
CoroNet (Khan et al., 2020) | CXR | 310 Normal, 327 Pneumonia, 284 COVID-19 | 95.00
Proposed Method | CXR | 536 COVID-19, 619 Viral pneumonia, 668 Normal | 96.43 (3-class), 99.58 (2-class)
Table 7: Comparison of proposed method with recent state-of-the-art methods for COVID-19 detection using radiology images

6 Conclusion

In this paper, we demonstrated an effective deep learning model for detecting COVID-19 from CXR images with higher accuracy. Our proposed method performs both binary and multi-class classification, with accuracies of 99.58% and 96.43%, respectively. The proposed method captures low-level feature maps by enhancing the visibility of CXR images with advanced preprocessing techniques, which facilitates recognizing intricate patterns in medical images at a level comparable with experienced radiologists. As future work, we will further enhance our method's performance by including the lateral view of CXR images in our training data, since in some cases the frontal view of CXR images alone is not sufficient for diagnosing pneumonia. Further, the proposed preprocessing with the DSCNN method can be extended to detect more critical diseases such as lung cancer, fibrosis, tuberculosis, and pneumothorax, which in turn can assist radiologists in the remotest parts of the world, where medical resources are limited.

References

  • A. Abbas, M. Abdelsamea, and M. Gaber. Classification of covid-19 in chest x-ray images using DeTraC deep convolutional neural network. medRxiv. Cited by: Table 7, §5.
  • S. P. Adhikari, S. Meng, Y. Wu, Y. Mao, R. Ye, Q. Wang, C. Sun, S. Sylvia, S. Rozelle, H. Raat, et al. (2020) Epidemiology, causes, clinical manifestation and diagnosis, prevention and control of coronavirus disease (covid-19) during the early outbreak period: a scoping review. Infectious diseases of poverty 9 (1), pp. 1–12. Cited by: §1.
  • I. D. Apostolopoulos and T. A. Mpesiana (2020) Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks. Physical and Engineering Sciences in Medicine, pp. 1. Cited by: Table 7, §5.
  • Z. Chen (2020) Cited by: §3.1.
  • P. Chhikara, P. Singh, P. Gupta, and T. Bhatia (2020) Deep convolutional neural network with transfer learning for detecting pneumonia on chest x-rays. In Advances in Bioinformatics, Multimedia, and Electronics Circuits and Signals, pp. 155–168. Cited by: §1.
  • F. Chollet (2017) Xception: deep learning with depthwise separable convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1251–1258. Cited by: §2.1, §2.3.
  • J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei (2009) Imagenet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. Cited by: §2.3.
  • D. S. et.al (2020) Cited by: §3.1.
  • S. Hazra and A. Santra (2018) Robust gesture recognition using millimetric-wave radar system. IEEE sensors letters 2 (4), pp. 1–4. Cited by: §1.
  • E. E. Hemdan, M. A. Shouman, and M. E. Karar (2020) Covidx-net: a framework of deep learning classifiers to diagnose covid-19 in x-ray images. arXiv preprint arXiv:2003.11055. Cited by: §1.
  • S. Ioffe and C. Szegedy (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167. Cited by: §2.3.
  • A. Jacobi, M. Chung, A. Bernheim, and C. Eber (2020) Portable chest x-ray in coronavirus disease-19 (covid-19): a pictorial review. Clinical Imaging 64, pp. . External Links: Document Cited by: §4.2.
  • J. P. Kanne, B. P. Little, J. H. Chung, B. M. Elicker, and L. H. Ketai (2020) Essentials for radiologists on covid-19: an update—radiology scientific expert panel. Radiological Society of North America. Cited by: §1.
  • D. Kermany, K. Zhang, and M. Goldbaum (2018) Large dataset of labeled optical coherence tomography (oct) and chest x-ray images. Mendeley Data, v3, http://dx.doi.org/10.17632/rscbjbr9sj.3. Cited by: §3.1.
  • A. I. Khan, J. L. Shah, and M. Bhat (2020) Coronet: a deep neural network for detection and diagnosis of covid-19 from chest x-ray images. arXiv preprint arXiv:2004.04931. Cited by: Table 7, §5.
  • W. Kong and P. P. Agarwal (2020) Chest imaging appearance of covid-19 infection. Radiology: Cardiothoracic Imaging 2 (1), pp. e200028. Cited by: §1.
  • E. Y. Lee, M. Ng, and P. Khong (2020) COVID-19 pneumonia: what has ct taught us?. The Lancet Infectious Diseases 20 (4), pp. 384–385. Cited by: §1.
  • Y. Li and L. Xia (2020) Coronavirus disease 2019 (covid-19): role of chest ct in diagnosis and management. American Journal of Roentgenology, pp. 1–7. Cited by: §1.
  • M. Loey, F. Smarandache, and N. E. M Khalifa (2020) Within the lack of chest covid-19 x-ray dataset: a novel detection model based on gan and deep transfer learning. Symmetry 12 (4), pp. 651. Cited by: §1.
  • E. Luz, P. L. Silva, R. Silva, and G. Moreira (2020) Towards an efficient deep learning model for covid-19 patterns detection in x-ray images. arXiv preprint arXiv:2004.05717. Cited by: Table 7, §5.
  • A. Narin, C. Kaya, and Z. Pamuk (2020) Automatic detection of coronavirus disease (covid-19) using x-ray images and deep convolutional neural networks. arXiv preprint arXiv:2003.10849. Cited by: Table 7, §5.
  • T. Ozturk, M. Talo, E. A. Yildirim, U. B. Baloglu, O. Yildirim, and U. R. Acharya (2020) Automated detection of covid-19 cases using deep neural networks with x-ray images. Computers in Biology and Medicine, pp. 103792. Cited by: Table 7, §5.
  • C. J. P, P. Morrison, and D. L (2020) COVID-19 image data collection. arXiv preprint arXiv:2003.11597. External Links: Link Cited by: §3.1, §5.
  • J. W. Picone (1993) Signal modeling techniques in speech recognition. Proceedings of the IEEE 81 (9), pp. 1215–1247. Cited by: §1.
  • S. M. Pizer, E. P. Amburn, J. D. Austin, R. Cromartie, A. Geselowitz, T. Greer, B. ter Haar Romeny, J. B. Zimmerman, and K. Zuiderveld (1987) Adaptive histogram equalization and its variations. Computer vision, graphics, and image processing 39 (3), pp. 355–368. Cited by: §2.2.
  • S. M. Pizer, R. E. Johnston, J. P. Ericksen, B. C. Yankaskas, and K. E. Muller (1990) Contrast-limited adaptive histogram equalization: speed and effectiveness. In Proceedings of the First Conference on Visualization in Biomedical Computing, Atlanta, Georgia, May 22–25, 1990, pp. 337. Cited by: §2.2.
  • M. Rahimzadeh and A. Attar (2020) A modified deep convolutional neural network for detecting covid-19 and pneumonia from chest x-ray images based on the concatenation of xception and resnet50v2. Informatics in Medicine Unlocked, pp. 100360. External Links: Document Cited by: Table 7, §5.
  • J. Redmon and A. Farhadi (2016) YOLO9000: better, faster, stronger. arXiv preprint arXiv:1612.08242. Cited by: §5.
  • M. T. Ribeiro, S. Singh, and C. Guestrin (2016) "Why should I trust you?": explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1135–1144. Cited by: §4.2.
  • R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra (2017) Grad-cam: visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision, pp. 618–626. Cited by: §4.2.
  • P. K. Sethy and S. K. Behera (2020) Detection of coronavirus disease (covid-19) based on deep features. Preprints 2020030300, pp. 2020. Cited by: §1, §5.
  • H. Shi, X. Han, N. Jiang, Y. Cao, O. Alwalid, J. Gu, Y. Fan, and C. Zheng (2020) Radiological findings from 81 patients with covid-19 pneumonia in wuhan, china: a descriptive study. The Lancet Infectious Diseases. Cited by: §1.
  • Y. Song, S. Zheng, L. Li, X. Zhang, X. Zhang, Z. Huang, J. Chen, H. Zhao, Y. Jie, R. Wang, et al. (2020) Deep learning enables accurate diagnosis of novel coronavirus (covid-19) with ct images. medRxiv. Cited by: Table 7, §5.
  • N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov (2014) Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15 (1), pp. 1929–1958. Cited by: §2.3.
  • O. Stephen, M. Sain, U. J. Maduh, and D. Jeong (2019) An efficient deep learning approach to pneumonia classification in healthcare. Journal of healthcare engineering 2019. Cited by: §1.
  • C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna (2016) Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818–2826. Cited by: §2.3.
  • K. Tang, R. Wang, and T. Chen (2011) Towards maximizing the area under the roc curve for multi-class classification problems. In Twenty-Fifth AAAI Conference on Artificial Intelligence. Cited by: §3.2.
  • B. Udugama, P. Kadhiresan, H. N. Kozlowski, A. Malekjahani, M. Osborne, V. Y. Li, H. Chen, S. Mubareka, J. B. Gubbay, and W. C. Chan (2020) Diagnosing covid-19: the disease and tools for detection. ACS nano 14 (4), pp. 3822–3835. Cited by: §1.
  • L. Wang and A. Wong (2020) COVID-net: a tailored deep convolutional neural network design for detection of covid-19 cases from chest radiography images. arXiv preprint. Cited by: Table 7, §5.
  • S. Wang, B. Kang, J. Ma, X. Zeng, M. Xiao, J. Guo, M. Cai, J. Yang, Y. Li, X. Meng, et al. (2020) A deep learning algorithm using ct images to screen for corona virus disease (covid-19). MedRxiv. Cited by: Table 7, §5.
  • [41] (2017) White Balance Algorithm with Gdal in C#. External Links: Link Cited by: §2.2.
  • [42] (2020) White Balance. External Links: Link Cited by: §2.2.
  • H. Y. F. Wong, H. Y. S. Lam, A. H. Fong, S. T. Leung, T. W. Chin, C. S. Y. Lo, M. M. Lui, J. C. Y. Lee, K. W. Chiu, T. Chung, et al. (2020) Frequency and distribution of chest radiographic findings in covid-19 positive patients. Radiology, pp. 201160. Cited by: §4.2.
  • S. Wong, Y. Yu, N. A. Ho, and R. Paramesran (2014) Comparative analysis of underwater image enhancement methods in different color spaces. In 2014 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 034–038. Cited by: §2.2.
  • X. Xie, Z. Zhong, W. Zhao, C. Zheng, F. Wang, and J. Liu (2020) Chest ct for typical 2019-ncov pneumonia: relationship to negative rt-pcr testing. Radiology, pp. 200343. Cited by: §1.
  • X. Xu, X. Jiang, C. Ma, P. Du, X. Li, S. Lv, L. Yu, Y. Chen, J. Su, G. Lang, et al. (2020) Deep learning system to screen coronavirus disease 2019 pneumonia. arXiv preprint arXiv:2002.09334. Cited by: Table 7, §5.
  • Y. Yi, P. N. Lagniton, S. Ye, E. Li, and R. Xu (2020) COVID-19: what has been learned and to be learned about the novel coronavirus disease. International journal of biological sciences 16 (10), pp. 1753. Cited by: §1.
  • S. H. Yoon, K. H. Lee, J. Y. Kim, Y. K. Lee, H. Ko, K. H. Kim, C. M. Park, and Y. Kim (2020) Chest radiographic and ct findings of the 2019 novel coronavirus disease (covid-19): analysis of nine patients treated in korea. Korean journal of radiology 21 (4), pp. 494–500. Cited by: §1.
  • W. Zhao, Z. Zhong, X. Xie, Q. Yu, and J. Liu (2020) Relation between chest ct findings and clinical conditions of coronavirus disease (covid-19) pneumonia: a multicenter study. American Journal of Roentgenology 214, pp. 1–6. External Links: Document Cited by: §1.
  • C. Zheng, X. Deng, Q. Fu, Q. Zhou, J. Feng, H. Ma, W. Liu, and X. Wang (2020) Deep learning-based detection for covid-19 from chest ct using weak label. medRxiv. Cited by: Table 7, §5.