Robust Screening of COVID-19 from Chest X-ray via Discriminative Cost-Sensitive Learning

04/27/2020 ∙ by Tianyang Li, et al. ∙ 0

This paper addresses the new problem of automated screening of coronavirus disease 2019 (COVID-19) from chest X-rays, which is urgently needed to help stop the pandemic quickly. However, robust and accurate screening of COVID-19 from chest X-rays remains a globally recognized challenge because of two bottlenecks: 1) the imaging features of COVID-19 share some similarities with those of other pneumonia on chest X-rays, and 2) the misdiagnosis rate of COVID-19 is very high, and the cost of misdiagnosis is severe. While a few pioneering works have made much progress, they underestimate both bottlenecks. In this paper, we report our solution, discriminative cost-sensitive learning (DCSL), which is intended for clinical settings that need assisted screening of COVID-19 from chest X-rays. DCSL combines the advantages of fine-grained classification and cost-sensitive learning. First, DCSL develops a conditional center loss that learns a deep discriminative representation. Second, DCSL establishes score-level cost-sensitive learning that can adaptively enlarge the cost of misclassifying COVID-19 examples into other classes. DCSL is flexible enough to be applied in any deep neural network. We collected a large-scale multi-class dataset comprising 2,239 chest X-ray examples: 239 examples from confirmed COVID-19 cases, 1,000 examples from confirmed bacterial or viral pneumonia cases, and 1,000 examples from healthy people. Extensive experiments on three-class classification show that our algorithm remarkably outperforms state-of-the-art algorithms. It achieves an accuracy of 97.01% and an F1-score of 96.98%, which shows its potential for the fast large-scale screening of COVID-19.




1 Introduction

As COVID-19 continues to affect our world, automated screening of coronavirus disease 2019 (COVID-19) is urgently needed to enable large-scale screening. From the outbreak of COVID-19 in December 2019 to date, more than 2,804,796 people have been infected around the world, and more than 193,722 deaths from the virus have been recorded according to the World Health Organization. Fast and large-scale screening is necessary to cut off the source of infection. However, the rapidly growing number of COVID-19 cases is overwhelming global medical resources. Automated screening systems can speed up screening and reduce the workload of radiologists. Therefore, it is urgent to realize automated screening of COVID-19 that accelerates large-scale screening and alleviates the global shortage of medical resources.

Nowadays, medical imaging examinations, such as chest CT and X-ray, play an essential role in the diagnosis of COVID-19. Clinically, although nucleic acid detection is the gold standard, the availability, stability, and reproducibility of nucleic acid detection kits are problematic [46]. For example, the quantity of nucleic acid kits is limited in many countries or regions, which slows the screening of novel coronavirus pneumonia [53]. Many patients with novel coronavirus pneumonia cannot be tested in time and thus cannot be admitted to the hospital, which accelerates the spread of the novel virus. By contrast, medical imaging examinations can help clinicians carry out disease detection conveniently and quickly, so that patients are treated in time. Accordingly, medical imaging examinations combined with symptom observation are widely used for the early diagnosis of COVID-19 worldwide [24].

X-rays have the unique advantages of being lightweight, quick, and widely available in the screening of COVID-19. On the one hand, regular X-ray machines can be accessed in most primary hospitals where CT scanners are insufficient. Most ambulatory care facilities have deployed X-ray units as basic diagnostic imaging. Most importantly, the easily accessible X-ray examination is beneficial for fast large-scale screening. On the other hand, the radiation dose of an X-ray is a few hundredths of that of a chest CT [12, 21]. One statistic from the San Francisco Bay Area shows that hospitals add one more case of cancer for every 400 to 2,000 additional routine chest CT examinations [41]. This statistic indicates that a single chest CT examination increases the lifetime risk of cancer by 0.05% to 0.25% (that is, 1/2,000 to 1/400). Also, the X-ray examination is more economical than CT.

Therefore, this paper addresses the novel task of automated screening of COVID-19 from chest X-rays. However, robust and accurate screening faces two crucial bottlenecks. First, the imaging features of COVID-19 cases share some similarities with those of other pneumonia cases on chest X-rays. Even radiologists cannot distinguish them accurately based on chest CT without other inspection methods [46], let alone based on chest X-rays. The second bottleneck is that the misdiagnosis rate of COVID-19 is very high. A high misdiagnosis rate has a prohibitive cost: it not only delays the timely treatment of patients but also spreads the virus further, with high social costs. The cost of misdiagnosis should therefore be fully considered in the automated screening of COVID-19. Together, these limitations impede the accurate screening of COVID-19 patients from the susceptible population.

While a few pioneering works have made much progress, they neglect both crucial bottlenecks. All of them adopted common machine learning classifiers. For example, both Hassanien et al. [17] and Sethy et al. [35] used support vector machines (SVM) to screen COVID-19 based on extracted features. Farooq et al. [11], Narin et al. [32], Wang et al. [45], and others adopted popular deep neural networks with minor modifications of the network architecture. More interestingly, Zhang et al. [53] viewed this screening task as an anomaly detection problem and proposed to use existing anomaly detection techniques. In fact, however, the screening of COVID-19 is a fine-grained cost-sensitive classification problem, as mentioned before. Only from this perspective can we design a satisfactory solution. Therefore, in this study, we attempt to design an efficient solution to overcome both bottlenecks.

In this paper, we propose discriminative cost-sensitive learning (DCSL) for the robust screening of COVID-19 from chest X-rays. DCSL combines advances from both fine-grained classification and cost-sensitive learning. To overcome the bottleneck of subtle differences, we propose a conditional center loss function that learns a discriminative representation between fine-grained classes. Combined with the vanilla loss function, the conditional center loss can discover a weighted center for each class and efficiently enlarge the inter-class manifold distance while enhancing the intra-class compactness in the deep representation space. To address cost sensitivity, we propose score-level cost-sensitive learning. It introduces a score-level cost matrix that reshapes the classifier confidences by modifying the classifier output, such that COVID-19 examples have the maximum score and the other classes have a substantially lower score. Based on the domain knowledge that the costs of misclassifying COVID-19 into different classes, or of misclassifying other classes into COVID-19, are not equal, we define new score-level costs to encourage the correct classification of the COVID-19 class. We combine both advances in a deep neural network with end-to-end optimization, achieving fine-grained cost-sensitive screening of COVID-19. A series of experiments shows that our algorithm remarkably outperforms previous methods.

The contributions of this work include:

  • For the first time, we formulate the task of screening COVID-19 from chest X-rays as a fine-grained cost-sensitive classification problem. Accordingly, we propose a practical solution, discriminative cost-sensitive learning, that achieves much higher screening accuracy.

  • We propose a new conditional center loss that considers the class-conditional information when learning the center points per class. The conditional center loss successfully overcomes the bottleneck of feature similarities.

  • We propose a new score-level cost-sensitive learning that introduces a domain knowledge-based cost matrix to enlarge the cost of misclassifying COVID-19 examples into other classes. It greatly reduces the misdiagnosis rate.

The remainder of this paper is organized as follows. Section 2 presents the related works in terms of artificial intelligence assisted analysis of COVID-19 and involved methodologies. Section 3 gives in detail the proposed discriminative cost-sensitive learning. Section 4 presents detailed descriptions of collected datasets, experiment settings, and exhaustive results. Section 5 concludes this work comprehensively.

2 Related Works

This section shows related works about the automated screening of COVID-19 and involved algorithms of our work.

2.1 Automated Medical Image Analysis about COVID-19

To take part in the global fight against COVID-19, many studies have designed AI-empowered technologies for improving the efficiency of clinical diagnosis. Shi et al. [37] comprehensively summarized many emerging works, including automated screening [46, 50, 38, 23, 40, 14, 15, 22, 55], patient severity assessment [20], infection quantification [36], and infection area segmentation [6, 14]. Among them, automated screening has received the most attention, with both chest X-ray based and chest CT based works. Because 3D CT scans are spatially complex, existing CT-based works fall into three types of solutions: patch-based methods [46, 50, 38, 23], slice-based methods [40, 14, 15, 22], and a 3D CT-based method [55]. As mentioned before, CT is expensive, involves high radiation, and is less accessible, so many screening works are based on 2D chest X-rays.

We comprehensively review chest X-ray based methods as follows. Hassanien et al. [17] used a multi-level threshold segmentation algorithm to crop lung areas and adopted SVM to classify COVID-19 and normal cases based on 40 chest X-rays. Ozturk et al. [33] ensembled several feature extraction algorithms and used a stacked autoencoder with principal component analysis to make decisions. They showed that classifiers based on handcrafted features perform better than deep models on small data. Several studies applied popular deep learning techniques to the screening of COVID-19. Hemdan et al. [19] validated multiple popular deep models and demonstrated their effectiveness on this new task. Farooq et al. [11] utilized residual networks (ResNet) to validate the screening performance of COVID-19. Narin et al. [32] tested an inception architecture (Inceptionv3) on the screening task. They showed that pre-trained models are useful. The VGG16 and VGG19 networks were adopted by Hall et al. [16] and Apostolopoulos et al. [3], respectively. Khalifa et al. [26] used generative adversarial networks and a fine-tuned deep transfer learning model, achieving promising performance. Their results also confirm that chest X-ray based screening of COVID-19 has great research significance.

Moreover, several studies designed specialized solutions for the screening of COVID-19 according to the characteristics of the task. Afshar et al. [2] adopted a capsule network for handling small data. Abbas et al. [1] designed a Decompose, Transfer, and Compose (DeTraC) network based on class decomposition to enhance low-variance classifiers and give their decision boundaries more flexibility. Wang et al. [46] proposed a new deep network called COVID-Net, which consists of stacked residual blocks that make training easier and allow deeper architectures. To improve diagnostic performance, Ghosha et al. [13] used Bayesian convolutional neural networks to estimate uncertainty. More interestingly, Zhang et al. [53] viewed this screening task as an anomaly detection problem and proposed to use existing anomaly detection techniques. Specifically, they used a hybrid loss that combines a binary cross-entropy loss and a deviation loss to assign anomaly scores to COVID-19 examples. While these pioneering works have made great progress, they neglect both the cost-sensitivity and the fine-grained bottlenecks. To the best of our knowledge, this is the first work to view the screening of COVID-19 from chest X-rays as a fine-grained cost-sensitive classification problem.

2.2 Involved Methods of Our Work

2.2.1 Fine-Grained Classification

The goal of fine-grained classification is to classify data belonging to multiple subordinate categories, e.g., COVID-19, common pneumonia, and normal chest X-rays. The main difficulty of fine-grained classification is that these subordinate categories naturally exhibit small inter-class variations and large intra-class variations. The common solutions to this problem can be organized into three main paradigms: 1) localization-classification network based methods [54, 28, 47], 2) external information-based methods [42, 49, 8], and 3) end-to-end feature coding-based methods [29, 10]. Localization-classification network based methods first learn part-based detectors or a segmentation model to localize salient parts and thereby improve the final recognition accuracy. However, this paradigm needs additional part annotations. External information-based methods leverage external information, e.g., web data [42], multi-modality [49], or human-computer interactions [8].

Different from the previous two paradigms, end-to-end feature coding-based methods learn a more discriminative feature representation directly. Among them, several works specifically designed extra loss functions for learning discriminative fine-grained representations. For instance, the contrastive loss is designed to deal effectively with the relationship between paired example points [43]. Triplet loss correspondingly constructs loss functions for example triplets [34]. However, with contrastive loss and triplet loss the number of training pairs or triplets grows dramatically, which leads to slow convergence and instability. Center loss was proposed to solve this issue by minimizing only the intra-class distances of deep features [48]. This loss function learns a center for each class and efficiently pulls the deep features of the same class toward their center. However, center loss easily suffers from class imbalance. Therefore, in this paper, we propose to learn a new conditional center loss with joint balance optimization for the robust screening of COVID-19.

2.2.2 Cost-Sensitive Learning

Cost-sensitive learning is a learning method that considers the cost of misclassification, and its purpose is to minimize the total cost [30]. In the classical machine learning setting, the costs of classification errors of different classes are equal. Unfortunately, the costs are not equal in many real-world tasks. For example, in COVID-19 screening, the cost of erroneously diagnosing a COVID-19 patient as healthy may be much higher than that of mistakenly diagnosing a COVID-19 patient as having common pneumonia. Cost-sensitive learning was proposed to handle this problem and has attracted much attention from the machine learning and data mining communities [56]. Existing works on misclassification costs can be categorized into two classes: example-dependent cost [51, 52, 4] and class-dependent cost [5, 9, 31]. Example-dependent cost-based methods consider the misclassification cost of each example and require example-level annotations, which are impractical in real-world tasks. Therefore, most methods focus on class-dependent costs. Cost-sensitive learning is also suitable for handling class imbalance [27]. In this study, we introduce a score-level cost-sensitive learning approach based on an expert-provided cost matrix to improve the screening accuracy of COVID-19 from chest X-rays.

Figure 1: Overview of the proposed discriminative cost-sensitive learning (DCSL) framework. Based on a data pool of clinical X-rays, a comprehensive analysis is conducted to obtain the class-conditional information (class balance weight). During training, we randomly draw a batch of data to optimize the softmax cross-entropy and the conditional center loss with the class-conditional information. Specifically, an input image is first processed by a backbone network, which mainly includes convolutional layers (Conv) and fully-connected layers (FC). The resulting deep features are then used to minimize the conditional center loss. Score-level costs are applied to the outputs of the final output layer to get the final cost-sensitive prediction.

3 Methodology

In this section, we first introduce the necessary notations and the objective for the task of screening of COVID-19 from chest X-rays (See Section 3.1). We then present the newly-proposed discriminative cost-sensitive learning (DCSL) framework, which consists of a conditional center loss (See Section 3.2) and a score-level cost-sensitive learning approach (See Section 3.3). We finally combine the two novel modules to construct the DCSL framework for the fine-grained cost-sensitive classification problem (See Section 3.4).

3.1 Problem Setting

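The task of COVID-19 screening is under the familiar supervised learning setting where the learner receives a sample of $n$ labeled training examples $\{(x_i, y_i)\}_{i=1}^{n}$ drawn from a joint distribution $P$ defined on $\mathcal{X} \times \mathcal{Y}$, where $\mathcal{X}$ is the example set of 2D chest X-ray images and $\mathcal{Y}$ is the label set of patient conditions, such as COVID-19, common pneumonia, and others. $\mathcal{Y}$ is $\{0, 1\}$ in binary classification and $\{1, \dots, K\}$ in multi-class classification. Specifically, $x_i$ is any chest X-ray image of one patient, and $y_i$ is the label of this patient. We denote by $\widehat{P}$ the empirical distribution.

We denote by $L: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$ any loss function defined over pairs of labels, such as the 0-1 loss or the cross-entropy loss. For binary classification, we denote by $f: \mathcal{X} \to \mathbb{R}$ a scoring function, which induces a labeling function $h_f: \mathcal{X} \to \{0, 1\}$ where $h_f(x) = 1$ if $f(x) \ge 0$ and $h_f(x) = 0$ otherwise. For any distribution $P$ on $\mathcal{X} \times \mathcal{Y}$ and any labeling function $h \in \mathcal{H}$, we denote by $\epsilon_P(h) = \mathbb{E}_{(x, y) \sim P}\, L(h(x), y)$ the expected risk. Our objective is to select a hypothesis $f$ out of a hypothesis set $\mathcal{F}$ with a small expected risk on the target distribution.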

3.2 Conditional Center Loss

This section presents the newly proposed conditional center loss. To better understand its role, we first consider the softmax-based cross-entropy loss (softmax loss), which is given by

$$L_S = -\sum_{i=1}^{m} \log \frac{e^{W_{y_i}^{\top} x_i + b_{y_i}}}{\sum_{j=1}^{K} e^{W_j^{\top} x_i + b_j}},$$

where $x_i$ (with a slight abuse of notation) denotes the $i$th deep feature of the X-ray image $x_i$, belonging to the class $y_i$. $W_j$ denotes the $j$th column of the weights $W$ in the last fully-connected layer, and $b$ is the bias term. $m$ and $K$ are the size of the mini-batch and the number of classes, respectively. Intuitively, the softmax loss first computes the probability of correct classification and then takes the logarithm of this probability. Since the logarithm of a probability is never positive, a minus sign is added in front of it. While the softmax loss is good at common object recognition tasks, it cannot learn sufficiently discriminative features for fine-grained classification tasks in which significant intra-class variations exist.
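As a concrete illustration of the softmax loss described above, the following is a minimal sketch in plain Python. The function name `softmax_loss` and the list-based data layout are illustrative assumptions, not part of the original implementation (which is written in Keras).

```python
import math

def softmax_loss(features, labels, W, b):
    """Sum over the mini-batch of -log(softmax probability of the true class).

    features: list of m deep-feature vectors; labels: list of m class indices;
    W: list of K weight vectors (one per class); b: list of K biases.
    The names and the data layout are illustrative assumptions.
    """
    total = 0.0
    for x, y in zip(features, labels):
        # Logit of class j is the dot product W_j . x plus the bias b_j.
        logits = [sum(wj * xj for wj, xj in zip(w, x)) + bj for w, bj in zip(W, b)]
        # Subtract the largest logit before exponentiating for numerical stability.
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
        # Negative log-probability of the correct class.
        total += -(logits[y] - log_norm)
    return total
```

With all-zero weights and three classes, every class has probability 1/3, so a single example contributes a loss of log 3.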

Based on our analysis of the characteristics of screening COVID-19 from chest X-rays, we view it as a fine-grained classification problem. As discussed before, the center loss is flexible, easy to implement, and stable. Therefore, we initially leverage the center loss to learn more discriminative features. The goal of center loss is to directly improve the intra-class compactness, which is conducive to discriminative feature learning. Center loss is widely embedded between the fully connected layers of deep neural networks to decrease the intra-class variations in the representation space (the dimension is two). It commonly appears together with the softmax loss and is used for face recognition and fine-grained classification. With $x_i$ the deep feature of the $i$th example in a mini-batch of size $m$ and $y_i$ its class, the center loss function is formulated as follows.

$$L_C = \frac{1}{2} \sum_{i=1}^{m} \left\| x_i - c_{y_i} \right\|_2^2,$$

where $c_{y_i}$ denotes the $y_i$th class center of deep features. The center is updated according to the mini-batch data, and it is computed by averaging the features of the corresponding class. A scalar $\alpha$ is used to control the learning rate of the centers. The update equation of $c_j$ is represented as follows.

$$\Delta c_j = \frac{\sum_{i=1}^{m} \delta(y_i = j)\,(c_j - x_i)}{1 + \sum_{i=1}^{m} \delta(y_i = j)}, \qquad c_j^{t+1} = c_j^{t} - \alpha\, \Delta c_j^{t},$$

where $\delta(\cdot)$ is an indicator function with $\delta = 1$ if the condition is satisfied and $\delta = 0$ if not, and $t$ denotes the training iteration. The final joint loss function is given by

$$L = L_S + \lambda L_C,$$

where $\lambda$ is used for balancing the joint loss function.
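The center loss and the mini-batch center update described above can be sketched as follows. This is a minimal plain-Python sketch of the standard center loss [48]; the function names `center_loss` and `update_centers` and the list-based layout are illustrative assumptions.

```python
def center_loss(features, labels, centers):
    """Half the sum of squared distances between each feature and its class center."""
    return 0.5 * sum(
        sum((xj - cj) ** 2 for xj, cj in zip(x, centers[y]))
        for x, y in zip(features, labels)
    )

def update_centers(features, labels, centers, alpha):
    """One mini-batch update of every class center with center learning rate alpha."""
    new_centers = []
    for j, c in enumerate(centers):
        # Features of the mini-batch that belong to class j.
        members = [x for x, y in zip(features, labels) if y == j]
        # Per dimension: sum of (center - feature) over class-j examples,
        # divided by (1 + number of class-j examples).
        delta = [
            sum(cd - x[d] for x in members) / (1 + len(members))
            for d, cd in enumerate(c)
        ]
        # Move the center against delta; a class absent from the batch is unchanged.
        new_centers.append([cd - alpha * dd for cd, dd in zip(c, delta)])
    return new_centers
```

The `1 +` in the denominator keeps the update defined when a class does not appear in the mini-batch, in which case its center stays where it is.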

While the joint loss function has achieved great success in practice, it quickly loses effectiveness under class imbalance. In other words, according to our observations, the center loss does not work well in the screening task of COVID-19. After an in-depth analysis, we found that the learned center points are unrepresentative. To handle this problem, we propose a conditional center loss that considers the class-conditional information when updating the center points and optimizing the center loss. We denote by $w_j$ the weight of the $j$th class, which is computed as the ratio between the number of training examples of the $j$th class and the total number of training examples. The update equation of center points is reformulated by


Meanwhile, we found that embedding the class-conditional information into the center loss can significantly improve the overall screening accuracy. The reason is that it makes the center loss learn more balanced center points and thus enhances the intra-class compactness to obtain discriminative deep features. Accordingly, the conditional center loss with softmax loss is represented as follows.


The conditional center loss can effectively handle the problem that imaging features of COVID-19 share some similarities with other pneumonia on chest X-rays by enlarging their feature distance in high-level representation space.
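To make the class-conditional weight concrete, the sketch below computes the class weights from the class counts of the collected dataset, exactly as the ratio defined above. How the weight enters the center loss is not recoverable from the text here, so `weighted_center_loss` is only one hypothetical way to embed it (scaling each squared distance by the weight of its class); both function names are assumptions.

```python
def class_weights(counts):
    """Weight of class j = (training examples of class j) / (all training examples)."""
    total = sum(counts)
    return [c / total for c in counts]

def weighted_center_loss(features, labels, centers, weights):
    """ASSUMPTION: each squared distance is scaled by the weight of its class.

    This is a hypothetical form used for illustration, not the paper's equation.
    """
    return 0.5 * sum(
        weights[y] * sum((xj - cj) ** 2 for xj, cj in zip(x, centers[y]))
        for x, y in zip(features, labels)
    )

# Class counts of the collected dataset: healthy, COVID-19, other pneumonia.
w = class_weights([1000, 239, 1000])
```

For the collected dataset, the COVID-19 class receives a weight of 239/2,239, roughly 0.107, which quantifies how under-represented it is relative to the other two classes.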

3.3 Score-Level Cost-Sensitive Learning

The goal of cost-sensitive learning is to correctly classify examples from essential classes such as COVID-19. We propose a score-level cost-sensitive learning module that can efficiently learn robust feature representations for both the critical and the common classes. It can thus improve the accuracy on COVID-19 without unduly sacrificing overall accuracy. Generally speaking, we introduce a handcrafted cost matrix whose design is based on clinical expert experience and then incorporate it after the output layer of deep neural networks. In this manner, we can directly modify the learning process to incorporate class-dependent costs during training and testing, without affecting the training and testing time of the original network. We will show that the proposed algorithm works efficiently for the screening of COVID-19 and can be inserted into any deep neural network. In this section, we first present traditional cost-sensitive learning, which usually adds the cost matrix to the loss function. We then detail the score-level cost-sensitive learning with an advanced learning strategy.

Formally, we denote by $\xi$ the cost matrix, whose diagonal represents the benefit of a correct prediction. We also denote by $\xi_{p,q}$ the misclassification cost of classifying an example belonging to a class $p$ into a different class $q$. The expected risk of predicting class $q$ for an example $x$, defined on the target distribution, is given by

$$R(q \mid x) = \sum_{p} \xi_{p,q}\, P(p \mid x),$$

where $P(p \mid x)$ is the posterior probability over all possible classes given an example $x$. The goal of a classifier is to minimize the expected risk, which, however, cannot be computed in practice. Thus, we use the empirical distribution to minimize the empirical risk as follows.

$$\widehat{R} = \frac{1}{n} \sum_{i=1}^{n} \ell\left(\xi, d_i, o_i\right),$$

where $f$ denotes a neural network and $o_i = f(x_i)$ is the neural network output. $d_i$ denotes the one-hot encoding of the label $y_i$. For neural networks, the loss function $\ell$ can be a cross-entropy loss function with softmax, which is penalized by the cost matrix as follows.


where denote the deep feature from the output of the penultimate layer. The entries of a handcrafted cost matrix usually have the form of


Inserting such a cost matrix into the loss function can increase the loss value of an important class. However, doing so makes the training process of neural networks unstable and can lead to non-convergence [27]. Therefore, we propose an alternative form of cost-sensitive learning.
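The role of a cost matrix in traditional cost-sensitive decisions can be illustrated with a small sketch that picks the class of minimum expected risk. The cost values and the posterior below are hypothetical numbers chosen for illustration, and the function names are assumptions; rows of `cost` index the true class and columns index the predicted class.

```python
def expected_risk(cost, posterior, predicted):
    """Expected cost of predicting class `predicted`, averaged over the posterior."""
    return sum(cost[p][predicted] * posterior[p] for p in range(len(posterior)))

def min_risk_class(cost, posterior):
    """Class whose prediction has the smallest expected risk."""
    return min(range(len(posterior)), key=lambda q: expected_risk(cost, posterior, q))

# Hypothetical costs; class order: 0 = healthy, 1 = COVID-19, 2 = other pneumonia.
cost = [[0, 2, 1],
        [10, 0, 6],
        [3, 4, 0]]
# Hypothetical posterior in which "healthy" is the most probable class.
posterior = [0.5, 0.3, 0.2]
```

Here the expected risks of predicting healthy, COVID-19, and other pneumonia are 3.6, 1.8, and 2.3, so the minimum-risk decision is COVID-19 even though healthy has the highest posterior. This is the behavior a cost-sensitive screener is meant to have when missing a COVID-19 case is expensive.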

Figure 2: Representative chest X-ray images from the collected dataset: (a) normal, (b) COVID-19, (c) viral pneumonia, (d) bacterial pneumonia. They show high similarities.

To make the learning process more stable and ensure convergence, we propose a new score-level cost matrix that modifies the output of the last layer of a convolutional neural network (CNN). As shown in Figure 1, the score-level cost matrix is located after the output layer and before the loss layer with softmax. We introduce the new score-level costs to encourage the correct classification of essential classes. Therefore, the CNN output is modified using the cost matrix as follows.

$$\tilde{o} = g(\xi, o),$$

where $o$ and $\tilde{o}$ denote the original and the modified output, such that $\tilde{o}_p \ge \tilde{o}_j$ for every class $j \ne p$, with $p$ the desired class. During the training process, the output weights are modified by the score-level cost matrix to reshape the classifier confidences such that the desired class has the maximum score and the other classes have considerably lower scores. Note that the score-level costs perturb the classifier confidences. Such perturbation allows the classifier to give more attention to the desired classes. In practice, all cost values in $\xi$ are positive, which enables a smooth training process. When using the score-level cost matrix, the cross-entropy loss function with softmax can finally be revised by

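Algorithm 1: DCSL algorithm.
Input: Training data; initialized parameters $\theta$ in the backbone layers, parameters $W$, $b$ in the output layer, and class centers $c_j$; hyperparameters: class balance weight $w$, $\alpha$, and $\lambda$; the learning rate; the number of iterations.
Output: The parameters $\theta$, $W$, $b$.
1: while not converged do
2:    Obtain a mini-batch
3:    Obtain the deep features
4:    Obtain the outputs
5:    Compute the joint loss
6:    Update the output-layer weights $W$
7:    Update the output-layer bias $b$
8:    Update the class centers $c_j$
9:    Update the backbone parameters $\theta$
10: end while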
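The idea of reshaping classifier scores with a score-level cost matrix before the softmax (Section 3.3) can be sketched as follows. The exact modification function is not recoverable from the text here, so this sketch assumes a plain vector-matrix product between the output scores and the cost matrix; the function names, the cost values, and the logits are all hypothetical.

```python
import math

def apply_score_level_costs(logits, cost):
    """ASSUMPTION: modified score of class q = sum over p of logits[p] * cost[p][q]."""
    k = len(logits)
    return [sum(logits[p] * cost[p][q] for p in range(k)) for q in range(k)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical positive cost values; class order: healthy, COVID-19, other pneumonia.
cost = [[1.0, 0.1, 0.1],
        [0.1, 1.5, 0.1],
        [0.1, 0.1, 1.0]]
# Hypothetical raw scores in which "healthy" narrowly beats "COVID-19".
logits = [2.0, 1.8, 0.5]
modified = apply_score_level_costs(logits, cost)
probabilities = softmax(modified)
```

In this example the larger cost entry on the COVID-19 class lifts its modified score above the healthy class, so a borderline case is resolved in favor of COVID-19. Because the operation only rescales the output scores, it adds no trainable parameters.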

3.4 Discriminative Cost-Sensitive Learning

Combining the conditional center loss and the score-level cost-sensitive learning, we propose the discriminative cost-sensitive learning (DCSL) framework. As shown in Figure 1, given a chest X-ray image $x$, DCSL first uses a backbone network parameterized by $\theta$ to extract deep features, which are used for finding the center points and optimizing the conditional center loss. DCSL then uses an output layer parameterized by $W$ and $b$ to obtain the output vector $o = f(x)$, where $f$ is a scoring function that consists of the backbone and the output layer. DCSL finally applies the score-level cost matrix to the output to obtain the new output $\tilde{o}$, which is fed into the joint loss layer during training and into the softmax layer during testing. The joint loss is revised by


The workflow of learning and optimization of DCSL is shown in Algorithm 1.

4 Experiments

We evaluate our algorithm on a newly-collected dataset against state-of-the-art algorithms. The code and dataset will be publicly available.

4.1 Data and Set-up

To evaluate the performance of our method on the screening of COVID-19, we collected a multi-class multi-center chest X-ray dataset. This dataset includes 2,239 examples with image-level labels. Specifically, we collected the dataset from three different sources. The first source is a GitHub collection of chest X-rays of patients diagnosed with COVID-19. The second source is a Kaggle dataset, which is collected from the websites of 1) the Radiological Society of North America (RSNA), 2) Radiopaedia, and 3) the Italian Society of Medical and Interventional Radiology (SIRM). The third source is a collection of X-ray images of bacterial and viral pneumonia [25]. The collected data consists of 239 chest X-rays with confirmed COVID-19, 1,000 chest X-rays with confirmed bacterial or viral pneumonia, and 1,000 examples of healthy condition. We removed low-quality images from the dataset to prevent unnecessary classification errors. Representative chest X-ray images of the different classes are illustrated in Figure 2, which shows the subtle differences between classes and demonstrates the necessity of fine-grained classification.

In our experiments, we conduct a three-class classification task to verify the proposed DCSL algorithm in the screening task. The first class is healthy X-ray images, the second class is confirmed COVID-19 X-ray images, and the third class is other confirmed pneumonia X-ray images, which include both bacterial and viral pneumonia. We employ standard five-fold cross-validation on the dataset for performance evaluation and comparison. The dataset is divided into five groups. Four groups are used for training the deep neural networks, and the remaining group is used for testing. This procedure is repeated five times so that every group serves as the test set once. The evaluation metrics include accuracy, precision, sensitivity, and F1-score.
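The five-fold protocol described above can be sketched with the standard library alone. The function name `five_fold_splits`, the shuffling seed, and the use of plain (non-stratified) folds are assumptions; the text does not say whether the folds were stratified by class.

```python
import random

def five_fold_splits(n_examples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) so that each example is tested exactly once."""
    indices = list(range(n_examples))
    # Shuffle once with a fixed seed so the partition is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds groups of near-equal size.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx
```

For example, `list(five_fold_splits(2239))` produces five train/test partitions of the 2,239 examples. With only 239 COVID-19 examples, a stratified split may be preferable in practice so that every fold contains a similar share of the minority class.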

To verify the effectiveness of the proposed algorithm, we compare DCSL with state-of-the-art methods: COVID-Net [45], VGG19 [39], Inceptionv3 [44], and ResNet50 [18]. COVID-Net is a newly proposed deep convolutional neural network tailored for the detection of COVID-19 from chest X-rays. VGG19, Inceptionv3, and ResNet50 are popular convolutional neural networks that have achieved great success in various tasks.

We implement our algorithm in Keras. We use the common VGG16 network as the backbone [39]. The original VGG16 network has 16 layers. There are 13 convolutional layers with small filters of size 3×3 for extracting deep features. Five max-pooling layers with 2×2 kernels are deployed, one after each block of convolutional layers. We set the output shape of the features of the last convolutional layer to be and flatten them. The original fully connected layers of VGG16 are removed and replaced by two trainable fully-connected layers. The channel numbers of the two fully-connected layers are 256 and three, respectively. Since the collected dataset is too small to obtain promising results by training the deep network from scratch, we use a transfer learning strategy, i.e., the parameters of the convolutional layers are initialized from the model pre-trained on ImageNet. Also, all the compared algorithms are implemented according to their open-source code with pretraining. The chest X-ray images were resized into

. We also use an augmentation strategy to expand the dataset: each random-selected example is rotated by 15 degrees clockwise or counterclockwise. The is set as 0.05.

is set to one without loss of generality. Adam optimizer is used with an initial learning rate of 1e-3. We set the training epoch to 40.

Figure 3: An illustration of the score-level cost matrix designed according to the clinical expert experience.

The score-level cost matrix is designed according to clinical expert experience as follows. First, the cost of misclassifying COVID-19 is higher than that of misclassifying other classes. In particular, the cost of misclassifying COVID-19 as healthy is higher than the cost of misclassifying COVID-19 as other pneumonia. Second, the cost of misclassifying other pneumonia is smaller than the cost of misclassifying COVID-19. Specifically, the cost of misclassifying other pneumonia as COVID-19 is higher than the cost of misclassifying other pneumonia as healthy. Third, the cost of misclassifying healthy people is smaller than in the previous two situations. In particular, the cost of misclassifying a healthy person as COVID-19 is greater than the cost of misclassifying a healthy person as other pneumonia. Accordingly, the final score-level cost matrix is designed as illustrated in Figure 3.
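The ordering rules above can be written down as a small check on a candidate cost matrix. The actual values of Figure 3 are not available in the text, so the function name `satisfies_clinical_ordering` and any matrix passed to it are hypothetical; only the inequalities come from the description above.

```python
# Class indices used in this sketch: healthy, COVID-19, other pneumonia.
H, C, P = 0, 1, 2

def satisfies_clinical_ordering(cost):
    """Check the expert ordering; cost[t][p] is the cost of classifying true class t as p."""
    return (
        # COVID-19 as healthy is worse than COVID-19 as other pneumonia.
        cost[C][H] > cost[C][P]
        # Any COVID-19 error costs more than any other-pneumonia error.
        and min(cost[C][H], cost[C][P]) > max(cost[P][C], cost[P][H])
        # Other pneumonia as COVID-19 is worse than other pneumonia as healthy.
        and cost[P][C] > cost[P][H]
        # Any other-pneumonia error costs more than any healthy-class error.
        and min(cost[P][C], cost[P][H]) > max(cost[H][C], cost[H][P])
        # Healthy as COVID-19 is worse than healthy as other pneumonia.
        and cost[H][C] > cost[H][P]
    )
```

For instance, the hypothetical matrix `[[0, 2, 1], [10, 0, 6], [3, 4, 0]]` satisfies all five rules, whereas a uniform matrix with equal off-diagonal costs does not.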

Method         Accuracy   Precision   Sensitivity   F1-score
VGG19          0.9196     0.9238      0.9196        0.9200
Inceptionv3    0.9107     0.9113      0.9107        0.9105
ResNet50       0.9107     0.9135      0.9107        0.9104
COVID-Net      0.9330     0.9339      0.9330        0.9332
DCSL (Ours)    0.9701     0.9700      0.9709        0.9698
Table 1: Classification results of multiple algorithms on three classes: normal, COVID-19, and other pneumonia.
Figure 4: The confusion matrices of (a) DCSL and (b) COVID-Net.

Figure 5: An illustration of the sensitivity values of each class.

Figure 6: The confusion matrices of the ablation studies: (a) Softmax Loss, (b) Softmax Loss + Center Loss, (c) Softmax Loss + Conditional Center Loss (CCL), and (d) DCSL. Our algorithm shows higher and more balanced performance.
Method              Accuracy   Precision   Sensitivity   F1-score
Softmax Loss        0.9163     0.9173      0.9162        0.9158
Softmax Loss+CL     0.9208     0.9229      0.9207        0.9208
Softmax Loss+CCL    0.9721     0.9723      0.9721        0.9721
Table 2: This table shows the results of the ablation study of DCSL on the accuracy, precision, sensitivity, and F1-score.

4.2 Results

The proposed discriminative cost-sensitive learning algorithm (DCSL) achieves the highest results on the screening of COVID-19 from chest X-rays. Table 1 reports the results of our algorithm and the compared algorithms. Our algorithm obtains a classification accuracy of 97.01%, a precision of 97.00%, a sensitivity of 97.09%, and an F1-score of 96.98%. It remarkably outperforms COVID-Net [45], which achieved state-of-the-art results before our work. Also, our algorithm significantly outperforms all the compared algorithms on all the metrics. As shown in Figure 2, even though complex lung structures and indiscernible infection areas make the task unusually difficult, our algorithm still produces accurate results, which demonstrates its robustness.

Figure 4 displays the confusion matrices of our algorithm and COVID-Net. Owing to our score-level cost-sensitive learning, we achieve 100% accuracy on the COVID-19 class. This result demonstrates the effectiveness of incorporating the score-level cost matrix after the output layer of deep neural networks to modify the learning process. Figure 5 presents the sensitivity of each class, where our algorithm achieves 100% sensitivity on COVID-19, which is much higher than that of the compared methods. These results once again verify the advantages of score-level cost-sensitive learning. Both Figure 4 and Figure 5 show that our algorithm also achieves the highest accuracy on the other classes, which demonstrates the critical role of the conditional center loss in improving the intra-class compactness evenly.
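
To make the link between a confusion matrix and the per-class figures concrete, the sketch below derives sensitivity, precision, and F1-score for each class, plus overall accuracy, from a 3×3 confusion matrix. The counts are hypothetical and are not read from Figure 4; the function names `per_class_metrics` and `accuracy` are illustrative, and how the paper averages the per-class values into the single numbers of Table 1 is not stated here, so no averaging is shown.

```python
def per_class_metrics(cm):
    """cm[t][p] = number of examples of true class t predicted as class p."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class-k examples that were missed
        fp = sum(cm[t][k] for t in range(n)) - tp  # other classes predicted as k
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append({"sensitivity": sensitivity, "precision": precision, "f1": f1})
    return results


def accuracy(cm):
    """Fraction of all examples that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


# Hypothetical counts (rows: true normal, COVID-19, other pneumonia).
cm = [
    [95, 1, 4],
    [0, 24, 0],
    [3, 1, 96],
]
metrics = per_class_metrics(cm)
print(f"accuracy = {accuracy(cm):.4f}")
for name, m in zip(["normal", "COVID-19", "other pneumonia"], metrics):
    print(name, {key: round(value, 4) for key, value in m.items()})
```

In this toy matrix the COVID-19 row has no off-diagonal entries, so its sensitivity is 100% even though its precision is lower, which is the pattern a cost-sensitive model is expected to favor.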

We further perform statistical analysis to check that the experimental results are statistically significant. A paired t-test between COVID-Net and our algorithm gives a p-value of 0.010, which is significant at the 5% level, so the improvement of our method is unlikely to be due to chance. The p-values for the VGG19, Inceptionv3, and ResNet50 models are also less than 0.05, which indicates that popular general-purpose classifiers are less suitable for screening COVID-19 from chest X-rays. These analyses support our insight that the screening of COVID-19 from chest X-rays should be viewed as a fine-grained cost-sensitive classification task.
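
As a reminder of what a paired t-test computes, the following sketch derives the t statistic from paired scores of two models and compares it with the two-sided 5% critical value. The text does not say which quantities were paired, so the per-fold accuracies below are hypothetical, as are the names `paired_t_statistic` and `T_CRIT_DOF4`; the sketch reports significance through a tabulated critical value rather than an exact p-value.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical per-fold accuracies for two models evaluated on the same folds.
ours = [0.97, 0.96, 0.98, 0.97, 0.96]
baseline = [0.93, 0.94, 0.93, 0.92, 0.94]

t, dof = paired_t_statistic(ours, baseline)
# Two-sided critical value of Student's t at the 5% level for 4 degrees of freedom.
T_CRIT_DOF4 = 2.776
print(f"t = {t:.3f}, dof = {dof}, significant = {abs(t) > T_CRIT_DOF4}")
```

Pairing matters because both models are scored on the same splits: the test works on the per-split differences, which removes the variance shared across splits.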

Figure 7: The t-SNE visualization of deep features from the training data for (a) Softmax Loss and (b) Softmax Loss + CCL. The features learned with Softmax Loss + Conditional Center Loss (CCL) are more discriminative than those learned with Softmax Loss only. The points of different colors denote the deep features from different classes (0: normal; 1: COVID-19; 2: other pneumonia).

4.3 Analysis

This section presents in-depth ablation studies to demonstrate the effects of the conditional center loss (CCL) and score-level cost-sensitive learning (SLCSL), respectively. We construct four ablation models based on the backbone network VGG16. The first model uses only the cross-entropy loss with softmax and is called Softmax Loss. The second model combines the center loss with the softmax cross-entropy loss and is called Softmax Loss + Center Loss. Similarly, the third model combines the conditional center loss with the softmax loss and is called Softmax Loss + CCL. The final model is our algorithm DCSL, which combines score-level cost-sensitive learning, the conditional center loss, and the softmax loss.

Overall, our final model DCSL achieves the best performance among the ablation models, as shown in Figure 6. These confusion matrices show that, in our experiments, our algorithm screens COVID-19 from chest X-rays without missing any case. Both Figure 8 and Figure 9 demonstrate the convergence and stability of DCSL in the training and validation period. These results show that our algorithm achieves accurate and robust screening of COVID-19 from chest X-rays. They once again support our insight that this task is a fine-grained cost-sensitive classification problem.

Figure 8: Accuracy curves in the training and validation period for (a) Softmax Loss, (b) Softmax Loss + Center Loss, and (c) DCSL. They show that DCSL has higher accuracy with stable convergence.

Figure 9: Loss curves in the training and validation period for (a) Softmax Loss, (b) Softmax Loss + Center Loss, and (c) DCSL. It can be observed that DCSL has great stability and fast convergence.

4.3.1 Conditional Center Loss

Table 2 reports the results of our ablation study on different loss functions. Our conditional center loss (Softmax Loss + CCL) remarkably outperforms both the center loss and the softmax loss. These results demonstrate the importance of considering the class-conditional information when updating the center points and optimizing the center loss. Figure 7 presents the t-SNE visualization of deep features from the training data. It shows that the conditional center loss helps improve the intra-class compactness.
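
For reference, the baseline that CCL is compared against can be sketched in a few lines. The code below implements the standard center loss and center update of Wen et al. [48], not the conditional variant proposed in this paper, whose definition is given in the method section. The feature vectors, the update rate `alpha=0.5`, and the function names are hypothetical choices for illustration.

```python
def center_loss(features, labels, centers):
    """0.5 * sum of squared distances between each feature and its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total


def update_centers(features, labels, centers, alpha=0.5):
    """One mini-batch center update in the style of the standard center loss."""
    new_centers = []
    for j, c in enumerate(centers):
        members = [x for x, y in zip(features, labels) if y == j]
        # delta_j = sum(c_j - x_i) / (1 + count_j); the +1 leaves empty classes unchanged
        delta = [
            sum(ci - x[d] for x in members) / (1 + len(members))
            for d, ci in enumerate(c)
        ]
        new_centers.append([ci - alpha * di for ci, di in zip(c, delta)])
    return new_centers


# Hypothetical 2-D deep features for a mini-batch with three classes.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = [0, 0, 1]
centers = [[0.0, 0.0] for _ in range(3)]

loss_before = center_loss(features, labels, centers)
updated = update_centers(features, labels, centers)
loss_after = center_loss(features, labels, updated)
print(loss_before, loss_after)
```

Each update pulls a center toward the mean of its class members in the batch, so the loss on the same batch decreases; a class with no members in the batch keeps its center.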

Moreover, Figure 6 shows that the conditional center loss makes fewer mistakes and achieves balanced performance on the three classes. Also, both Figure 8 and Figure 9 show that the conditional center loss has excellent stability and fast convergence. In summary, the conditional center loss has a significant impact on the performance of our proposed architecture. When the conditional center loss is not used, the classification results clearly degrade, and the learned deep features contain significant intra-class variations.

Figure 10: Classification results of the ablation studies from Softmax Loss, Softmax Loss + Center Loss (CL), Softmax Loss + Conditional Center Loss (CCL), and DCSL. Owing to the score-level cost-sensitive learning, we achieve the highest sensitivity on the COVID-19 class. The conditional center loss also plays a vital role in the improvement of performance. These results demonstrate the effectiveness of DCSL.

4.3.2 Score-Level Cost-Sensitive Learning

Another goal of this work is to enhance the sensitivity on COVID-19 without decreasing the overall classification accuracy. Although the results above have verified the advantages of score-level cost-sensitive learning, its strengths deserve a closer look. Figure 6 shows that using score-level cost-sensitive learning leads to zero mistakes on the COVID-19 class. Moreover, Figure 10 demonstrates that DCSL reaches 100% sensitivity on COVID-19 without decreasing the overall classification accuracy. Experimental results show that DCSL can significantly improve the sensitivity and precision on COVID-19.

To conclude, cost-sensitive learning plays a crucial role in the screening of COVID-19. During the global outbreak of COVID-19, the cost of misclassifying COVID-19 patients as other types of pneumonia, or even as healthy people, is much higher than the cost of misclassifying the other classes. The proposed score-level cost-sensitive learning has significantly improved the sensitivity on COVID-19, which supports our hypothesis that cost-sensitive learning is well suited to this new task.

5 Conclusion

In this paper, we reported a new attempt at the fine-grained cost-sensitive screening of COVID-19 from chest X-rays. We proposed a novel discriminative cost-sensitive learning (DCSL) method that includes a conditional center loss function and a score-level cost-sensitive learning module. To the best of our knowledge, this is the first method that formulates this novel application as a fine-grained cost-sensitive classification problem. Extensive results have demonstrated that DCSL can achieve reliable and accurate results. In-depth analyses have revealed the effectiveness and potential of DCSL as a clinical tool to relieve radiologists of laborious workloads, thereby contributing to fast large-scale screening of COVID-19.


  • [1] A. Abbas, M. M. Abdelsamea, and M. M. Gaber (2020) Classification of covid-19 in chest x-ray images using detrac deep convolutional neural network. arXiv preprint arXiv:2003.13815. Cited by: §2.1.
  • [2] P. Afshar, S. Heidarian, F. Naderkhani, A. Oikonomou, K. N. Plataniotis, and A. Mohammadi (2020) COVID-caps: a capsule network-based framework for identification of covid-19 cases from x-ray images. arXiv preprint arXiv:2004.02696. Cited by: §2.1.
  • [3] I. D. Apostolopoulos and T. A. Mpesiana (2020) Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks. Physical and Engineering Sciences in Medicine, pp. 1. Cited by: §2.1.
  • [4] U. Brefeld, P. Geibel, and F. Wysotzki (2003) Support vector machines with example dependent costs. In European Conference on Machine Learning, pp. 23–34. Cited by: §2.2.2.
  • [5] L. Breiman, J. Friedman, C. J. Stone, and R. A. Olshen (1984) Classification and regression trees. CRC press. Cited by: §2.2.2.
  • [6] J. Chen, L. Wu, J. Zhang, L. Zhang, D. Gong, Y. Zhao, S. Hu, Y. Wang, X. Hu, B. Zheng, et al. (2020) Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: a prospective study. medRxiv. Cited by: §2.1.
  • [7] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei (2009) Imagenet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. Cited by: §4.1.
  • [8] J. Deng, J. Krause, M. Stark, and L. Fei-Fei (2015) Leveraging the wisdom of the crowd for fine-grained recognition. IEEE transactions on pattern analysis and machine intelligence 38 (4), pp. 666–676. Cited by: §2.2.1.
  • [9] P. Domingos (1999) Metacost: a general method for making classifiers cost-sensitive. In Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 155–164. Cited by: §2.2.2.
  • [10] A. Dubey, O. Gupta, R. Raskar, and N. Naik (2018) Maximum-entropy fine grained classification. In Advances in Neural Information Processing Systems, pp. 637–647. Cited by: §2.2.1.
  • [11] M. Farooq and A. Hafeez (2020) COVID-resnet: a deep learning framework for screening of covid19 from radiographs. arXiv preprint arXiv:2003.14395. Cited by: §1, §2.1.
  • [12] B. Furlow (2010) Radiation dose in computed tomography. Radiologic Technology 81 (5), pp. 437–450. Cited by: §1.
  • [13] B. Ghoshal and A. Tucker (2020) Estimating uncertainty and interpretability in deep learning for coronavirus (covid-19) detection. arXiv preprint arXiv:2003.10769. Cited by: §2.1.
  • [14] O. Gozes, M. Frid-Adar, H. Greenspan, P. D. Browning, H. Zhang, W. Ji, A. Bernheim, and E. Siegel (2020) Rapid ai development cycle for the coronavirus (covid-19) pandemic: initial results for automated detection & patient monitoring using deep learning ct image analysis. arXiv preprint arXiv:2003.05037. Cited by: §2.1.
  • [15] O. Gozes, M. Frid-Adar, N. Sagie, H. Zhang, W. Ji, and H. Greenspan (2020) Coronavirus detection and analysis on chest ct with deep learning. External Links: 2004.02640 Cited by: §2.1.
  • [16] L. Hall, D. Goldgof, R. Paul, and G. M. Goldgof (2020) Finding covid-19 from chest x-rays using deep learning on a small dataset. Cited by: §2.1.
  • [17] A. E. Hassanien, L. N. Mahdy, K. A. Ezzat, H. H. Elmousalami, and H. A. Ella (2020) Automatic x-ray covid-19 lung image classification system based on multi-level thresholding and support vector machine. medRxiv. Cited by: §1, §2.1.
  • [18] K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778. Cited by: §4.1.
  • [19] E. E. Hemdan, M. A. Shouman, and M. E. Karar (2020) COVIDX-net: a framework of deep learning classifiers to diagnose covid-19 in x-ray images. arXiv preprint arXiv:2003.11055. Cited by: §2.1.
  • [20] L. Huang, R. Han, T. Ai, P. Yu, H. Kang, Q. Tao, and L. Xia (2020) Serial quantitative chest ct assessment of covid-19: deep-learning approach. Radiology: Cardiothoracic Imaging 2 (2), pp. e200075. Cited by: §2.1.
  • [21] W. Huda (2007) Radiation doses and risks in chest computed tomography examinations. Proceedings of the American Thoracic Society 4 (4), pp. 316–320. Cited by: §1.
  • [22] C. Jin, W. Chen, Y. Cao, Z. Xu, X. Zhang, L. Deng, C. Zheng, J. Zhou, H. Shi, and J. Feng (2020) Development and evaluation of an ai system for covid-19 diagnosis. medRxiv. Cited by: §2.1.
  • [23] S. Jin, B. Wang, H. Xu, C. Luo, L. Wei, W. Zhao, X. Hou, W. Ma, Z. Xu, Z. Zheng, et al. (2020) AI-assisted ct imaging analysis for covid-19 screening: building and deploying a medical ai system in four weeks. medRxiv. Cited by: §2.1.
  • [24] Y. Jin, L. Cai, Z. Cheng, H. Cheng, T. Deng, Y. Fan, C. Fang, D. Huang, L. Huang, Q. Huang, et al. (2020) A rapid advice guideline for the diagnosis and treatment of 2019 novel coronavirus (2019-ncov) infected pneumonia (standard version). Military Medical Research 7 (1), pp. 4. Cited by: §1.
  • [25] D. S. Kermany, M. Goldbaum, W. Cai, C. C. Valentim, H. Liang, S. L. Baxter, A. McKeown, G. Yang, X. Wu, F. Yan, et al. (2018) Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 172 (5), pp. 1122–1131. Cited by: §4.1.
  • [26] N. E. M. Khalifa, M. H. N. Taha, A. E. Hassanien, and S. Elghamrawy (2020) Detection of coronavirus (covid-19) associated pneumonia based on generative adversarial networks and a fine-tuned deep transfer learning model using chest x-ray dataset. arXiv preprint arXiv:2004.01184. Cited by: §2.1.
  • [27] S. H. Khan, M. Hayat, M. Bennamoun, F. A. Sohel, and R. Togneri (2017) Cost-sensitive learning of deep feature representations from imbalanced data. IEEE transactions on neural networks and learning systems 29 (8), pp. 3573–3587. Cited by: §2.2.2, §3.3.
  • [28] D. Lin, X. Shen, C. Lu, and J. Jia (2015) Deep lac: deep localization, alignment and classification for fine-grained recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1666–1674. Cited by: §2.2.1.
  • [29] T. Lin, A. RoyChowdhury, and S. Maji (2015) Bilinear cnn models for fine-grained visual recognition. In Proceedings of the IEEE international conference on computer vision, pp. 1449–1457. Cited by: §2.2.1.
  • [30] C. X. Ling and V. S. Sheng (2008) Cost-sensitive learning and the class imbalance problem. Vol. 2011, Citeseer. Cited by: §2.2.2.
  • [31] M. A. Maloof (2003) Learning when data sets are imbalanced and when costs are unequal and unknown. In ICML-2003 workshop on learning from imbalanced data sets II, Vol. 2, pp. 2–1. Cited by: §2.2.2.
  • [32] A. Narin, C. Kaya, and Z. Pamuk (2020) Automatic detection of coronavirus disease (covid-19) using x-ray images and deep convolutional neural networks. arXiv preprint arXiv:2003.10849. Cited by: §1, §2.1.
  • [33] S. Ozturk, U. Ozkaya, and M. Barstugan (2020) Classification of coronavirus images using shrunken features. medRxiv. Cited by: §2.1.
  • [34] F. Schroff, D. Kalenichenko, and J. Philbin (2015) Facenet: a unified embedding for face recognition and clustering. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 815–823. Cited by: §2.2.1.
  • [35] P. K. Sethy and S. K. Behera (2020) Detection of coronavirus disease (covid-19) based on deep features. Cited by: §1.
  • [36] F. Shan, Y. Gao, J. Wang, W. Shi, N. Shi, M. Han, Z. Xue, D. Shen, and Y. Shi (2020) Lung infection quantification of covid-19 in ct images with deep learning. arXiv preprint arXiv:2003.04655. Cited by: §2.1.
  • [37] F. Shi, J. Wang, J. Shi, Z. Wu, Q. Wang, Z. Tang, K. He, Y. Shi, and D. Shen (2020) Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for covid-19. External Links: 2004.02731 Cited by: §2.1.
  • [38] F. Shi, L. Xia, F. Shan, D. Wu, Y. Wei, H. Yuan, H. Jiang, Y. Gao, H. Sui, and D. Shen (2020) Large-scale screening of covid-19 from community acquired pneumonia using infection size-aware classification. arXiv preprint arXiv:2003.09860. Cited by: §2.1.
  • [39] K. Simonyan and A. Zisserman (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. Cited by: §4.1, §4.1.
  • [40] Y. Song, S. Zheng, L. Li, X. Zhang, X. Zhang, Z. Huang, J. Chen, H. Zhao, Y. Jie, R. Wang, et al. (2020) Deep learning enables accurate diagnosis of novel coronavirus (covid-19) with ct images. medRxiv. Cited by: §2.1.
  • [41] C. Storrs (2013) Do ct scans cause cancer?. Scientific American 309 (1), pp. 30–33. Cited by: §1.
  • [42] X. Sun, L. Chen, and J. Yang (2019) Learning from web data using adversarial discriminative neural networks for fine-grained classification. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 273–280. Cited by: §2.2.1.
  • [43] Y. Sun, Y. Chen, X. Wang, and X. Tang (2014) Deep learning face representation by joint identification-verification. In Advances in neural information processing systems, pp. 1988–1996. Cited by: §2.2.1.
  • [44] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna (2016) Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818–2826. Cited by: §4.1.
  • [45] L. Wang and A. Wong (2020) COVID-net: a tailored deep convolutional neural network design for detection of covid-19 cases from chest radiography images. arXiv preprint arXiv:2003.09871. Cited by: §1, §4.1, §4.2.
  • [46] S. Wang, B. Kang, J. Ma, X. Zeng, M. Xiao, J. Guo, M. Cai, J. Yang, Y. Li, X. Meng, et al. (2020) A deep learning algorithm using ct images to screen for corona virus disease (covid-19). medRxiv. Cited by: §1, §1, §2.1, §2.1.
  • [47] X. Wei, C. Xie, J. Wu, and C. Shen (2018) Mask-cnn: localizing parts and selecting descriptors for fine-grained bird species categorization. Pattern Recognition 76, pp. 704–714. Cited by: §2.2.1.
  • [48] Y. Wen, K. Zhang, Z. Li, and Y. Qiao (2016) A discriminative feature learning approach for deep face recognition. In European conference on computer vision, pp. 499–515. Cited by: §2.2.1.
  • [49] H. Xu, G. Qi, J. Li, M. Wang, K. Xu, and H. Gao (2018) Fine-grained image classification by visual-semantic embedding.. In IJCAI, pp. 1043–1049. Cited by: §2.2.1.
  • [50] X. Xu, X. Jiang, C. Ma, P. Du, X. Li, S. Lv, L. Yu, Y. Chen, J. Su, G. Lang, et al. (2020) Deep learning system to screen coronavirus disease 2019 pneumonia. arXiv preprint arXiv:2002.09334. Cited by: §2.1.
  • [51] B. Zadrozny and C. Elkan (2001) Learning and making decisions when costs and probabilities are both unknown. In Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 204–213. Cited by: §2.2.2.
  • [52] B. Zadrozny, J. Langford, and N. Abe (2003) A simple method for cost-sensitive learning. Cited by: §2.2.2.
  • [53] J. Zhang, Y. Xie, Y. Li, C. Shen, and Y. Xia (2020) COVID-19 screening on chest x-ray images using deep learning based anomaly detection. arXiv preprint arXiv:2003.12338. Cited by: §1, §1, §2.1.
  • [54] N. Zhang, J. Donahue, R. Girshick, and T. Darrell (2014) Part-based r-cnns for fine-grained category detection. In European conference on computer vision, pp. 834–849. Cited by: §2.2.1.
  • [55] C. Zheng, X. Deng, Q. Fu, Q. Zhou, J. Feng, H. Ma, W. Liu, and X. Wang (2020) Deep learning-based detection for covid-19 from chest ct using weak label. medRxiv. Cited by: §2.1.
  • [56] Z. Zhou and X. Liu (2010) On multi-class cost-sensitive learning. Computational Intelligence 26 (3), pp. 232–257. Cited by: §2.2.2.