Point of Care Image Analysis for COVID-19

10/28/2020
by Daniel Yaron, et al.

Early detection of COVID-19 is key in containing the pandemic. Disease detection and evaluation based on imaging is fast and cheap and therefore plays an important role in COVID-19 handling. COVID-19 is easier to detect in chest CT; however, CT is expensive, non-portable, and difficult to disinfect, making it unfit as a point-of-care (POC) modality. On the other hand, chest X-ray (CXR) and lung ultrasound (LUS) are widely used, yet COVID-19 findings in these modalities are not always very clear. Here we train deep neural networks to significantly enhance the capability to detect, grade and monitor COVID-19 patients using CXRs and LUS. Collaborating with several hospitals in Israel, we collect a large dataset of CXRs and use this dataset to train a neural network, obtaining above 90% detection rate. Collaborating with ULTRa (Ultrasound Laboratory Trento, Italy) and hospitals in Italy, we obtained POC ultrasound data with annotations of the severity of disease and trained a deep network for automatic severity grading.


1 Introduction

Coronavirus Disease 2019 (COVID-19) was declared a global pandemic [11], and has had severe economic, social and healthcare consequences. In order to contain the disease, an immediate concern is to rapidly identify and isolate SARS-CoV-2 carriers. This requires means for mass testing of the general population, with low cost, high sensitivity and fast processing times. The prevalent test today is Reverse Transcription Polymerase Chain Reaction (RT-PCR) [21, 18], which suffers from a number of problems: testing reagents and kits are expensive and suitable for single use only, processing the samples requires dedicated personnel and equipment, and it can take hours or days to obtain results. Most significantly, the test has limited sensitivity, reported to be as low as 71% [3, 1]. Due to these shortcomings, finding alternative testing and identification methods is crucial. A strong candidate is diagnosis of patients based on medical imaging of the chest, since COVID-19 presents primarily in the lower respiratory tract. Medical imaging, specifically computerized tomography (CT) scans, chest X-ray (CXR), and lung ultrasound (LUS), can provide an alternative approach, affording advantages that can readily complement the testing capabilities of RT-PCR. In the case of COVID-19, disease characteristics such as consolidations and ground-glass opacities can be identified in images of the lung [9, 4], which raises the possibility of using chest and lung imaging for detection and severity grading of COVID-19 patients.

Pulmonary manifestations of COVID-19 are typically detectable and clearly visible on CT scans [9]. However, the availability of CT equipment is limited both by its price and by operational requirements such as dedicated rooms and staff. Moreover, there is a need to decontaminate the machine between suspected COVID-19 patients, a lengthy process that results in a very slow rate of scanning. In contrast, with portable X-ray and ultrasound machines, imaging can be done rapidly and without needing to bring patients into radiography rooms. These machines are also less costly, involve less radiation exposure, and can be readily distributed and deployed to point-of-care (POC) locations outside hospitals and primary care centers. The drawback of these modalities is that their interpretation requires qualified personnel, and the way prognostic findings present in these images can make them much harder to analyze than CT scans.

In this paper we consider a combination of signal processing and deep learning tools to develop deep network architectures that can lead to high detection rates of COVID-19 and to severity grading of disease from POC imaging using X-ray and ultrasound. Early approaches to developing X-ray-based detection methods relied on publicly available image sources, which provide limited data consisting of compressed images lacking detail and coming from many different makes and models of X-ray machines [14, 22, 2]. This illustrates one of the main challenges in this field: collecting large amounts of COVID-19 positive and negative images at full resolution and from similar sources. Notably, one recent effort has shown more reliable results based on a larger, more uniformly sourced dataset, and comes closer to the goal of developing tools that can be used in clinical settings [23]; it achieved a sensitivity of 88% with a specificity of 79%.

Here, we collected a large dataset of images from portable X-rays and used them to train a network that can detect COVID-19 with high reliability. Our algorithm adds external information in the form of lung segmentation based on a deep learning model, which, together with several other pre-processing methods, boosts performance to over 90% detection rate. We further develop a tool for retrieving CXR images that are similar to a given query image based on that image's embedding in a low-dimensional space generated by the model.

For LUS, our goal is to develop a network that reliably grades disease severity. To this end, we rely on the ICLUS dataset presented in Roy et al. [13] (https://iclus-web.bluetensor.ai). There, the authors propose a sophisticated neural network to automatically predict the severity grade from annotated LUS frames, which results in an F1 score (the harmonic mean of precision and recall) of 65.1%. Here we enhance their method by developing a signal processing approach to "rectify" images taken by convex probes, and then, similar to the X-ray network, we input both the original and rectified images, creating a channel of "side information". This results in an F1 score of 68.7%.

In the following sections we detail the approach taken in developing the networks for both the X-ray and LUS data. In both cases signal processing tools are used to pre-process the data and to form an additional input channel that boosts the network results.

2 COVID-19 Detection based on X-ray

Deep learning approaches have shown impressive abilities in image related tasks, including in many radiological contexts [20, 7]. However, despite their potential in assisting COVID-19 management efforts, these methods require large amounts of training data. In order to address this challenge, a large dataset of images from portable X-rays was sourced and used to train a network that can detect COVID-19 in the images with high reliability, and to develop a tool for retrieving similar CXR images. We rely on a combination of deep learning tools including standard pre-processing and augmentations of the images and additional information in the form of an extra lung segmentation channel.

We collected CXR images from 1384 patients, 360 with a positive COVID-19 diagnosis and 1024 negative, totaling 2426 CXRs. All COVID-19 negative images were acquired before the start of the pandemic. Patients' COVID-19 positive labels were determined according to positive RT-PCR testing. The COVID-19 positive images include all CXRs performed with portable X-ray machines on patients admitted to four hospitals in Israel. For the non-COVID-19 images we obtained CXRs taken by the same X-ray machines prior to December 2019; these are patients without COVID-19, typically with another respiratory disease. The test set was taken from the full CXR dataset and contains 350 CXRs (15%), of which 179 (51%) are positive for COVID-19 and 171 (49%) are negative. Many patients have multiple images; to prevent the model from identifying patient-specific image features (e.g., medical implants) and associating them with the label, each patient's images were used either for the test set or for the training set, but never for both. In the analysis, 4% of the images (101/2426) were excluded due to lateral positioning or rectangular artifacts in the image; of these, 98 were COVID-19 positive.
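Such a patient-level split can be implemented, for instance, with scikit-learn's GroupShuffleSplit; the column names and toy metadata below are hypothetical placeholders, not the study's actual files.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical metadata: one row per CXR, with a patient identifier and label.
df = pd.DataFrame({
    "image_path": [f"cxr_{i}.png" for i in range(10)],
    "patient_id": [0, 0, 1, 2, 2, 2, 3, 4, 4, 5],
    "covid_label": [1, 1, 0, 1, 1, 1, 0, 0, 0, 1],
})

# Hold out ~15% of the data for testing while keeping all of a patient's images
# on one side of the split, so patient-specific cues cannot leak across sets.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.15, random_state=0)
train_idx, test_idx = next(splitter.split(df, groups=df["patient_id"]))
train_df, test_df = df.iloc[train_idx], df.iloc[test_idx]

assert set(train_df["patient_id"]).isdisjoint(test_df["patient_id"])
```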

Figure 1: Full pipeline workflow overview for the COVID-19 classification model of X-ray images.

2.1 Image Processing and Network Architecture

The model pipeline (Fig. 1) begins with a series of preprocessing steps, including augmentation, normalization, and segmentation of the images. Augmentations are transformations that change features such as image orientation and brightness. These properties are irrelevant for correct classification, but may vary during image acquisition, and can affect the training performance of the network because of its rigid registration with respect to orientation and pixel values. Importantly, augmentations should correspond to normal variation in CXR acquisition; to ensure this we consulted with radiologists when defining the augmentation parameters. Not all augmentations are applied each time; rather, each augmentation is applied with a certain probability p: brighten (p=0.4); gamma contrast (p=0.3); CLAHE (p=0.4); rotate (p=0.4); shear (p=0.4); scale up to 0.2 on each axis (p=0.4); flip from left to right (p=0.5); and either sharpen or apply Gaussian blur.
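A pipeline along these lines could be written with the albumentations library, as in the sketch below; the rotation, shear, brightness and scale limits are illustrative guesses rather than the parameters actually used.

```python
import numpy as np
import albumentations as A

# Each transform fires with the stated probability; all limits are illustrative.
cxr_augment = A.Compose([
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.0, p=0.4),  # brighten
    A.RandomGamma(gamma_limit=(80, 120), p=0.3),                                  # gamma contrast
    A.CLAHE(p=0.4),
    A.Rotate(limit=10, p=0.4),
    A.Affine(shear=(-10, 10), p=0.4),
    A.Affine(scale=(1.0, 1.2), p=0.4),                # scale by up to 0.2 on each axis
    A.HorizontalFlip(p=0.5),                          # flip from left to right
    A.OneOf([A.Sharpen(), A.GaussianBlur()], p=0.5),  # either sharpen or blur
])

# Example usage on a dummy grayscale CXR:
cxr_image = np.random.randint(0, 255, (1024, 1024), dtype=np.uint8)
augmented = cxr_augment(image=cxr_image)["image"]
```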

The normalization process consisted of cropping black edges, standardizing the brightness, and scaling each image to 1024×1024 pixels using bilinear interpolation.
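A minimal sketch of such a normalization step, assuming a simple intensity threshold for the black borders and a min-max stretch for the brightness standardization (the paper specifies neither):

```python
import cv2
import numpy as np

def normalize_cxr(img, out_size=1024, black_thresh=10):
    """Crop near-black borders, stretch intensities, and resize with bilinear interpolation.
    The threshold and the min-max stretch are simplifying assumptions."""
    ys, xs = np.where(img > black_thresh)
    if len(ys) > 0:
        img = img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    img = img.astype(np.float32)
    img = (img - img.min()) / max(float(img.max() - img.min()), 1e-6) * 255.0
    return cv2.resize(img, (out_size, out_size), interpolation=cv2.INTER_LINEAR).astype(np.uint8)

# Example on a dummy grayscale CXR with black borders:
dummy = np.zeros((900, 700), dtype=np.uint8)
dummy[100:800, 50:650] = np.random.randint(30, 220, (700, 600), dtype=np.uint8)
normalized = normalize_cxr(dummy)   # shape: (1024, 1024)
```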

To enhance performance we created an additional image channel using lung segmentation via a U-net [12] pre-trained on a different dataset. This network produces a pixel-mask of the CXR indicating the probability that each pixel belongs to the lungs, allowing the network to access this information while training. The final input images to the network contain 3 channels: the original CXR, the segmentation map, and one filled with zeroes. This is done to match the pre-trained models we used, which expect 3-channel RGB images.
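The 3-channel input could then be assembled as follows; scaling the CXR to [0, 1] is our own choice, and the lung map is assumed to already be a per-pixel probability.

```python
import numpy as np
import torch

def build_network_input(cxr, lung_prob):
    """Stack the normalized CXR, the U-net lung-probability map, and an all-zero
    channel into the 3-channel layout expected by RGB-pretrained backbones."""
    stacked = np.stack([
        cxr.astype(np.float32) / 255.0,         # channel 0: the X-ray (scaled to [0, 1], our choice)
        lung_prob.astype(np.float32),           # channel 1: per-pixel lung probability from the U-net
        np.zeros_like(cxr, dtype=np.float32),   # channel 2: zeros, kept only for shape compatibility
    ], axis=0)
    return torch.from_numpy(stacked)            # shape: (3, H, W)

# Example with dummy inputs of the paper's 1024x1024 size:
cxr = np.random.randint(0, 255, (1024, 1024), dtype=np.uint8)
lung_prob = np.random.uniform(size=(1024, 1024))
x = build_network_input(cxr, lung_prob)
```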

We compared five networks: ResNet34, ResNet50, ResNet152 [5], VGG16 [15] and CheXpert [7], all trained using transfer learning. We additionally classify the images with an ensemble model obtained by averaging the results of the first four networks, excluding CheXpert due to its lower performance. Training was performed with the Adam optimizer with an initial learning rate of 1e-6, which was exponentially decreased as epochs progressed. We used cross-entropy as the loss function together with a regularisation term. The best test scores were achieved after 32 epochs.
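A minimal PyTorch sketch of this training setup is given below. The decay factor, the weight-decay value (standing in for the unspecified regularisation coefficient), the image size, and the random stand-in data are all assumptions made only for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

# Dummy stand-in for the real CXR dataset: 8 random 3-channel images with binary labels.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))
train_loader = DataLoader(TensorDataset(images, labels), batch_size=4, shuffle=True)

# Transfer learning: ImageNet-pretrained ResNet50 with a new 2-class head.
model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
# weight_decay stands in for the paper's (unspecified) regularisation coefficient.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-6, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)

for epoch in range(32):                       # best test scores were reported after 32 epochs
    for batch_images, batch_labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_images), batch_labels)
        loss.backward()
        optimizer.step()
    scheduler.step()                          # exponential decay of the learning rate each epoch
```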

In addition to classification, we propose a method for retrieving the CXR images that are most similar to a given query image. The activations of the network's layers serve as embeddings of the images into a vector space, and should capture information about the clinical indications observed in the images. We use the embeddings obtained from the network's last layer, search for similarity between the resulting vectors, and retrieve the nearest neighbors of each image.
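One common way to obtain such embeddings is to strip the classification head from the trained backbone so that the forward pass returns the penultimate-layer activations; the sketch below assumes a ResNet50 backbone and uses a random tensor in place of real preprocessed CXRs.

```python
import torch
import torch.nn as nn
from torchvision import models

# Replace the final fully-connected layer with an identity mapping so that the
# forward pass returns the penultimate-layer activations as the image embedding.
embedder = models.resnet50(pretrained=True)   # stand-in for the fine-tuned COVID-19 model
embedder.fc = nn.Identity()
embedder.eval()

with torch.no_grad():
    dummy_batch = torch.zeros(2, 3, 1024, 1024)   # (batch, channels, H, W)
    embeddings = embedder(dummy_batch)            # shape: (2, 2048)
```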

Model | Accuracy % (/350) | Sensitivity % (/179) | Specificity % (/171)
ResNet34 | 89.4 (313) | 87.1 (156) | 91.8 (157)
ResNet50 | 89.7 (314) | 87.1 (156) | 92.4 (158)
ResNet50 - no preprocessing | 85.1 (298) | 82.1 (147) | 88.3 (151)
ResNet152 | 86.0 (304) | 83.2 (149) | 90.6 (155)
CheXpert | 84.0 (294) | 86.5 (155) | 81.2 (139)
VGG16 | 87.7 (306) | 87.1 (156) | 88.3 (151)
Ensemble | 90.5 (317) | 91.1 (164) | 90.0 (154)
Table 1: Performance of different models: comparison of accuracy, sensitivity and specificity of the various deep networks trained and tested on the same test set.

2.2 Evaluation and Results

We trained five deep networks whose accuracy, sensitivity (detection rate) and specificity can be seen in Table 1. We selected ResNet50 and the ensemble model for the rest of the analysis, as they achieved the best performance in our task. The ensemble model achieved an accuracy of 90.6% (95% CI: 84%, 92.9%), sensitivity of 91.1% (95% CI: 83%, 92.4%) and specificity of 90% (95% CI: 80.7%, 95.9%) on the test images.

The area under the curve (AUC) of the ROC curve is 0.96 (95% CI: 0.92, 0.97). The ROC curve is provided in Fig. 2a, and demonstrates that for a broad range of thresholds, both a high true positive rate (TPR) and a low false positive rate (FPR) are achievable. In Fig. 2b we present the precision-recall (P-R) curve, which shows the trade-off between precision and recall (sensitivity) as the threshold is varied. This P-R curve shows a broad range of thresholds for which both high precision and high recall are attainable. The AUC of the P-R curve is 0.96 (95% CI: 0.93, 0.98). These ROC and P-R curves attest to the stability of the model across different thresholds. We additionally trained ResNet50 on the dataset with and without the preprocessing stages. As seen in Table 1, preprocessing yields an improvement of about 4% in accuracy and 5% in sensitivity.
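For reference, the ROC and P-R curves and their AUCs can be computed with scikit-learn as sketched below; the labels and scores here are random placeholders for the actual test-set outputs.

```python
import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve, auc

# Hypothetical placeholders: ground-truth test labels and model scores for the positive class.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=350)
y_score = rng.uniform(size=350)

fpr, tpr, _ = roc_curve(y_true, y_score)
precision, recall, _ = precision_recall_curve(y_true, y_score)

print("ROC AUC:", auc(fpr, tpr))
print("P-R AUC:", auc(recall, precision))
```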

Figure 2: Model performance: (a)  Receiver Operating Characteristic (ROC) curve. The area under the ROC curve is 0.96 and the classification threshold we used is marked, where true positive rate (TPR) is 91.1% and false positive rate (FPR) is 10%; (b) Precision-Recall curve. The area under the curve is 0.96 and the classification threshold we used is marked where the model achieved precision of 90% and recall of 91%.
Figure 3: t-distributed Stochastic Neighbor Embedding (t-SNE): The arrangement of the points, as a result of transforming the features of each image, shows two main distinct clusters, indicating strong association of the features with the ground truth.

We additionally visualize the distinction made by the model using t-distributed Stochastic Neighbor Embedding (t-SNE) [19], a nonlinear method that reduces high-dimensional feature vectors to two dimensions, as seen in Fig. 3. This makes it possible to visualize the data points and reveal similarities and dissimilarities between them. We used one of the last layers of the network, which essentially provides an embedding of the images into a vector space. These vector embeddings of the images are given as input to the t-SNE, which transforms each vector into a point in a two-dimensional space. Each data point is then colored according to its ground truth. The arrangement of points, representing features of the images colored by their true labels, shows two distinct clusters, revealing a similarity between most of the images sharing the same label.
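A sketch of this visualization with scikit-learn's t-SNE, using random placeholders for the embeddings and labels:

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Hypothetical placeholders: penultimate-layer embeddings and their ground-truth labels.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(350, 2048))
labels = rng.integers(0, 2, size=350)

# Reduce the high-dimensional embeddings to 2-D for visualization.
coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="coolwarm", s=8)
plt.title("t-SNE of CXR embeddings, colored by ground-truth label")
plt.show()
```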

Finally, we applied K-Nearest Neighbors (KNN) to the image embeddings in order to retrieve images similar to one another. For each image we retrieve the 4 images with the closest embeddings; averaging over these images' predictions achieves 87% accuracy and 83.2% sensitivity, meaning that the nearest images typically share the same label.
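A possible implementation of this retrieval step with scikit-learn's NearestNeighbors is sketched below; the embeddings, predicted probabilities and labels are random placeholders, and averaging the neighbors' probabilities stands in for the paper's averaging of their predictions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical stand-ins for quantities computed earlier: embedding vectors,
# model probabilities for COVID-19, and ground-truth labels for the same images.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(350, 2048))
probs = rng.uniform(size=350)
labels = rng.integers(0, 2, size=350)

# Index the embeddings and retrieve the 4 closest images for each query
# (5 neighbors, because the closest match of an indexed image is itself).
nn_index = NearestNeighbors(n_neighbors=5).fit(embeddings)
_, indices = nn_index.kneighbors(embeddings)
neighbor_idx = indices[:, 1:]

# Average the neighbors' predicted probabilities and compare with the ground truth.
knn_pred = (probs[neighbor_idx].mean(axis=1) >= 0.5).astype(int)
print("agreement with ground truth:", (knn_pred == labels).mean())
```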

3 Severity Grading using Ultrasound

Figure 4: Lung Ultrasound: (a) A LUS frame captured using a convex probe. The pleural line is indicated in green, and the red arrow points at consolidations. Sonographic artifacts appearing as bright vertical lines ("B-lines") are indicated by the yellow arrows. The B-lines are not axis-aligned and appear to emanate from a single focal point. (b) The same frame rectified using its polar coordinates. The B-lines now appear axis-aligned, while the pleural line is distorted.

It has recently been shown that lung ultrasound (LUS) can be used to assess the severity of COVID-19 patients [17, 10, 16]. Soldati et al. [17] devised a 4-scale grading system for POC LUS scans. The grades are based on observing both anatomical features (e.g., integrity of the pleural line and presence of consolidations) and sonographic artifacts (known as "A-lines" and "B-lines"). Fig. 4a shows a typical LUS frame with prominent anatomical features and sonographic artifacts annotated. These features can be observed regardless of the probe used (linear or convex). However, the orientation of the sonographic artifacts changes between probe types: while they appear "axis-aligned" when using a linear probe, with a convex probe the artifacts appear "tilted", as if emitted from the focal point of the probe. This difference has little effect on a human observer, but can be confusing for an automatic grading system.

Roy et al. [13] proposed a sophisticated neural network to automatically predict the proposed 4-scale clinical score from LUS frames, for frames captured by both linear and convex probes. They treat all frames in the same manner and rely on a Spatial Transformer Network [8] to focus on relevant parts of the frame and overcome the difference between "convex" and "linear" frames. They collected a dataset of 58,924 LUS frames, of which 78% were captured using a convex probe, and used this data to train and evaluate their network.

In this work we took a more explicit approach to dealing with the discrepancy between frames captured with linear and convex probes and suggest a simpler network with improved performance. We noted that frames captured with convex probes are formed to depict the anatomy as accurately as possible. However, forming the frames this way tilts the sonographic artifacts, making them appear diagonal rather than axis-aligned. Therefore, one can rectify the frame using its underlying polar coordinates, making the sonographic artifacts appear axis-aligned, as in frames captured using a linear probe. This process is exemplified in Fig. 4b. Once the artifacts are axis-aligned, it is easier for a standard convolutional architecture to handle this type of information.
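A sketch of such a rectification using OpenCV's polar warp is given below; the apex position and the output handling depend on the scanner geometry and are assumptions made only for illustration.

```python
import cv2
import numpy as np

def rectify_convex_frame(frame, apex_xy):
    """Resample a convex-probe LUS frame onto its polar (angle, radius) grid so that
    B-lines, which radiate from the probe's virtual apex, become axis-aligned.
    apex_xy is the assumed (x, y) pixel position of that apex; it depends on the
    scanner geometry and often lies above the top edge of the image (y < 0)."""
    h, w = frame.shape[:2]
    corners = np.array([[0, 0], [w, 0], [0, h], [w, h]], dtype=np.float32)
    max_radius = float(np.max(np.linalg.norm(corners - np.array(apex_xy, dtype=np.float32), axis=1)))

    # Rows of the polar image correspond to angles around the apex, columns to depth.
    polar = cv2.warpPolar(frame, (w, h), center=tuple(map(float, apex_xy)),
                          maxRadius=max_radius,
                          flags=cv2.WARP_POLAR_LINEAR + cv2.INTER_LINEAR)
    # Rotate so that depth runs top-to-bottom, as in a linear-probe frame.
    return cv2.rotate(polar, cv2.ROTATE_90_CLOCKWISE)

# Example on a dummy frame, assuming the fan's apex sits 200 px above the top-centre pixel:
frame = np.random.randint(0, 255, (512, 512), dtype=np.uint8)
rectified = rectify_convex_frame(frame, apex_xy=(256.0, -200.0))
```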

We then trained an ensemble of two ResNet-18 [6] networks: one receives the original frames as input and the other processes the rectified frames (for frames captured with a linear probe we treated the rectification as the identity). We used the same dataset and train/test split as in [13].
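One plausible way to realize such a two-branch ensemble in PyTorch is sketched below; whether the branches are trained jointly or separately, how their outputs are combined, and how grayscale frames are expanded to three channels are our own assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

class DualBranchGrader(nn.Module):
    """Two ResNet-18 branches: one sees the original LUS frame, the other the
    polar-rectified frame; their predictions are averaged. A sketch of the idea,
    not the exact architecture from the paper."""

    def __init__(self, num_grades=4):
        super().__init__()
        self.orig_branch = models.resnet18(pretrained=True)
        self.rect_branch = models.resnet18(pretrained=True)
        self.orig_branch.fc = nn.Linear(self.orig_branch.fc.in_features, num_grades)
        self.rect_branch.fc = nn.Linear(self.rect_branch.fc.in_features, num_grades)

    def forward(self, frame, rectified_frame):
        # Inputs are 3-channel tensors (grayscale frames replicated across channels).
        # For linear-probe frames, rectified_frame is simply the original frame.
        logits_orig = self.orig_branch(frame)
        logits_rect = self.rect_branch(rectified_frame)
        return (logits_orig + logits_rect) / 2

# Example forward pass on dummy frames:
model = DualBranchGrader()
frames = torch.randn(2, 3, 224, 224)
rectified = torch.randn(2, 3, 224, 224)
scores = model(frames, rectified)   # shape: (2, 4) severity-grade logits
```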

Model | Settings 1 | Settings 2 (Drop Transition Frames)
      | All Frames | =1 | =3 | =5 | =7
ResNet-18 [13] | 62.2 | 63.9 | 65.5 | 66.9 | 67.8
Full [13] | 65.1 | 66.7 | 68.3 | 69.5 | 70.3
Ours | 68.7 | 70.0 | 72.1 | 73.9 | 75.3
Table 2: Grading severity of LUS frames: comparing F1 scores (%) to [13] using the same dataset and evaluation settings (for definitions see [13]). Our ResNet-18 based ensemble surpasses even the complicated STN-based model of [13].

Table 2 compares the F1 scores of our proposed system to those of [13]. The results show that explicitly treating the problematic alignment of the sonographic artifacts allows us to achieve better F1 scores with a simpler architecture compared to [13]. This demonstrates the potential of LUS as a method for grading the severity of patients.

4 Conclusion

In this paper we showed how a combination of signal processing and machine learning tools can enhance imaging in COVID-19 patients. We first demonstrated how proper pre-processing and lung segmentation, combined with transfer learning, can lead to a simple deep network for COVID-19 detection based on portable X-ray images, reaching a detection rate of over 90%. We then showed how proper rectification of ultrasound images can result in an efficient deep network for grading severity from LUS data. We believe this work can pave the way to a wider use of available POC modalities in treating COVID-19 patients.

References

  • [1] I. Arevalo-Rodriguez, D. Buitrago-Garcia, D. Simancas-Racines, et al. (2020) FALSE-negative results of initial rt-pcr assays for covid-19: a systematic review. medRxiv. External Links: Document, Link, https://www.medrxiv.org/content/early/2020/04/21/2020.04.16.20066787.full.pdf Cited by: §1.
  • [2] A. I. Khan, J. L. Shah, and M. M. Bhat (2020) CoroNet: a deep neural network for detection and diagnosis of covid-19 from chest x-ray images. Computer Methods and Programs in Biomedicine 196, pp. 105581. External Links: Document, Link Cited by: §1.
  • [3] L. L. L. D. L. X. et al. (2020) Modes of contact and risk of transmission in covid-19 among close contacts. medRxiv. External Links: Document, Link, https://www.medrxiv.org/content/early/2020/03/26/2020.03.24.20042606.full.pdf Cited by: §1.
  • [4] Y. Fang, H. Zhang, J. Xie, M. Lin, L. Ying, P. Pang, and W. Ji (2020) Sensitivity of chest ct for covid-19: comparison to rt-pcr. Radiology 296 (2), pp. E115–E117. Note: PMID: 32073353 External Links: Document, Link, https://doi.org/10.1148/radiol.2020200432 Cited by: §1.
  • [5] K. He, X. Zhang, S. Ren, and J. Sun (2015) Deep residual learning for image recognition. External Links: 1512.03385 Cited by: §2.1.
  • [6] K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778. Cited by: §3.
  • [7] J. Irvin, P. Rajpurkar, M. Ko, et al. (2019) CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. External Links: 1901.07031 Cited by: §2.1, §2.
  • [8] M. Jaderberg, K. Simonyan, A. Zisserman, et al. (2015) Spatial transformer networks. In Advances in Neural Information Processing Systems, pp. 2017–2025. Cited by: §3.
  • [9] W. Kong and P. P. Agarwal (2020) Chest imaging appearance of covid-19 infection. Radiology: Cardiothoracic Imaging 2 (1), pp. e200028. External Links: Document, Link, https://doi.org/10.1148/ryct.2020200028 Cited by: §1, §1.
  • [10] Y. Lichter, Y. Topilsky, P. Taieb, A. Banai, A. Hochstadt, I. Merdler, A. G. Oz, J. Vine, O. Goren, B. Cohen, et al. (2020) Lung ultrasound predicts clinical course and outcomes in covid-19 patients. Intensive care medicine, pp. 1–11. Cited by: §3.
  • [11] World Health Organization (2020) (Website) External Links: Link Cited by: §1.
  • [12] O. Ronneberger, P. Fischer, and T. Brox (2015) U-net: convolutional networks for biomedical image segmentation. External Links: 1505.04597 Cited by: §2.1.
  • [13] S. Roy, W. Menapace, S. Oei, B. Luijten, E. Fini, C. Saltori, I. Huijben, N. Chennakeshava, F. Mento, A. Sentelli, et al. (2020) Deep learning for classification and localization of COVID-19 markers in point-of-care lung ultrasound. IEEE Transactions on Medical Imaging. Cited by: §1, Table 2, §3, §3, §3.
  • [14] F. M. Shah, S. K. S. Joy, F. Ahmed, M. Humaira, A. S. Ami, S. Paul, and A. R. K. Jim (2020-07) A comprehensive survey of covid-19 detection using medical images. engrXiv. External Links: Link, Document Cited by: §1.
  • [15] K. Simonyan and A. Zisserman (2015) Very deep convolutional networks for large-scale image recognition. External Links: 1409.1556 Cited by: §2.1.
  • [16] A. Smargiassi, G. Soldati, E. Torri, F. Mento, D. Milardi, P. D. Giacomo, G. De Matteis, M. L. Burzo, A. R. Larici, M. Pompili, et al. (2020) Lung ultrasound for covid-19 patchy pneumonia: extended or limited evaluations?. Journal of Ultrasound in Medicine. Cited by: §3.
  • [17] G. Soldati, A. Smargiassi, R. Inchingolo, D. Buonsenso, T. Perrone, D. F. Briganti, S. Perlini, E. Torri, A. Mariani, E. E. Mossolani, et al. (2020) Proposal for international standardization of the use of lung ultrasound for patients with covid-19: a simple, quantitative, reproducible method. Journal of Ultrasound in Medicine. Cited by: §3.
  • [18] B. Udugama, P. Kadhiresan, H. N. Kozlowski, A. Malekjahani, et al. (2020-04-28) Diagnosing covid-19: the disease and tools for detection. ACS Nano 14 (4), pp. 3822–3835. External Links: ISSN 1936-0851, Document, Link Cited by: §1.
  • [19] L. van der Maaten and G. Hinton (2008) Visualizing data using t-sne. Journal of Machine Learning Research 9 (86), pp. 2579–2605. External Links: Link Cited by: §2.2.
  • [20] R. J. G. van Sloun, R. Cohen, and Y. C. Eldar (2020) Deep learning in ultrasound imaging. Proceedings of the IEEE 108 (1), pp. 11–29. Cited by: §2.
  • [21] C.B.F. Vogels, A.F. Brito, A.L. Wyllie, and others. (2020-10-01) Analytical sensitivity and efficiency comparisons of sars-cov-2 rt–qpcr primer–probe sets. Nature Microbiology 5 (10), pp. 1299–1305. External Links: ISSN 2058-5276, Document, Link Cited by: §1.
  • [22] L. Wang and A. Wong (2020) COVID-net: a tailored deep convolutional neural network design for detection of covid-19 cases from chest x-ray images. External Links: 2003.09871 Cited by: §1.
  • [23] R. Zhang, X. Tie, Z. Qi, et al. (2020) Diagnosis of covid-19 pneumonia using chest radiography: value of artificial intelligence. Radiology, pp. 202944. Note: PMID: 32969761 External Links: Document, Link, https://doi.org/10.1148/radiol.2020202944 Cited by: §1.