AI-MIA: COVID-19 Detection and Severity Analysis through Medical Imaging

This paper presents the baseline approach for the 2nd COVID-19 Competition, organized in the framework of the AIMIA Workshop at the European Conference on Computer Vision (ECCV 2022). It presents the COV19-CT-DB database, which is annotated for COVID-19 detection and consists of about 7,700 3-D CT scans. The part of the database containing COVID-19 cases is further annotated in terms of four COVID-19 severity conditions. We have split the database, and the latter part of it, into training, validation and test datasets. The former two datasets are used for training and validation of machine learning models, while the latter will be used for evaluation of the developed models. The baseline approach is a deep learning method, based on a CNN-RNN network; we report its performance on the COV19-CT-DB database.



1 Introduction

Medical imaging seeks to reveal internal structures of the human body, as well as to diagnose and treat diseases. Radiological quantitative image analysis constitutes a significant prognostic factor for the diagnosis of diseases, such as lung cancer and COVID-19. For example, the CT scan is an essential test for the diagnosis, staging and follow-up of patients suffering from lung cancer in its various histologies. Lung cancer constitutes a top public health issue, with the tendency being towards patient-centered control strategies. Especially in the chest imaging field, there has been a large effort to develop and apply computer-aided diagnosis systems for the detection of lung lesion patterns.

Deep learning (DL) approaches applied to the quantitative analysis of multimodal medical data offer very promising results. Current targets include building robust and interpretable AI and DL models for analyzing CT scans for the discrimination of immune-inflammatory patterns and their differential diagnosis with SARS-CoV-2-related or other findings. Moreover, detection of the new coronavirus (COVID-19) is of extreme significance for our society. Currently, the gold standard reverse transcription polymerase chain reaction (RT-PCR) test is identified as the primary diagnostic tool for COVID-19; however, it is time consuming, with a certain number of false positive results. The chest CT has been identified as an important diagnostic tool to complement the RT-PCR test, improving diagnosis accuracy and helping people take the right action so as to quickly get appropriate treatment. That is why there is on-going research worldwide to create automated systems that can predict COVID-19 from chest CT scans and x-rays.

Recent approaches use Deep Neural Networks for medical imaging problems, including segmentation and automatic detection of the pneumonia region in lungs. The public datasets of RSNA pneumonia detection and LUNA have been used as training data, as well as other COVID-19 datasets. Then a classification network is generally used to discriminate between different viral pneumonia and COVID-19 cases, based on analysis of chest x-rays or CT scans.

Various projects are implemented in this framework in Europe, US, China and worldwide. However, a rather fragmented approach is followed: projects are based on specific datasets, selected from smaller, or larger numbers of hospitals, with no proof of good performance generalization over different datasets and clinical environments. A DL methodology which we have developed using latent information extraction from trained DNNs has been very successful for diagnosis of Covid-19 through unified chest CT scans and x-rays analysis provided by different medical centers and medical environments.

At the time of CT scan recording, several slices are captured from each person suspected of COVID-19. The large volume of CT scan images calls for a high workload on physicians and radiologists to diagnose COVID-19. Taking this into account and also the rapid increase in number of new and suspected COVID-19 cases, it is evident that there is a need for using machine and deep learning for detecting COVID-19 in CT scans.

Such approaches require data to be trained on. Therefore, a few databases consisting of CT scans have been developed. However, new datasets with large numbers of 3-D CT scans are needed, so that researchers can train and develop COVID-19 diagnosis systems and reliably evaluate their performance.

The current paper presents a baseline approach for the Competition part of the Workshop "AI-enabled Medical Image Analysis – Digital Pathology & Radiology/COVID19 (AIMIA)", held in conjunction with the European Conference on Computer Vision (ECCV) 2022 in Tel-Aviv, Israel, October 23-24, 2022.

The AIMIA Workshop context is as follows. Deep Learning has made rapid advances in the performance of medical image analysis, challenging physicians in their traditional fields. In the pathology and radiology fields, in particular, automated procedures can help to reduce the workload of pathologists and radiologists and increase the accuracy and precision of medical image assessment, which is often considered subjective and not optimally reproducible. In addition, Deep Learning and Computer Vision demonstrate the ability/potential to extract more clinically relevant information from medical images than what is possible in current routine clinical practice by human assessors. Nevertheless, considerable development and validation work lies ahead before AI-based methods can be fully ready for integration into medical departments.

The workshop on AI-enabled medical image analysis (AIMIA) at ECCV 2022 aims to foster discussion and presentation of ideas to tackle the challenges of whole slide image and CT/MRI/X-ray analysis/processing and identify research opportunities in the context of Digital Pathology and Radiology/COVID19.

The COV19D Competition is based on a new large database of chest CT scan series that is manually annotated for COVID-19/non-COVID-19, as well as for COVID-19 severity diagnosis. The training and validation partitions, along with their annotations, are provided to the participating teams to develop AI/ML/DL models for COVID-19/non-COVID-19 and COVID-19 severity prediction. The performance of the approaches will be evaluated on the test sets.

The COV19-CT-DB is a new large database with about 7,700 3-D CT scans, annotated for COVID-19 infection.

The rest of the paper is organized as follows. Section 2 presents the former work on which the presented baseline has been based. Section 3 presents the database created and used in the Competition. The ML approach and the pre-processing steps are described in Section 4. The obtained results are presented in Section 5. Conclusions and future work are described in Section 6.

2 Related Work

In [2] a CNN plus RNN network was used, taking as input CT scan images and discriminating between COVID-19 and non-COVID-19 cases.

In [8], the authors employed a variety of 3-D ResNet models for detecting COVID-19 and distinguishing it from other common pneumonia (CP) and normal cases, using volumetric 3-D CT scans.

In [15], a weakly supervised deep learning framework was suggested using 3-D CT volumes for COVID-19 classification and lesion localization. A pre-trained UNet was utilized for segmenting the lung region of each CT scan slice; the latter was fed into a 3-D DNN that provided the classification outputs.

The presented approach is based on a CNN-RNN architecture that performs 3-D CT scan analysis. The method follows our previous work [4, 6, 5, 12] on developing deep neural architectures for predicting COVID-19, as well as neurodegenerative and other [7, 5, 13, 16] diseases and abnormal situations [11, 1, 14, 9].

These architectures have been applied for: a) prediction of Parkinson’s, based on datasets of MRI and DaTScans, either created in collaboration with the Georgios Gennimatas Hospital (GGH) in Athens [5], or provided by the PPMI study sponsored by M. J. Fox for Parkinson’s Research [7], b) prediction of COVID-19, based on CT chest scans, scan series, or x-rays, either collected from the public domain, or aggregated in collaboration with the Hellenic Ministry of Health and The Greek National Infrastructures for Research and Technology [4].

The current Competition follows and extends the MIA-COV19D Workshop that we successfully organized virtually at ICCV 2021, on October 11, 2021. 35 teams participated in the COV19D Competition; 18 teams submitted their results and 12 teams scored higher than the baseline. 21 presentations were made at the Workshop, including 3 invited talks [3].

In general, some workshops on medical image analysis have taken place in the framework of CVPR/ICCV/ECCV Conferences in the last three years. In particular:

In ICCV 2019 the 'Visual Recognition for Medical Images' Workshop took place, including papers on the use of deep learning in medical images for disease diagnosis. Similarly, in CVPR 2019 and 2020 the Workshop series 'Medical Computer Vision' and 'Computer Vision for Microscopy Image Analysis' took place, dealing with general medical analysis topics. In addition, the 'Bioimage Computing' Workshop took place in ECCV 2020. Moreover, the issue of explainable DL and AI has been a topic of other Workshops (e.g., CVPR 2019; ICPR 2020 & 2022; ECAI TAILOR 2020).

3 The COV19-CT-DB Database

The COVID19-CT-Database (COV19-CT-DB) consists of chest CT scans that are annotated for the existence of COVID-19.

COV19-CT-DB includes 3-D chest CT scans annotated for the existence of COVID-19. Data collection was conducted in the period from September 1, 2020 to November 30, 2021. It consists of 1,650 COVID-19 and 6,100 non-COVID-19 chest CT scan series, which correspond to a high number of patients (more than 1,150) and subjects (more than 2,600). In total, 724,273 slices correspond to the CT scans of the COVID-19 category and 1,775,727 slices correspond to the non-COVID-19 category.

Annotation of each CT slice has been performed by 4 very experienced (each with over 20 years of experience) medical experts: two radiologists and two pulmonologists. Labels provided by the 4 experts showed a high degree of agreement (around 98%). Each of the 3-D scans includes a different number of slices, ranging from 50 to 700. This variation in the number of slices is due to the context of CT scanning. The context is defined in terms of various factors, such as the accuracy asked by the doctor who ordered the scan, the characteristics of the CT scanner that is used, or specific subject features, e.g., weight and age.

Figure 1: Four CT scan slices, two from a non-COVID-19 CT scan, on the left and two from a COVID-19 scan, on the right, including bilateral ground glass regions in lower lung lobes.

Figure 1 shows four CT scan slices, two from a non-COVID-19 CT scan, on the left and two from a COVID-19 scan, on the right. Bilateral ground glass regions are seen especially in lower lung lobes in the COVID-19 slices.

The database has been split in training, validation and testing sets.

The training set contains, in total, 1992 3-D CT scans. The validation set consists of 494 3-D CT scans. The number of COVID-19 and of Non-COVID-19 cases in each set are shown in Table 1. There are different numbers of CT slices per CT scan, ranging from 50 to 700.

Set Training Validation
COVID-19 882 215
Non-COVID-19 1110 289
Table 1: Data samples in each Set

A further split of the COVID-19 cases has been implemented, based on the severity of COVID-19, as annotated by four medical experts, in the range from 1 to 4, with 4 denoting the critical status. Table 2 describes each of these categories [10].

Category Severity Description
1 Mild Few or no Ground glass opacities. Pulmonary parenchymal involvement or absence
2 Moderate Ground glass opacities. Pulmonary parenchymal involvement %
3 Severe Ground glass opacities. Pulmonary parenchymal involvement %
4 Critical Ground glass opacities. Pulmonary parenchymal involvement %
Table 2: Description of the Severity Categories

In particular, parts of the COV19-CT-DB COVID-19 training and validation datasets have been accordingly split for severity classification in training, validation and testing sets.

The training set contains, in total, 258 3-D CT scans. The validation set consists of 61 3-D CT scans. The numbers of scans in each severity COVID-19 class in these sets are shown in Table 3.

Severity Class Training Validation
1 85 22
2 62 10
3 85 22
4 26 5
Table 3: Data samples in each Severity Class

4 The Deep Learning Approach

4.1 3-D Analysis and COVID-19 Diagnosis

The input sequence is a 3-D signal, consisting of a series of chest CT slices, i.e., 2-D images, the number of which is varying, depending on the context of CT scanning. The context is defined in terms of various requirements, such as the accuracy asked by the doctor who ordered the scan, the characteristics of the CT scanner that is used, or the specific subject’s features, e.g., weight and age.

The baseline approach is a CNN-RNN architecture, as shown in Figure 2. At first, all input CT scans are padded to a common length t (i.e., to consist of t slices). The whole (unsegmented) sequence of 2-D slices of a CT scan is fed as input to the CNN part. Thus the CNN part performs local, per-2-D-slice analysis, extracting features mainly from the lung regions. The target is to make the diagnosis using the whole 3-D CT scan series, similarly to the annotations provided by the medical experts. The RNN part provides this decision, analyzing the CNN features of the whole 3-D CT scan, sequentially moving from slice to slice. The outputs of the RNN part feed the output layer, with 2 units, which uses a softmax activation function and provides the final COVID-19 diagnosis.

In this way, the CNN-RNN network outputs a probability for each CT scan slice; the CNN-RNN is followed by a voting scheme that makes the final decision. The voting scheme can be either majority voting or at-least-one voting (i.e., if at least one slice in the scan is predicted as COVID-19, the whole CT scan is diagnosed as COVID-19; if all slices in the scan are predicted as non-COVID-19, the whole CT scan is diagnosed as non-COVID-19).
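As an illustration, the two voting schemes can be sketched in plain Python as follows; the function name and the 0.5 decision threshold are illustrative assumptions, not details taken from the paper:

```python
from typing import List

def aggregate_scan_prediction(slice_probs: List[float],
                              threshold: float = 0.5,
                              scheme: str = "at_least_one") -> int:
    """Aggregate per-slice COVID-19 probabilities into one scan-level label.

    Returns 1 (COVID-19) or 0 (non-COVID-19). The threshold is a
    placeholder value for illustration.
    """
    slice_labels = [1 if p >= threshold else 0 for p in slice_probs]
    if scheme == "at_least_one":
        # Scan is positive if any single slice is predicted positive.
        return 1 if any(slice_labels) else 0
    if scheme == "majority":
        # Scan is positive if more than half the slices are positive.
        return 1 if 2 * sum(slice_labels) > len(slice_labels) else 0
    raise ValueError(f"unknown voting scheme: {scheme}")
```

Note that at-least-one voting trades specificity for sensitivity: a single confidently positive slice suffices, which matches the clinical preference for not missing a positive case.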


Figure 2: The CNN-RNN model

4.2 Pre-Processing & Implementation Details

At first, CT images were extracted from the DICOM files. Then, the voxel intensity values were clipped to a fixed window of Hounsfield units (HU) and normalized to a fixed range.
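A minimal sketch of this preprocessing step, assuming a placeholder HU window (the paper's exact window/level values are not reproduced here and should be substituted):

```python
import numpy as np

def preprocess_slice(hu_slice: np.ndarray,
                     hu_min: float = -1000.0,
                     hu_max: float = 400.0) -> np.ndarray:
    """Clip voxel intensities to a HU window and rescale to [0, 1].

    hu_min/hu_max are placeholder window bounds, not the values
    used in the paper.
    """
    clipped = np.clip(hu_slice.astype(np.float32), hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)
```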

Regarding the implementation of the proposed methodology: i) we utilized ResNet50 as the CNN model, stacking on top of it a global average pooling layer, a batch normalization layer and dropout (with keep probability 0.8); ii) we used a single one-directional GRU layer consisting of 128 units as the RNN model. The model was fed with 3-D CT scans composed of the CT slices; each slice was resized from its original size to a fixed input resolution. As the voting scheme, we used the at-least-one scheme.

The batch size was equal to 5 (i.e., at each iteration our model processed 5 CT scans) and the input length t was 700 (the maximum number of slices found across all CT scans). Softmax cross entropy was the loss function used for training the model. The Adam optimizer was used. Training was performed on a Tesla V100 32GB GPU.
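The padding of variable-length scans to the fixed input length t = 700 can be sketched as below; zero-padding along the slice axis is an assumption for illustration, as the paper does not state the padding value:

```python
import numpy as np

def pad_scan(scan: np.ndarray, target_len: int = 700) -> np.ndarray:
    """Pad a CT scan of shape (num_slices, H, W) along the slice axis
    so every scan in a batch has `target_len` slices (700 was the
    maximum slice count across all CT scans in COV19-CT-DB).

    Zero-padding is an illustrative choice, not stated in the paper.
    """
    num_slices = scan.shape[0]
    if num_slices > target_len:
        raise ValueError("scan has more slices than the target length")
    pad_width = [(0, target_len - num_slices)] + [(0, 0)] * (scan.ndim - 1)
    return np.pad(scan, pad_width, mode="constant", constant_values=0)
```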

The architecture trained for Challenge 1 was then refined to tackle the severity classification problem of Challenge 2.

5 Experimental Results

This section describes a set of experiments evaluating the performance of the baseline approach.

Table 4 shows the performance of the network over the validation sets in both Challenges, after training with the training datasets, in terms of macro F1 score. The macro F1 score is defined as the unweighted average of the class-wise/label-wise F1-scores, i.e., the unweighted average of the COVID-19 class F1 score and of the non-COVID-19 class F1 score.
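For reference, the macro F1 score defined above can be computed as in the following pure-Python sketch, which is equivalent to sklearn.metrics.f1_score with average="macro":

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores (macro F1)."""
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        f1_scores.append(f1)
    # Unweighted average: every class counts equally, regardless of size.
    return sum(f1_scores) / len(f1_scores)
```

Because every class contributes equally regardless of its size, macro F1 penalizes a model that performs well only on the majority class, which matters here given the COVID-19/non-COVID-19 class imbalance.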

The main downside of the model is that there exists only one label for the whole CT scan and there are no labels for each CT scan slice. Thus, the presented model analyzes the whole CT scan, based on information extracted from each slice.

Challenge ’macro’ F1 Score
COVID-19 Detection 0.77
COVID-19 Severity Detection 0.63
Table 4: Performance of the baseline ResNet50-GRU network for the two Challenges

6 Conclusions and Future Work

In this paper we have introduced a new large database of chest 3-D CT scans, obtained in various contexts and consisting of different numbers of CT slices. We have also developed a deep neural network, based on a CNN-RNN architecture and used it for COVID-19 diagnosis on this database.

The scope of the paper is to present a baseline scheme regarding the performance that can be achieved based on analysis of the COV19-CT-DB database.

The model presented in the paper will be the basis for future expansion towards more transparent modelling of COVID-19 diagnosis.


  • [1] F. Caliva, F. S. De Ribeiro, A. Mylonakis, C. Demazière, P. Vinai, G. Leontidis, and S. Kollias (2018) A deep learning approach to anomaly detection in nuclear reactors. In 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. Cited by: §2.
  • [2] A. Khadidos, A. O. Khadidos, S. Kannan, Y. Natarajan, S. N. Mohanty, and G. Tsaramirsis (2020) Analysis of covid-19 infections on a ct image using deepsense model. Frontiers in Public Health 8. Cited by: §2.
  • [3] D. Kollias, A. Arsenos, L. Soukissian, and S. Kollias (2021) Mia-cov19d: covid-19 detection through 3-d chest ct image analysis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 537–544. Cited by: §2.
  • [4] D. Kollias, N. Bouas, Y. Vlaxos, V. Brillakis, M. Seferis, I. Kollia, L. Sukissian, J. Wingate, and S. Kollias (2020) Deep transparent prediction through latent representation analysis. arXiv preprint arXiv:2009.07044. Cited by: §2, §2.
  • [5] D. Kollias, A. Tagaris, A. Stafylopatis, S. Kollias, and G. Tagaris (2018) Deep neural architectures for prediction in healthcare. Complex & Intelligent Systems 4 (2), pp. 119–131. Cited by: §2, §2.
  • [6] D. Kollias, Y. Vlaxos, M. Seferis, I. Kollia, L. Sukissian, J. Wingate, and S. D. Kollias (2020) Transparent adaptation in deep medical image diagnosis.. In TAILOR, pp. 251–267. Cited by: §2.
  • [7] S. Kollias, L. Bidaut, J. Wingate, I. Kollia, et al. (2020) A unified deep learning approach for prediction of parkinson’s disease. IET Image Processing. Cited by: §2, §2.
  • [8] Y. Li and L. Xia (2020) Coronavirus disease 2019 (covid-19): role of chest ct in diagnosis and management. American Journal of Roentgenology 214 (6), pp. 1280–1286. Cited by: §2.
  • [9] L. Malatesta, A. Raouzaiou, K. Karpouzis, and S. Kollias (2009) Towards modeling embodied conversational agent character profiles using appraisal theory predictions in expression synthesis. Applied Intelligence 30 (1), pp. 58–64. Cited by: §2.
  • [10] S. P. Morozov, A. E. Andreychenko, I. A. Blokhin, P. B. Gelezhe, A. P. Gonchar, A. E. Nikolaev, N. A. Pavlov, V. Y. Chernina, and V. A. Gombolevskiy (2020) MosMedData: data set of 1110 chest ct scans performed during the covid-19 epidemic. Digital Diagnostics 1 (1), pp. 49–59. Cited by: §3.
  • [11] P. Mylonas, E. Spyrou, Y. Avrithis, and S. Kollias (2009) Using visual context and region semantics for high-level concept detection. IEEE Transactions on Multimedia 11 (2), pp. 229–243. Cited by: §2.
  • [12] A. Tagaris, D. Kollias, A. Stafylopatis, G. Tagaris, and S. Kollias (2018) Machine learning for neurodegenerative disorder diagnosis—survey of practices and launch of benchmark dataset. International Journal on Artificial Intelligence Tools 27 (03), pp. 1850011. Cited by: §2.
  • [13] P. Tzouveli, A. Schmidt, M. Schneider, A. Symvonis, and S. Kollias (2008) Adaptive reading assistance for the inclusion of students with dyslexia: the agent-dysl approach. In 2008 Eighth IEEE International Conference on Advanced Learning Technologies, pp. 167–171. Cited by: §2.
  • [14] M. Wallace, N. Tsapatsoulis, and S. Kollias (2005) Intelligent initialization of resource allocating rbf networks. Neural Networks 18 (2), pp. 117–122. Cited by: §2.
  • [15] X. Wang, X. Deng, Q. Fu, Q. Zhou, J. Feng, H. Ma, W. Liu, and C. Zheng (2020) A weakly-supervised framework for covid-19 classification and lesion localization from chest ct. IEEE transactions on medical imaging 39 (8), pp. 2615–2625. Cited by: §2.
  • [16] M. Yu, D. Kollias, J. Wingate, N. Siriwardena, and S. Kollias (2021) Machine learning for predictive modelling of ambulance calls. Electronics 10 (4), pp. 482. Cited by: §2.