On the Use of Time Series Kernel and Dimensionality Reduction to Identify the Acquisition of Antimicrobial Multidrug Resistance in the Intensive Care Unit

The acquisition of Antimicrobial Multidrug Resistance (AMR) by patients admitted to Intensive Care Units (ICU) is a major global concern. This study analyses data in the form of multivariate time series (MTS) from 3476 patients recorded at the ICU of the University Hospital of Fuenlabrada (Madrid) from 2004 to 2020. Of these patients, 18% acquired AMR during their stay in the ICU. The goal of this paper is the early prediction of the development of AMR. Towards that end, we leverage the time-series cluster kernel (TCK) to learn similarities between MTS. To evaluate the effectiveness of TCK as a kernel, we apply several dimensionality reduction techniques for visualization and classification tasks. The experimental results show that TCK makes it possible to identify a group of patients who acquire AMR during the first 48 hours of their ICU stay, and that it also provides good classification capabilities.




1. Introduction

Longitudinal Electronic Health Records (EHR), which collect patient health information over time, have proven to be one of the most relevant data sources for tasks such as early prediction of anastomosis leakage (Soguero-Ruiz et al., 2016), characterization of patient health status (Chushig-Muzo et al., 2020), and prediction of type 2 diabetes (Garcia-Carretero et al., 2020). However, analyzing temporal EHR-based data raises many challenges. Such multivariate time series (MTS) can be characterized by missing values, different lengths and possibly dependent variables (Mikalsen et al., 2018). To deal with these issues, several methods have been proposed to exploit temporal clinical data (Mikalsen et al., 2018). Among them, we explore the potential of the time-series cluster kernel (TCK), which computes the pairwise similarities between time series with missing data. The resulting kernel matrix can be used for many different purposes, such as dimensionality reduction (DR) or classification.

Learning compressed representations of MTS makes data analysis easier in the presence of redundant data, as well as when the number of variables and time steps is high. Traditional DR algorithms are designed for vectorial data. In this paper, however, we leverage the potential of TCK to map high-dimensional MTS into a much lower-dimensional space. Towards that end, representation learning, i.e., transforming the input space into a new feature representation space by linear and non-linear approaches, is considered. The learned compressed representations of MTS can be used to visually identify patients with specific clinical characteristics. In addition, this new space can serve as the input space for linear and non-linear classifiers.

The described methodology is applied in this work to identify the acquisition of antimicrobial multidrug resistance (AMR) in the Intensive Care Unit (ICU). This is a growing problem that jeopardizes seven decades of medical progress since antibiotics were first used in clinical practice (World Health Organization, 2014). The misuse and overuse of antibiotics have resulted in bacteria becoming resistant to one or more antibiotics, no longer responding to drugs to which they were initially sensitive. The lack of antimicrobial effectiveness increases the risk when treating infections, making it extremely difficult or even impossible to find a suitable treatment to cure them (World Health Organization, 2014). This situation is even more critical in the ICU due to the delicate health condition of the patients in this unit.

As a consequence, AMR is causing a significant social and economic burden worldwide (World Health Organization, 2015). Antibiotic resistance is estimated to be responsible for nearly 300 million premature deaths and considerable economic losses by 2050, according to a recent study (Munita and Arias, 2016). The overall economic cost of AMR was predicted to be approximately 1.5 billion euros, with hospital expenditures accounting for 900 million (Prestinaci et al., 2015). This paper, therefore, proposes an approach to identify the development and spread of AMR in the ICU earlier. Towards that end, MTS associated with the use of antibiotics in this unit are analyzed.

The structure of this paper is as follows. Section 2 provides an overview of the data and the methods used in the paper. Section 3 presents the experimental results, whereas discussion and conclusions are included in Section 4.

2. Data and methods

2.1. Data

The dataset used in the current study consists of MTS extracted from the EHR of the ICU at the University Hospital of Fuenlabrada from 2004 until 2020. Of the 3476 patients admitted to the ICU during that period, 628 developed AMR. Each patient is characterized by MTS related to the families of antibiotics taken by that patient during his/her ICU stay, as well as the antibiotics taken by the patients who shared the clinical unit during the same stay. Moreover, we count the number of patients who shared the clinical unit and the number of AMR patients at a given time (24-hour slot). We also record whether the patient was assisted with mechanical ventilation. The families of antibiotics considered in this work are: Aminoglycosides (AMG), Antifungals (ATF), Carbapenems (CAR), 1st generation Cephalosporins (CF1), 2nd generation Cephalosporins (CF2), 3rd generation Cephalosporins (CF3), 4th generation Cephalosporins (CF4), unclassified antibiotics (Others), Glycylcyclines (GCC), Glycopeptides (GLI), Lincosamides (LIN), Lipopeptides (LIP), Macrolides (MAC), Monobactams (MON), Nitroimidazoles (NTI), Miscellaneous (OTR), Oxazolidinones (OXA), Broad-Spectrum Penicillins (PAP), Penicillins (PEN), Polypeptides (POL), Quinolones (QUI), Sulfonamides (SUL) and Tetracyclines (TTC).

On average, the first multidrug resistance is detected within seven days of the patient's admission to the ICU, which is similar to the average length of stay of non-AMR patients. Based on these results, we set the length of the longest MTS to seven days. For patients whose ICU stay was shorter than seven days, the missing time observations are filled with zeros. If the length of stay is longer than seven days, we keep the seven days closest to the detection of the first AMR. For non-AMR patients, and based on clinical knowledge, the patient's admission to the ICU is the reference (see Figure 1 for details).

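Hence, the dataset is represented as $\mathcal{D} = \{(\mathbf{X}^{(n)}, y^{(n)})\}_{n=1}^{N}$, where the $n$-th patient is represented by the temporal matrix $\mathbf{X}^{(n)}$ and the output $y^{(n)} \in \{0, 1\}$, which identifies whether a patient acquired ("1") or not ("0") an AMR during his/her stay in the ICU. The matrix $\mathbf{X}^{(n)}$ models, for the $n$-th patient, $V$ time series, each of them defined by a number of observations $T$, as follows: $\mathbf{X}^{(n)} = [\mathbf{x}_1^{(n)}, \ldots, \mathbf{x}_V^{(n)}]$, with the column vector $\mathbf{x}_v^{(n)}$ having length $T$ for all $v = 1, \ldots, V$ and $n = 1, \ldots, N$.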

Figure 1. Schema of the 7-days time window considered for AMR patients (upper panel) and non-AMR patients (bottom panel).
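The windowing rule of Section 2.1 and Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `seven_day_window`, the 0-based `first_amr_day` encoding, the inclusion of the detection day in the window, and the placement of the zero padding at the end of the window are all hypothetical choices that the text does not specify.

```python
from typing import List, Optional

WINDOW_DAYS = 7  # window length stated in the text


def seven_day_window(
    daily_rows: List[List[float]],
    first_amr_day: Optional[int] = None,
    window: int = WINDOW_DAYS,
) -> List[List[float]]:
    """Return exactly `window` daily rows for one patient.

    daily_rows: one row of feature values per 24-hour slot, ordered from admission.
    first_amr_day: 0-based index of the day the first AMR was detected,
                   or None for a non-AMR patient (hypothetical encoding).
    """
    if not daily_rows:
        raise ValueError("daily_rows must contain at least one day")
    n_features = len(daily_rows[0])

    if first_amr_day is None:
        # Non-AMR patient: admission is the reference, keep the first days.
        selected = daily_rows[:window]
    else:
        # AMR patient: keep the days closest to (and including) detection.
        end = min(first_amr_day + 1, len(daily_rows))
        selected = daily_rows[max(0, end - window):end]

    # Zero-fill stays shorter than the window. Appending the zeros at the end
    # is an assumption; the source does not say where the padding goes.
    padding = [[0.0] * n_features for _ in range(window - len(selected))]
    return [list(row) for row in selected] + padding
```

For a two-day stay the function returns the two observed days followed by five all-zero rows, and for a ten-day stay with AMR detected on the last day it returns days 3 to 9.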

2.2. Methods

MTS have been analyzed in a variety of application domains, such as finance and healthcare (Chatfield, 2003). From a methodological point of view, several studies have followed a classical approach that deals with MTS by extracting handcrafted features from raw data (Soguero-Ruiz et al., 2015, 2016). Others have focused on computing pairwise similarities between time series, for instance with dynamic time warping (DTW) (Wang et al., 2013; Mikalsen et al., 2021). However, many of these similarity measures are not suitable for kernel methods because they are not positive semi-definite.

This study employs the time series cluster kernel (TCK). The method is based on ensemble learning and on probabilistic models known as Gaussian mixture models (GMMs). GMMs are fitted to randomly chosen subsets of MTS, features and time segments, using different numbers of mixture components and random initial conditions. To estimate the model parameters (time-dependent means, covariance matrix, and the variance of the attribute) in the presence of missing data, the likelihoods are multiplied by informative priors for the parameters, and maximum a posteriori expectation-maximization is used (Marlin et al., 2012). After convergence, the posterior probabilities of each GMM are obtained. Following the ensemble strategy, the inner products between pairs of posterior probabilities provided by each partition are summed to build the kernel matrix. Therefore, given a GMM ensemble, we compute the TCK by exploiting the fact that a sum of kernels is itself a kernel. Because the TCK procedure generates partitions at different resolutions, it captures both local and global relationships in the underlying data; it is also robust to outliers and parameter-free. More details on the TCK are provided in (Mikalsen et al., 2018). We evaluate the potential of the learned representations (kernel) for dimensionality reduction, visualization and classification tasks.
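The final step of the TCK, summing inner products of posterior probabilities over the ensemble, can be illustrated as follows. This is a minimal sketch and not the reference implementation: it assumes the posteriors have already been obtained (the GMM fitting with missing data and MAP expectation-maximization is omitted entirely), and the name `ensemble_kernel` and the nested-list layout are hypothetical.

```python
from typing import List, Sequence

Posterior = Sequence[float]  # cluster-membership probabilities for one MTS


def ensemble_kernel(posteriors: Sequence[Sequence[Posterior]]) -> List[List[float]]:
    """Sum, over ensemble members, the inner products of posterior vectors.

    posteriors[q][n] holds the posterior of MTS n under ensemble member q.
    Members may use different numbers of mixture components.
    """
    n_series = len(posteriors[0])
    kernel = [[0.0] * n_series for _ in range(n_series)]
    for member in posteriors:
        for i in range(n_series):
            for j in range(i, n_series):
                value = sum(a * b for a, b in zip(member[i], member[j]))
                kernel[i][j] += value
                if i != j:
                    kernel[j][i] += value  # keep the matrix symmetric
    return kernel
```

Each entry of the result equals the inner product of the two series' posterior vectors concatenated across all ensemble members, so the matrix is a Gram matrix and therefore positive semi-definite, which is the property that makes it usable in kernel methods.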

Regarding dimensionality reduction, we focus on linear and non-linear methods to represent the embedding of the EHR MTS in the TCK space. Principal Component Analysis (PCA) is considered to explore linear transformations (Anowar et al., 2021), whereas kernel PCA (KPCA) and autoencoders (AE) are considered as non-linear dimensionality reduction approaches. Note that AE are used to learn data representations in deep architectures; see (Vincent et al., 2008) for more details. To visualize data in two dimensions, we apply t-Distributed Stochastic Neighbor Embedding (t-SNE) (Van der Maaten and Hinton, 2008).

Regarding classification, the learned representation is used as the input to different classifiers. In this work, we apply linear (Logistic Regression, LR) and non-linear classifiers (k-nearest neighbour, k-NN; decision trees; random forest; support vector machines, SVM; nu-SVM; and multilayer perceptron, MLP). Due to space limitations, we do not describe the classifiers here; the interested reader is referred to (Bishop, 2006).

All experiments were performed in Python, and the AE was modelled with Keras.

3. Results

This section evaluates the effectiveness of the TCK by applying different dimensionality reduction techniques: PCA, KPCA and AE. The resulting learned representations are then used for 2D visualization with t-SNE and for classification. A summary of the process followed in this work is shown in Figure 2. The original dataset is separated into two subsets, training and test, which account for 70% and 30% of the patients, respectively (Caruana et al., 2015). The training set is balanced with respect to the minority class (AMR patients), and the remaining non-AMR patients are assigned to the test set. We apply the TCK to this dataset using the freely available Matlab code (Mikalsen, 2017), setting the maximum number of mixture components of each GMM to 40 and the number of randomizations for each number of components to 30.

Figure 2. Schematic description of the methodology followed in this paper.
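One possible reading of the split-and-balance step described at the start of this section is sketched below. It assumes that balancing means undersampling the non-AMR patients in the training portion down to the number of AMR patients and moving the surplus to the test set; the function name `split_and_balance`, the fixed seed and the index-based interface are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_and_balance(
    labels: Sequence[int], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) with a class-balanced training set.

    Assumes label 1 (AMR) is the minority class, as in the text.
    """
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    cut = int(round(train_fraction * len(indices)))
    train, test = indices[:cut], indices[cut:]

    positives = [i for i in train if labels[i] == 1]  # AMR (minority class)
    negatives = [i for i in train if labels[i] == 0]  # non-AMR

    # Keep as many non-AMR patients as there are AMR patients in training;
    # the surplus non-AMR patients are moved to the test set.
    kept = negatives[:len(positives)]
    surplus = negatives[len(positives):]
    return sorted(positives + kept), sorted(test + surplus)
```

Under this reading the test set ends up considerably larger than 30% of the patients and is dominated by non-AMR cases, which is worth keeping in mind when interpreting accuracy and specificity.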

Dimensionality reduction and visualization.

To visually evaluate the potential of TCK as a kernel when dealing with MTS, we benchmark PCA with TCK, KPCA with TCK and AE with TCK. For PCA, we retain 99% of the information (variance) of the original space, which results in 16 principal components. For KPCA, we consider a polynomial kernel, 50 principal components and a gamma value of 0.002083. These hyperparameters are chosen to minimize the mean squared error between the original and the compressed space on the validation set. The same criterion is applied to the AE, for which a LeakyReLU activation function is used in all layers except the last, where a sigmoid is used. The mean squared error is used as the loss function. The AE is trained for 1000 epochs with the Adam optimizer and exponential learning rate decay. Several shallow and deep AE were evaluated, showing that an architecture with 712 hidden neurons and 250 neurons in the compressed space is the best one for identifying AMR patients. Keras in TensorFlow was used for this implementation.
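The 99% criterion used for PCA amounts to picking the smallest number of leading components whose cumulative share of the variance reaches the target. A minimal sketch follows, assuming the eigenvalues of the decomposition are already available; how they are computed from the TCK matrix is outside this illustration, and `components_for_variance` is a hypothetical helper name.

```python
from typing import Sequence


def components_for_variance(eigenvalues: Sequence[float], target: float = 0.99) -> int:
    """Smallest number of leading components whose variance share reaches `target`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return count
    return len(ordered)
```

Applied to the eigenvalue spectrum of this study, such a rule would yield the 16 components reported in the text; the spectrum itself is not given in the paper, so that number cannot be reproduced here.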

Figure 3. Visualization of AMR and non-AMR patients using t-SNE representations after reduction of the TCK space with PCA (first column), KPCA (second column) and AE (third column).
Figure 4. Percentage of AMR patients within and out of the cluster (a); and percentage of non-AMR patients (b).

The new representation spaces are used as input to t-SNE in order to visualize patients in two dimensions. These visualizations are shown in Figure 3 (a) for PCA, in Figure 3 (b) for KPCA, and in Figure 3 (c) for AE. The learned representations provide knowledge for AMR patient identification. A distinguishable cluster (colored mainly in green) is observed in Figures 3 (a), (b) and (c), comprising 157, 157 and 161 patients, respectively. The patients in the clusters of Figures 3 (a) and (b) are a subset of the patients in the cluster of Figure 3 (c). It is important to highlight that the majority of the patients in this cluster (139) had AMR detected in the first 48 hours of their ICU stay; 61.87% of them required mechanical ventilation, compared to 76.68% of the AMR patients outside the cluster. Similarly, AMR patients outside the cluster require more antibiotic treatments (see Figure 4 (a) for details). This suggests that their health status may be more critical. Furthermore, non-AMR patients generally take fewer antibiotics than AMR patients, except for some families of antibiotics such as PEN and CF3 (see Figure 4 (a) and (b) for details).

Classification results.

The representations learned by PCA, KPCA and AE are used as the input to different linear and non-linear classifiers, specifically LR, k-NN, decision tree, random forest, SVM, nu-SVM and MLP. The metrics used to measure the performance of the classifiers are accuracy, specificity, sensitivity, and area under the curve (AUC). To tune the hyperparameters, 5-fold cross-validation was performed on the training set. Results on the test set are shown in Table 1. In general, AE is the most adequate DR approach. It can also be observed that linear classifiers perform well in terms of sensitivity and AUC, whereas non-linear classifiers, such as nu-SVM, provide better accuracy and specificity.

DR method   Classifier      Accuracy   Specificity   Sensitivity   AUC
PCA         LR              53.49      51.73         76.8          64.26
PCA         k-NN            64.91      64.31         72.93         68.62
PCA         Tree            57.47      58.64         41.99         50.32
PCA         Random forest   65.92      67.47         45.3          56.39
PCA         nu-SVM          57.71      56.77         70.17         63.47
PCA         SVM             58.48      57.43         72.38         64.91
PCA         MLP             51.74      49.65         79.55         64.60
KPCA        LR              51.7       50.06         73.48         61.77
KPCA        k-NN            62.97      62.52         69.06         65.79
KPCA        Tree            58.46      60.43         32.23         46.33
KPCA        Random forest   70.95      71.68         61.33         66.5
KPCA        nu-SVM          57.32      58.31         44.2          51.25
KPCA        SVM             54.8       53.44         72.93         63.18
KPCA        MLP             55.61      54.47         70.71         62.59
AE          LR              82.26      82.8          75.14         78.97
AE          k-NN            79.86      81.8          54.14         67.97
AE          Tree            75.95      77.01         61.88         69.44
AE          Random forest   77.85      79.13         60.77         69.95
AE          nu-SVM          82.92      83.59         74.03         78.81
AE          SVM             75.99      76.51         69.06         72.79
AE          MLP             80.94      81.92         67.95         74.93
Table 1. Classification results for each DR method and classifier in terms of accuracy, specificity, sensitivity and AUC.
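The four metrics in Table 1 can be computed from a binary confusion matrix as sketched below. Two points are assumptions rather than statements from the paper: the helper name `binary_metrics` is hypothetical, and the AUC is computed here from hard 0/1 predictions, in which case it reduces to the mean of sensitivity and specificity. The values in Table 1 are consistent with that reduction (for AE with LR, (82.8 + 75.14) / 2 = 78.97), but the text does not state how the authors computed the AUC.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, specificity, sensitivity and hard-label AUC, all in percent."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # recall on AMR patients
    specificity = tn / (tn + fp)  # recall on non-AMR patients
    accuracy = (tp + tn) / len(y_true)
    # With 0/1 predictions the ROC curve has a single operating point,
    # so the trapezoidal AUC equals (sensitivity + specificity) / 2.
    auc = (sensitivity + specificity) / 2
    return {
        "accuracy": 100 * accuracy,
        "specificity": 100 * specificity,
        "sensitivity": 100 * sensitivity,
        "auc": 100 * auc,
    }
```

Because the test set is dominated by non-AMR patients, accuracy tracks specificity closely, which is why sensitivity and AUC are the more informative columns for the AMR class.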

4. Discussion and conclusion

This work presents a promising approach for early identification of AMR patients in the ICU based on MTS recorded in EHR. The following are some of our contributions:

  • A time series cluster kernel is used to obtain similarity measures for MTS with missing data.

  • Compressed representations that preserve pairwise relationships allow clinicians to visually identify the acquisition of AMR in the ICU.

  • Classification results obtained with the learned representation as input space suggest that the proposed methodology can be used for earlier detection of AMR.

Learning compressed representations of the TCK space with linear and non-linear approaches provides promising visualizations for identifying a specific group of AMR patients, namely those who acquired AMR during the first 48 hours of their stay in the ICU. This makes it possible to anticipate the culture results and to take isolation measures that prevent further spreading to other patients in the unit. The experimental results also show good classification capabilities and shed some light on the antibiotic treatments used for AMR patients.

The potential of deep autoencoders in this study opens the way for exploring more complex AE such as denoising or variational autoencoders (Doersch, 2016). Future work also includes the possibility of considering this problem as a multiclass classification problem rather than a binary one, aiming to distinguish between AMR detected in the first 48 hours, AMR detected later and non-AMR patients.

5. Acknowledgments

This work has been partly supported by the Spanish Research projects PID2019-107768RA-I00 (AAVis-BMR), PID2019-106623RB-C41 (Beyond) and DTS17/00158, and by Project Ref. F661 (Mapping-UCI) of the Community of Madrid and the Rey Juan Carlos University.


  • F. Anowar, S. Sadaoui, and B. Selim (2021) Conceptual and empirical comparison of dimensionality reduction algorithms (PCA, KPCA, LDA, MDS, SVD, LLE, ISOMAP, LE, ICA, t-SNE). Computer Science Review 40, pp. 100378.
  • C. M. Bishop (2006) Pattern recognition and machine learning. Springer.
  • R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad (2015) Intelligible models for healthcare: predicting pneumonia risk and hospital 30-day readmission. pp. 1721–1730.
  • C. Chatfield (2003) The analysis of time series: an introduction. Chapman and Hall/CRC.
  • D. Chushig-Muzo, C. Soguero-Ruiz, A. Engelbrecht, P. D. M. Bohoyo, and I. Mora-Jiménez (2020) Data-driven visual characterization of patient health-status using electronic health records and self-organizing maps. IEEE Access 8, pp. 137019–137031.
  • C. Doersch (2016) Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.
  • R. Garcia-Carretero, L. Vigil-Medina, I. Mora-Jimenez, C. Soguero-Ruiz, O. Barquero-Perez, and J. Ramos-Lopez (2020) Use of a k-nearest neighbors model to predict the development of type 2 diabetes within 2 years in an obese, hypertensive population. Medical & biological engineering & computing 58 (5), pp. 991–1002.
  • B. M. Marlin, D. C. Kale, R. G. Khemani, and R. C. Wetzel (2012) Unsupervised pattern discovery in electronic health care data using probabilistic clustering models. In Proceedings of the 2nd ACM SIGHIT international health informatics symposium, pp. 389–398.
  • K. Ø. Mikalsen (2017) Time series cluster kernel (TCK) Matlab implementation.
  • K. Ø. Mikalsen, F. M. Bianchi, C. Soguero-Ruiz, and R. Jenssen (2018) Time series cluster kernel for learning similarities between multivariate time series with missing data. Pattern Recognition 76, pp. 569–581.
  • K. Ø. Mikalsen, C. Soguero-Ruiz, F. M. Bianchi, A. Revhaug, and R. Jenssen (2021) Time series cluster kernels to exploit informative missingness and incomplete label information. Pattern Recognition 115, pp. 107896.
  • J. M. Munita and C. A. Arias (2016) Mechanisms of antibiotic resistance. Virulence mechanisms of bacterial pathogens, pp. 481–511.
  • World Health Organization (2014) Antimicrobial resistance: global report on surveillance: 2014 summary. Technical report, World Health Organization.
  • F. Prestinaci, P. Pezzotti, and A. Pantosti (2015) Antimicrobial resistance: a global multifaceted phenomenon. Pathogens and global health 109 (7), pp. 309–318.
  • C. Soguero-Ruiz, W. M. Fei, R. Jenssen, K. M. Augestad, J. R. Álvarez, I. M. Jiménez, R. Lindsetmo, and S. O. Skrøvseth (2015) Data-driven temporal prediction of surgical site infection. In AMIA Annual Symposium Proceedings, Vol. 2015, pp. 1164.
  • C. Soguero-Ruiz, K. Hindberg, I. Mora-Jiménez, J. L. Rojo-Álvarez, S. O. Skrøvseth, F. Godtliebsen, K. Mortensen, A. Revhaug, R. Lindsetmo, K. M. Augestad, et al. (2016) Predicting colorectal surgical complications using heterogeneous clinical data and kernel methods. Journal of biomedical informatics 61, pp. 87–96.
  • L. Van der Maaten and G. Hinton (2008) Visualizing data using t-SNE. Journal of machine learning research 9 (11).
  • P. Vincent, H. Larochelle, Y. Bengio, and P. Manzagol (2008) Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine learning, pp. 1096–1103.
  • X. Wang, A. Mueen, H. Ding, G. Trajcevski, P. Scheuermann, and E. Keogh (2013) Experimental comparison of representation methods and distance measures for time series data. Data Mining and Knowledge Discovery 26 (2), pp. 275–309.
  • World Health Organization (2015) Global Action Plan on Antimicrobial Resistance. pp. 28.