A key objective of the monitoring devices in ICUs is to continuously monitor patients’ heart function in order to detect any life-threatening arrhythmia. An ECG signal measures the electrical activity of the heart and is an important tool for diagnosing different heart conditions, such as cardiac arrhythmia, ventricular hypertrophy, and myocardial infarction. Despite many well-developed methods for detecting abnormal rhythms, ICUs still suffer from high false alarm rates for several reasons, including the complex signal patterns of some arrhythmias, motion artifacts, noise, sensor detachment, and loose threshold settings of the monitoring devices. Such high false alarm rates negatively impact both patients and medical staff: they desensitize the staff to true alarms and increase response times, giving rise to an issue commonly referred to as alarm fatigue. In addition, the frequent audio disturbance generated by false alarms can lead to sleep deprivation and depressed immune systems for patients [2, 3].
Several research and clinical studies have aimed at reducing the number of false alarms, ranging from expert systems that define rules based on expert experience to machine learning methods [4, 5, 6, 7]. One common drawback of such methods is their unstable performance across datasets: they can perform well on some datasets and poorly on others, which suggests that their success depends heavily on the characteristics of the data set used for training. While machine learning methods generalize and perform better than expert systems, typical machine learning methods have several limitations when applied to time-series data such as longitudinal ECG signals. A typical challenge is the risk of overfitting and inaccurate performance when dealing with a large number of noisy, redundant, and correlated features extracted from time-series signals or their transform domains (e.g., wavelet). Feature selection and reduction methods attempt to reduce the large set of input features to the most salient ones, but they may discard important features and thereby miss some meaningful patterns in the signals [8, 9, 10].
One distinction between biomedical signals and other time-series data is their periodic or semi-periodic behavior, a characteristic that typical machine learning approaches are not designed to capture. In this paper, we take advantage of this property and develop an unsupervised feature learning method that creates a set of low-dimensional features for each subject. These features capture important characteristics of the underlying patterns in the high-dimensional time-series input by paying extra attention to the abnormal portions of the signal. In other words, the method processes each patient’s signal segment by segment and constructs a set of higher-level features that better capture the underlying patterns related to different alarm types.
Representation learning (feature learning) is one of the most recent trends in machine learning; it can improve the performance of machine learning methods by automatically discovering features from raw data. Feature learning methods can be categorized into supervised and unsupervised learning. In supervised feature learning, the labels of the input data are used both to learn the feature representation and to train a classifier. In unsupervised feature learning, the input data without its labels is used to learn the feature representation [12, 13, 14]. Unsupervised feature learning has been applied to time-series data [15, 16]; however, current approaches have not yet provided the expected performance in ECG analysis, mostly due to the wide variety of patterns related to different cardiac events. On the other hand, the limited availability of annotated long ECG recordings restricts the application of supervised feature learning methods.
To the best of our knowledge, the previously reported unsupervised feature learning methods for ECG analysis are based on deep learning. In one such study, clustering is used to find the most representative beats for training recurrent neural networks. In another, the authors used aligned heartbeats as the input of a deep neural network for both feature learning and classification.
In this paper, an unsupervised feature learning method is proposed that takes segmented, unannotated ECG signals as input, clusters these segments to learn the relationships among them, and then uses the resulting clusters to learn high-level features. For the detection of false alarms specifically, the algorithm first segments the ECG signal of each patient based on its heartbeats. These segments contain important features that represent the characteristics of the major ECG components such as the P wave, QRS complex, and T wave. Then, the extracted segments of each patient’s ECG signal are grouped into several clusters, where each cluster represents a group of segments that are most similar to each other. In effect, we build a bidirectional (top-down and bottom-up) feature learning scheme using multi-resolution features, which helps to focus on patterns from the highest level to the lowest and vice versa. In the bottom-up direction, it provides a framework to capture non-linear relations, differences, and similarities among the local features at the higher resolutions (i.e., segments and clusters).
It is worth noting that this method uses only a one-lead ECG signal collected from the patients and achieves results comparable to methods that use all signals available in the 2015 PhysioNet challenge data set (i.e., ECG lead II, ECG lead V, arterial blood pressure (ABP), or photoplethysmogram (PPG)). This suggests that the method could be used for cardiac event detection in wearable heart monitoring systems, since the majority of these remote monitoring systems, such as Holter monitors, collect only a one-lead ECG signal.
II Database Description
In this paper, a publicly available database for ‘Reducing False Arrhythmia Alarms in the ICU’ provided by the PhysioNet/Computing in Cardiology Challenge 2015 [20, 21] is used. Each recording includes one or two ECG leads and one or more pulsatile waveforms, such as ABP and PLETH. This dataset focuses on five life-threatening arrhythmia alarms: Asystole (ASY), Extreme Bradycardia (EBR), Extreme Tachycardia (ETC), Ventricular Tachycardia (VTA), and Ventricular Flutter/Fibrillation (VFB). The training set contains 750 recordings, and the test set is not available to the public; therefore, we used the training set for both training and testing by means of k-fold cross-validation. This study focuses only on the ECG lead II signal, as this lead is the only recording available for the majority of patients. The ECG signals are 5 minutes long and have been resampled at 250 Hz. Of the 750 subjects, 29 were removed because their recordings did not include an ECG lead II signal. Table I shows the numbers of true and false alarms for each arrhythmia type considered in this study.
Table I: Numbers of true and false alarms for each arrhythmia type.
| Alarm | # of Patients | False Alarm | True Alarm |
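The evaluation protocol described above (k-fold cross-validation over the recordings that include ECG lead II) can be sketched as follows. This is a minimal illustration only: the number of folds is not stated in the text, so `k=5` is an assumption, and `k_fold_indices` is a hypothetical helper name.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

n_recordings = 750 - 29               # recordings that include ECG lead II
samples_per_recording = 5 * 60 * 250  # 5 minutes sampled at 250 Hz

# k = 5 is an assumed value; the text only says "k-fold".
folds = list(k_fold_indices(n_recordings, k=5))
```

Each recording appears in exactly one test fold, so every alarm is classified once by a model that did not see it during training.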
The majority of signal processing methods based on time-series analysis or transform-based techniques handle the entire collected signal in the same way, whereas empirically the abnormal portions of the signal often contain more information about the event of interest. A common feature of many biomedical signals is a basic periodic pattern that can help distinguish between normal and abnormal conditions of a physiological system. For instance, in ECG signal analysis, the periodic normal ECG signal reveals basic information about heart function such as the heartbeat, while the abnormal ECG can help in the diagnosis of several arrhythmias. The multi-resolution feature extraction method presented below leverages this difference between the normal and abnormal portions by extracting additional higher-resolution features from the abnormal sections, thereby enhancing the accuracy of the arrhythmia detection process.
Figure 1 provides a schematic diagram of the overall method. First, the ECG signals are segmented by performing peak detection and decomposing each signal into P-T segments. Then, a set of preliminary features is extracted from each segment as described in Table II. After that, the detected segments of each patient are grouped into several clusters using the k-means algorithm, and a set of multi-resolution features is learned from each cluster in an unsupervised way. The k-means algorithm is used for clustering because it provides fast and robust performance in most applications. Finally, the newly constructed features are normalized and used to classify the alarms as true or false.
III-A Beat Detection and Signal Segmentation
There are several techniques to segment ECG signals and extract the beat-to-beat intervals. The Pan-Tompkins algorithm is one of the most common and computationally inexpensive methods for detecting the QRS complex in ECG signals. Most of the proposed methods for ECG segmentation are highly sensitive to disturbances such as noise, interference, and motion artifacts, and the objective of the proposed method is to decrease the number of false alarms caused by these abnormalities in the signal. Let us consider three general cases of possible segments in the ECG signal: normal cycles, arrhythmia segments, and abnormal segments (i.e., noisy segments, irregular segments due to inaccurate segmentation, or segments affected by sudden changes in QRS amplitudes). The proposed feature learning technique can significantly reduce the impact of inaccurate segmentation, noise, and abrupt changes on alarm detection, since these distortions make segments behave differently from the known arrhythmia patterns. During the clustering phase, abnormal segments caused by the aforementioned sources of distortion are more likely to be placed in separate clusters. The classification technique can then recognize such abnormal segments by learning the relations between the labels and the extracted clusters. We should note that the proposed feature learning algorithm is generic and can be applied on top of any segmentation technique, such as the Hamilton-Tompkins and Hilbert transform-based algorithms. After the R-peaks are detected, the other waves in the signal (P, Q, S, and T) are detected using adaptive search windows around each peak. Each segment is then identified from the onset of its P wave to the offset of its T wave. Figure 2 illustrates an ECG signal annotated with R-peaks and P-, QRS-, and T-waves.
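To make the segmentation step above concrete, the following is a deliberately simplified sketch. It is not the Pan-Tompkins algorithm: it detects R-peaks as thresholded local maxima with a refractory period, and it cuts fixed windows around each peak as a stand-in for the adaptive P-onset to T-offset search. The function names, the threshold ratio, and the window lengths are all illustrative assumptions.

```python
def detect_r_peaks(signal, fs, threshold_ratio=0.6, refractory_s=0.2):
    """Return indices of local maxima above a threshold (simplified stand-in)."""
    threshold = threshold_ratio * max(signal)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if signal[i] >= threshold and is_local_max:
            if peaks and i - peaks[-1] < refractory:
                # Two candidates too close together: keep the taller one.
                if signal[i] > signal[peaks[-1]]:
                    peaks[-1] = i
            else:
                peaks.append(i)
    return peaks

def segment_around_peaks(signal, peaks, fs, before_s=0.25, after_s=0.45):
    """Cut a fixed window around each R-peak (assumed stand-in for P-T segments)."""
    before, after = int(before_s * fs), int(after_s * fs)
    segments = []
    for p in peaks:
        start, end = p - before, p + after
        if start >= 0 and end <= len(signal):  # skip windows that run off the signal
            segments.append(signal[start:end])
    return segments

# Synthetic 3-second signal at 250 Hz with three crude beats.
fs = 250
ecg = [0.0] * (3 * fs)
for r in (100, 300, 500):
    ecg[r - 1], ecg[r], ecg[r + 1] = 0.5, 1.0, 0.5  # crude QRS spike
    ecg[r + 60] = 0.2                               # crude T-wave bump
r_peaks = detect_r_peaks(ecg, fs)
segments = segment_around_peaks(ecg, r_peaks, fs)
```

Because the later clustering step tends to isolate badly cut or noisy segments in their own clusters, a simple detector like this one can serve for experimentation, although a more robust detector would be preferable in practice.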
III-B Preliminary Feature Extraction
As mentioned earlier, the entire 5-minute recordings of ECG lead II are used during feature extraction. Preliminary features are extracted in the time domain from each segment of the given ECG signal. These preliminary features fall into three broad groups: 1) the x- and y-coordinates of each wave present in the signal (i.e., the P, Q, R, S, and T waves); 2) the intervals between the beginnings of different waves, for example, RR intervals; and 3) the amplitude differences between waves, for example, RR amplitude intervals. The full set of extracted ECG-related features (84 in total) is provided in Table II. The x values in each segment are defined as relative distances to the location of a reference wave, and the y value refers to the amplitude of the ECG signal. In more detail, the primary features are described in four categories. In the first category, the x- and y-coordinates of the P, Q, R, S, and T waves are extracted for each segment, for example, the x- and y-coordinates of the P wave. In the second category, the average of the x values of the P and Q waves and the average of the x values of the S and T waves are computed, together with the corresponding values of the ECG signal at these two positions. In the third category, intervals (differences between the x values of two waves) are measured. For example, a one-step PP interval is the difference between the x value of the P wave of the current segment and the x value of the P wave of the next segment, and a two-step RR interval is the distance between the x value of the R-peak of the current segment and the x value of the R-peak of the next-but-one segment. In the fourth category, differences between the y values of two peaks are measured. For example, a one-step RR amplitude difference is the difference between the y value of the R-peak of the current segment and the y value of the R-peak of the next segment, and a two-step RR amplitude difference is the difference between the y value of the R-peak of the current segment and that of the R-peak of the next-but-one segment.
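The interval and amplitude-difference features described above can be illustrated with a short sketch. It assumes each beat is stored as a mapping from wave name to an (x, y) pair, where x is the sample index and y the amplitude; this data layout, the function name `interval_features`, and the sign convention (later segment minus current segment) are assumptions made for illustration.

```python
def interval_features(beats, wave, lag):
    """Differences in x (interval) and y (amplitude) between the same wave
    in segment i and segment i + lag (lag=1: next, lag=2: next-but-one)."""
    features = []
    for i in range(len(beats) - lag):
        x0, y0 = beats[i][wave]
        x1, y1 = beats[i + lag][wave]
        features.append((x1 - x0, y1 - y0))
    return features

# Three hypothetical beats with P and R wave coordinates (sample index, amplitude).
beats = [
    {"P": (60, 0.25), "R": (100, 1.0)},
    {"P": (262, 0.25), "R": (300, 0.5)},
    {"P": (480, 0.5), "R": (520, 1.5)},
]

rr_next = interval_features(beats, "R", lag=1)           # [(200, -0.5), (220, 1.0)]
rr_next_but_one = interval_features(beats, "R", lag=2)   # [(420, 0.5)]
pp_next = interval_features(beats, "P", lag=1)           # [(202, 0.0), (218, 0.25)]
```

The last one or two segments of a recording have no successor, so they contribute no interval features in this sketch.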
III-C Multi-resolution Unsupervised Feature Learning
The proposed method learns the patterns of different heart arrhythmias through several high-level features, which are learned through unsupervised feature learning. The key contribution of the proposed method is to direct attention, through clustering, to the abnormal portions of the longitudinal ECG signals, whether these are arrhythmic or distorted. When clustering is applied to the segments, the center of each cluster is representative of the segments that belong to that cluster. The proposed method then assigns different weights to the formed clusters based on the number of segments within each cluster. These weights represent the probability of occurrence of a cluster. The learned high-level features are built from the positions of the cluster centers as well as the number of segments in each cluster.
The extracted segments of the signals are the units that provide a micro-resolution view of the signal, which makes it possible to separate the encountered abnormalities into different clusters. Let us denote cluster $i$ by $C_i$, a set of segments $C_i = \{s_{i1}, s_{i2}, \dots, s_{in_i}\}$, with $n_i = |C_i|$ and $i = 1, \dots, k$. The clusters of segments provide a macro (higher-level) representation of the signals, in which the center of each cluster represents the behavior of a set of segments that are most similar to each other. Therefore, the center of a cluster (its centroid) is used as the representative of its segments. The number of features of a cluster center is the same as the number of features of each segment (i.e., 84). The center of cluster $C_i$ is denoted by $c_i$, in which $c_i = (c_{i1}, \dots, c_{i84})$. The numbers of elements of the clusters, the centers of the clusters, and the distances among these centers are defined as high-level features that capture the relations among different signals. For instance, when a cluster of a patient has relatively few elements and its center is far from the other clusters, and clusters similar to it are frequently seen in other patients’ signals, this cluster likely represents an arrhythmia. These high-level features are therefore used to construct a low-dimensional set from the high-dimensional low-level features of the segments.
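The clustering step described above can be sketched with a small, dependency-free k-means routine. This is an illustrative implementation rather than the one used in the study: the function names are hypothetical, and the choice of the component-wise median as the center under the cityblock distance (and the mean under the squared Euclidean distance) is an assumption about how the two distance options are handled.

```python
import random
from statistics import mean, median

def distance(a, b, metric):
    """Cityblock (L1) or squared Euclidean distance between two feature vectors."""
    if metric == "cityblock":
        return sum(abs(x - y) for x, y in zip(a, b))
    if metric == "sqeuclidean":
        return sum((x - y) ** 2 for x, y in zip(a, b))
    raise ValueError(f"unknown metric: {metric}")

def kmeans(points, k, metric="sqeuclidean", n_iter=100, seed=0):
    """Return (labels, centers) after Lloyd-style iterations."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    # Assumption: median centers for cityblock, mean centers otherwise.
    center_fn = median if metric == "cityblock" else mean
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: distance(p, centers[c], metric))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the previous center if a cluster becomes empty
                centers[c] = [center_fn(col) for col in zip(*members)]
    return labels, centers

# Toy 2-D "segments": three similar beats and two outlying ones.
segments = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]]
labels, centers = kmeans(segments, k=2, metric="cityblock")
```

In the setting of the paper, each point would be an 84-dimensional segment feature vector and the clustering would be run separately for each patient.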
The preliminary features of the centers are normalized between 0 and 1 for each cluster. In addition, the clusters of each patient are sorted in ascending order of their number of members. The high-level features can be grouped into four categories as follows:
1) The patient’s heart rate and alarm type (i.e., Tachycardia, Asystole, Ventricular Flutter/Fibrillation, Ventricular Tachycardia, and Bradycardia).
2) The number of elements of each cluster ($n_i$ for cluster $i$) in ascending order. For example, for the case of two clusters with $n_1 \le n_2$, there are two features in the following order: $n_1$ and $n_2$.
3) The ratio of the sum of the feature values of the center of each cluster to the number of elements in that cluster, following the same order. For example, if there are two clusters, we will have two values in the following order: $\sum_j c_{1j} / n_1$ and $\sum_j c_{2j} / n_2$, assuming that $n_1 \le n_2$, where $c_{ij}$ denotes the $j$-th feature of the center of cluster $i$.
4) The ratio of the sum of the feature values of the center of each cluster to the number of all segments of that patient, $N$, following the same order. For instance, for the case of two clusters, we will have two values in the following order: $\sum_j c_{1j} / N$ and $\sum_j c_{2j} / N$, assuming that $n_1 \le n_2$.
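The last three feature categories listed above can be computed directly from the cluster assignments and the cluster centers. The sketch below is a minimal illustration under stated assumptions: `high_level_features` is a hypothetical name, the centers are assumed to be already normalized, the first category (heart rate and alarm type) is omitted, and empty clusters are assigned a ratio of 0.0.

```python
def high_level_features(labels, centers):
    """Build size, per-cluster ratio, and per-patient ratio features,
    with clusters ordered by ascending size."""
    k = len(centers)
    sizes = [labels.count(c) for c in range(k)]
    order = sorted(range(k), key=lambda c: sizes[c])  # ascending cluster size
    n_total = len(labels)                             # all segments of the patient

    counts = [sizes[c] for c in order]
    # Sum of the center's feature values divided by the cluster size
    # (assumption: 0.0 for an empty cluster to avoid division by zero).
    per_cluster = [sum(centers[c]) / sizes[c] if sizes[c] else 0.0 for c in order]
    # Sum of the center's feature values divided by the total segment count.
    per_patient = [sum(centers[c]) / n_total for c in order]
    return counts + per_cluster + per_patient

# Two clusters: cluster 0 holds three segments, cluster 1 holds one.
labels = [0, 0, 0, 1]
centers = [[0.75, 0.75], [1.0, 0.0]]
features = high_level_features(labels, centers)
# features == [1, 3, 1.0, 0.5, 0.25, 0.375]
```

With $k$ clusters this yields $3k$ values per patient, which is far fewer than the 84 low-level features of every individual segment.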
III-D Model Building
The learned high-level features of each signal are used in two different scenarios as the input for the classification phase. Here, we used two well-known classification methods, Boosted Trees and RUSBoosted Trees, from MATLAB’s Classification Learner app, since they are based on ensemble learning and provide robust results. However, the proposed method is independent of the choice of classification technique, and the extracted high-level features can be fed to any classifier.
In the first scenario, only the unsupervised learned features are used as the input. In the second scenario, the unsupervised learned features are added to features extracted using the discrete wavelet transform, as described in Section IV. We also evaluated the proposed method using two different distance measures. As shown in Section IV, we achieved strong alarm classification performance using only the ECG lead II signal and only a small number of high-level features, which indicates the capability of this method for arrhythmia detection.
IV Experiment Results
The performance of this method is evaluated using the 2015 PhysioNet/Computing in Cardiology challenge dataset [20, 21]. We should note that in this study we used only ECG lead II to extract the features, as this signal was the most commonly available recording in the database. The majority of existing studies on this dataset used all signals available for the patients, including ECG II, ECG V, PPG, and ABP (for many patients only two of these signals are available), and yet our proposed method achieves performance comparable to these works. We would also like to note that the winning entries of the 2015 PhysioNet challenge reported results obtained by training on the public portion of the data set and testing on the private portion; in this work, however, the public training set is used for both training and testing with k-fold cross-validation.
To evaluate the performance of the proposed method, several evaluation metrics, namely accuracy, sensitivity, specificity, and area under the ROC curve (AUC), are reported for the two aforementioned scenarios. The results of this method are compared against two common approaches: time-domain and wavelet-domain analysis of the ECG signal. In the wavelet-based method, the discrete wavelet transform (DWT) is applied to the entire 5-minute ECG lead II recordings, using a 6-level Daubechies 8 (db8) wavelet since its shape matches the shape of the ECG signal well. Because feeding all wavelet coefficients as features into the classification algorithm can result in over-fitting, we reduce the number of features by extracting 20 representative statistical and information-theoretic features from each level of the wavelet vectors, as summarized in Table 2 of the referenced work. In this case, the overall number of wavelet-based features is 20 times the number of wavelet vectors. We also compared the performance of the proposed method against a scenario in which the low-level features per segment are extracted from the last 10 seconds of the ECG recording, as the data set specifies that the cardiac events occur within the last 10 seconds of the signal. Since these 10-second recordings include a different number of segments for each patient, we considered the last seven segments of the signal, as an average number of segments observed in the last 10 seconds, and extracted the 84 features per segment for these segments (a total of 7 × 84 = 588 low-level features). We should note that one can consider a longer interval of the signal in the time domain, but this results in a larger number of low-level features and can lead to over-fitting.
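The idea of summarizing each wavelet vector with a handful of statistics, rather than feeding all coefficients to the classifier, can be illustrated as follows. This sketch makes several substitutions that should be read as assumptions: it uses the Haar wavelet instead of db8 (to stay dependency-free), pads odd-length vectors by repeating the last sample, and computes only four example statistics instead of the 20 features used in the study. The names `haar_dwt` and `summarize` are hypothetical.

```python
import math

def haar_dwt(signal, levels):
    """Multi-level orthonormal Haar DWT (stand-in for db8).
    Returns [approximation, detail_L, ..., detail_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) % 2:  # assumed boundary handling: repeat the last sample
            approx.append(approx[-1])
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return [approx] + details[::-1]

def summarize(coeffs):
    """A few example statistical and information-theoretic features."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    variance = sum((c - mean) ** 2 for c in coeffs) / n
    energy = sum(c * c for c in coeffs)
    entropy = 0.0
    if energy > 0:
        # Shannon entropy of the normalized energy distribution.
        probs = [c * c / energy for c in coeffs]
        entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return {"mean": mean, "variance": variance, "energy": energy, "entropy": entropy}

# Synthetic stand-in signal; a real input would be the 5-minute ECG lead II trace.
ecg = [math.sin(2 * math.pi * i / 50) for i in range(1000)]
features = [summarize(v) for v in haar_dwt(ecg, levels=6)]
```

A 6-level decomposition produces seven coefficient vectors (six detail vectors and one approximation vector), so `features` holds seven small dictionaries regardless of the signal length.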
Tables III and IV present the accuracy, sensitivity, specificity, and AUC of the two classification techniques, Boosted Trees and RUSBoosted Trees, respectively, for five scenarios: i) using the low-level features extracted from the last 10 seconds of the signal; ii) using the statistical and information-theoretic features extracted from a 6-level DWT over the entire 5-minute ECG lead II signal; iii) using only the proposed high-level features when the distance metric of k-means clustering is cityblock; iv) using only the proposed high-level features when the distance metric of k-means clustering is the squared Euclidean distance; and v) using both the proposed high-level learned features and the DWT-based features. The number of clusters in the k-means clustering is set to five. As the results show, Boosted Trees working with only the proposed features and the cityblock distance metric provides better performance in nearly all measures than using only the DWT-based features, than using the high-level features and the DWT-based features together (due to potential redundancy between the two sets), and than using the squared Euclidean distance metric.
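For reference, the four reported metrics can be computed from labels and classifier scores as in the sketch below. It assumes that label 1 denotes a true alarm and 0 a false alarm, that both classes are present, and that a score threshold of 0.5 is used for the hard decision; these conventions and the function names are illustrative assumptions. The AUC is computed with the rank-based (Mann-Whitney) formulation.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = true alarm) and classifier scores.
y_true = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

metrics = confusion_metrics(y_true, y_pred)  # accuracy 4/6, sensitivity 2/3, specificity 2/3
area = auc(y_true, scores)                   # 7/9
```

In false alarm reduction, sensitivity reflects how many true alarms are preserved, while specificity reflects how many false alarms are suppressed, so the two are usually reported together.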
In summary, the proposed method obtains better results than the common wavelet-based and time-domain methods using only high-level features, since the unsupervised learned features have better discriminative power while requiring far fewer computational resources than commonly used methods.
In this paper, an unsupervised feature learning approach is proposed for the analysis of periodic and semi-periodic biomedical signals. The method is based on clustering and learns a small number of high-level features from the clusters formed over time-domain segments of the signals. The learned high-level features can handle the various patterns and variations in ECG signals in an autonomous and scalable way by splitting the signals into their constituent segments and capturing the differences among the segments through clustering. The proposed feature learning technique has been applied to the single-lead ECG signal available in the 2015 PhysioNet challenge to distinguish the patterns associated with several arrhythmias from other potential distortions in ECG signals due to noise, interference, motion artifacts, and other sources of disturbance. For this purpose, we applied the proposed method to the entire 5 minutes of available ECG recordings so that the model can learn both normal and abnormal patterns in the signal collected from each patient. As seen in the experimental results, the proposed method achieves better performance than common false alarm detection methods based on the DWT and on a time-domain analysis (using the last 10 seconds of the signal, which include the event), while using only a small number of high-level features and a low level of computation.
[1] S. T. Lawless, “Crying wolf: false alarms in a pediatric intensive care unit,” Critical Care Medicine, vol. 22, no. 6, pp. 981–985, 1994.
[2] Y. Donchin and F. J. Seagull, “The hostile environment of the intensive care unit,” Current Opinion in Critical Care, vol. 8, no. 4, pp. 316–320, 2002.
[3] M. Imhoff and S. Kuhls, “Alarm algorithms in critical care monitoring,” Anesthesia & Analgesia, vol. 102, no. 5, pp. 1525–1537, 2006.
[4] Y. Zhang and P. Szolovits, “Patient-specific learning in real time for adaptive monitoring in critical care,” Journal of Biomedical Informatics, vol. 41, no. 3, pp. 452–460, 2008.
[5] Q. Li and G. D. Clifford, “Signal processing: False alarm reduction,” in Secondary Analysis of Electronic Health Records. Springer, 2016, pp. 391–403.
[6] F. Schmid, M. S. Goepfert, F. Franz, D. Laule, B. Reiter, A. E. Goetz, and D. A. Reuter, “Reduction of clinically irrelevant alarms in patient monitoring by adaptive time delays,” Journal of Clinical Monitoring and Computing, vol. 31, no. 1, pp. 213–219, 2017.
[7] R. He, H. Zhang, K. Wang, Y. Yuan, Q. Li, J. Pan, Z. Sheng, and N. Zhao, “Reducing false arrhythmia alarms in the ICU using novel signal quality indices assessment method,” in Computing in Cardiology Conference (CinC), 2015. IEEE, 2015, pp. 1189–1192.
[8] Z. M. Hira and D. F. Gillies, “A review of feature selection and feature extraction methods applied on microarray data,” Advances in Bioinformatics, vol. 2015, 2015.
[9] F. Afghah, A. Razi, and K. Najarian, “A Shapley value solution to game theoretic-based feature reduction in false alarm detection,” Neural Information Processing Systems (NIPS), Workshop on Machine Learning in Healthcare, arXiv:1512.01680 [cs.CV], Dec. 2015.
[10] F. Afghah, A. Razi, R. Soroushmehr, H. Ghanbari, and K. Najarian, “Game theory for systematic selection of wavelet-based features; application in false alarm detection in intensive care units,” Entropy, Special Issue on Information Theory in Game Theory, vol. 20, no. 3, p. 190, 2018.
[11] D. Oglic and T. Gärtner, “Greedy feature construction,” in Advances in Neural Information Processing Systems, 2016, pp. 3945–3953.
[12] A. Coates, A. Ng, and H. Lee, “An analysis of single-layer networks in unsupervised feature learning,” in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, pp. 215–223.
[13] A. Coates and A. Y. Ng, “Learning feature representations with k-means,” in Neural Networks: Tricks of the Trade. Springer, 2012, pp. 561–580.
[14] R. Goroshin, J. Bruna, J. Tompson, D. Eigen, and Y. LeCun, “Unsupervised feature learning from temporal data,” arXiv preprint arXiv:1504.02518, 2015.
[15] M. Längkvist, L. Karlsson, and A. Loutfi, “A review of unsupervised feature learning and deep learning for time-series modeling,” Pattern Recognition Letters, vol. 42, pp. 11–24, 2014.
[16] Q. Zhang, J. Wu, H. Yang, Y. Tian, and C. Zhang, “Unsupervised feature learning from time series,” in IJCAI, 2016, pp. 2322–2328.
[17] A. Eduardo, H. Aidos, and A. Fred, “ECG-based biometrics using a deep autoencoder for feature learning,” 2017.
[18] C. Zhang, G. Wang, J. Zhao, P. Gao, J. Lin, and H. Yang, “Patient-specific ECG classification based on recurrent neural networks and clustering technique,” in 2017 13th IASTED International Conference on Biomedical Engineering (BioMed). IEEE, 2017, pp. 63–67.
[19] S. S. Xu, M.-W. Mak, and C.-C. Cheung, “Towards end-to-end ECG classification with raw signal extraction and deep neural networks,” IEEE Journal of Biomedical and Health Informatics, 2018.
[20] PhysioNet, Reducing False Arrhythmia Alarms in the ICU, 2015, accessed July 28, 2016. [Online]. Available: http://www.physionet.org/challenge/2015/
[21] G. D. Clifford, I. Silva, B. Moody, Q. Li, D. Kella, A. Shahin, T. Kooistra, D. Perry, and R. G. Mark, “The PhysioNet/Computing in Cardiology Challenge 2015: reducing false arrhythmia alarms in the ICU,” in Computing in Cardiology Conference (CinC), 2015. IEEE, 2015, pp. 273–276.
[22] J. Pan and W. J. Tompkins, “A real-time QRS detection algorithm,” IEEE Transactions on Biomedical Engineering, no. 3, pp. 230–236, 1985.
[23] MathWorks, “MATLAB’s Classification Learner APP,” 2018. [Online]. Available: https://www.mathworks.com/help/stats/classificationlearner-app.html
[24] R. Salas-Boni, Y. Bai, P. Harris, B. Drew, and X. Hu, “False ventricular tachycardia alarm suppression in the ICU based on the discrete wavelet transform in the ECG signal,” Journal of Electrocardiology, vol. 47, 08 2014.