Nowadays the surface electrocardiogram (ECG) is recognized as an invaluable tool for monitoring heart condition, since its analysis provides decisive information that can reveal critical deviations from normal cardiac behavior. Recent developments in mobile sensors and mobile computing have enabled new scenarios for continuous ECG monitoring as an inexpensive tool for the early detection of some cardiac events, especially in those cases where symptoms appear intermittently.
As the monitoring period increases, the interpretation task becomes more time consuming and decision-support tools are needed to help cardiologists to reduce the time spent on it. If a continuous follow-up is required, these tools become imperative. Their main aim is to provide the cardiologists with a summary of all the acquired signals, enhanced with a fast locating of those anomalies detected.
Cardiac arrhythmias are the most relevant among the ECG findings. There are two main sources of arrhythmias: an automatism disorder, that is, a set of alterations in the beat activation point due to changes in its location or activation frequency; or a conduction disorder, that is, an abnormal propagation of the beat wavefront through the cardiac tissue. They both have an effect on the ECG, affecting the beat morphology and/or beat rhythm. In order to support their identification, a method for separating the beats by their activation point and conduction pattern should be provided.
Beat classification arises as the task of assigning each beat in an ECG a label identifying its physiological nature. Machine learning techniques have been applied to this task by estimating the underlying mechanisms that produce the data of a training set. The main drawback of this approach is its strong dependence on the pattern diversity present in the training set. Thus, inter-patient differences show that it cannot be assumed that a classifier trained on data of a large set of patients will yield valid results on a new patient[2, 3, 4], and intra-patient differences show that this cannot be assumed even for the same patient throughout time. In addition, class labels only provide gross information about the origin of the beats in the cardiac tissue, loosing all the information about their conduction pathways. This approach does not distinguish the multiple morphological families present in a given class, as occurs in multifocal arrhythmias.
In contrast, beat clustering aims at dividing the ECG recording in a set of beat clusters, each one of them preserving some similarity properties. Previous proposals have focused on an offline approach, from a priori maximum number of clusters [5, 6, 7, 8] and they imply processing the ECG signal once the acquisition has been completed. This approach has given good noise robustness, but as a side effect a single morphology is usually replicated in several clusters and rare beat morphologies can be missed. It also omits the dynamic aspect of ECG and, in particular, ignores the temporal evolution of morphologies. Furthermore, the detection of critical events can be deferred too long to provide timely attention. For all these reasons, a dynamic online approach must be considered.
In this paper, we present a real-time method for adaptive beat clustering, with a potential application not only as a previous step for classification , but also as a summary about those beat morphologies present in a certain period, their temporal evolution and variability, or even to detect the presence of alternating morphologies. The proposed method emulates the experts behavior in exploiting the temporal context for assigning each new beat to the most appropriate cluster. To this end, clusters are continuously adapting to the temporal evolution of beat morphologies, and they can be dynamically created, merged or modified, resulting in a variable number of clusters.
Beat clustering requires extracting from the ECG a set of representative measurements for every beat. Bibliography shows a variety of proposals for beat representation that can be grouped into four categories: morphological features where the signal amplitude is directly used [9, 4, 2, 7]; segmentation features like area, amplitude or interval duration from beat waves [4, 7, 2, 10] or amplitude and angle values from vectorcardiogram ; statistical features derived using high order cumulants[11, 12, 4] and finally, transformed space features, defined in an alternative space using different transforms like Karhunen–Loewe transform , Hermite basis functions [5, 12, 4]
, discrete Fourier transform[14, 9] or wavelet transform [14, 3]. All the previous works complete their feature sets with rhythm information to complement their description capabilities.
The present paper proposes a new approach to represent a beat by reducing the QRS complex to a set of relevant points and support regions. This representation has some nice properties for beat clustering: it is stable against the usual variability and the presence of noise in the ECG, and it explicitly represents the temporal location of some QRS features.
The proposed method processes a real-time multilead ECG signal through a set of data-driven stages as shown in Fig. 1. In order to obtain comparable results, signals from two standard ECG signal databases are used as data source (section II). The pre-processing stage comprises real-time beat detection and baseline filtering (section III). Then, a fixed-length signal segment is selected for extracting and characterizing the QRS complex (subsection IV.A). QRS complexes are compared to the current set of clusters following a context-based criteria to obtain the best matching cluster (subsections IV.B–D). Afterwards, the current cluster set is updated in one of three ways: creating a new cluster, modifying the most similar one or merging two or more clusters (subsection IV.E). The next stage performs a noise analysis for each lead in order to detect noisy intervals and avoid the processing of noisy beats or discard the clusters created from them (section V). Finally, the beats are classified by their rhythm type and a set of groups with common morphology and rhythm is obtained (section VI).
Ii ECG Signal Databases
The ECG databases recommended by the ANSI/AAMI EC57  standard for reporting the performance of arrhythmia detectors were used for validation purposes: the MIT-BIH Arrhythmia Database and the AHA ECG database.
The MIT-BIH Arrhythmia Database  can be referred to as the golden standard for beat clustering and classification tasks and it is the reference database for almost all the literature in this field. This database is composed of 48 recordings of ambulatory ECG, obtained from 47 different patients which comprise a very complete set of examples of common and rare arrhythmias. Each record has a duration of 30 min, and includes two channels with the same leads in almost all of them: a modified-lead II (MLII) in the first one and lead V1 in the second one. MLII was replaced by lead V5 in three records and V1 was replaced by MLII, V2, V4 or V5 on eight records. The signals were digitized at Hz and bandpass filtered with cutoff frequencies at 0.1 and 100Hz. All beats present in the database were annotated by at least two expert cardiologists, and assigned a class label using a 16 label set.
The AHA ECG database was compiled by the American Heart Association and it is composed of 155 recordings of ambulatory ECG digitized at Hz containing the most relevant types of ventricular arrhythmias. Each record is three hours long with two channels but only the last 30 minutes have been manually annotated by experts.
The major drawback of processing long-term ECG signals is the presence of a high level of noise with multiple manifestations —baseline wandering, power line interference or electromyographic activity— so an initial filtering stage needs to be performed. The efforts have been focused on filtering the baseline wandering, as this is the most relevant source affecting the reliability of the clustering algorithm because of the distortion it can cause on the QRS morphology.
Iii-a Baseline filtering
The baseline filtering is performed through the estimation of the baseline wandering and its posterior elimination from the original ECG. To achieve this goal, each signal is processed by two sequentially connected median filters of 200ms and 600ms, respectively, as described in . A global delay of 400ms is added by this process, independently of the sampling rate.
Iii-B Beat detection
In order to carry out an evaluation of the clustering method separately from beat detector and provide a fair comparison framework for future algorithms, we used the beat position provided by the ECG databases as fiducial points.
A beat detector is required for real scenarios, affecting the quality of results. However, it virtually does not increment the global computational complexity as QRS detection represents only a small fraction of it. The global delay would not be affected either, since the beat detection can be performed concurrently with baseline filtering with a shorter delay.
In this paper the following notation is used. Bold face variables (e.g.,
) to represent vectors and sequences, lower case alphabets with subscripts to represent their components (e.g.,) or superscripts when a temporal index is used as a subscript (e.g., ), upper case alphabets (e.g., ) to represent sets, and calligraphic letters to represent functions (e.g., ).
The aim of the clustering method is to group in real-time those beats which share the same activation area and propagation pattern in the cardiac tissue. The propagation of the electrical impulse through the heart is reflected in the ECG as a sequence of waves corresponding to the activity in the atria and ventricles. P wave and QRS complex are of particular interest since they show the atria and ventricular depolarization, respectively, thereby providing a fingerprint of the conduction pathway. Since the noise level of ambulatory signals makes the detection of the P wave a very difficult task, we focus on the QRS morphology to characterize the beat. As a consequence, beats with different atrial activation points or even with nodal activation points can share the same QRS morphology, and will be assigned to the same cluster. In the absence of a P wave analysis, rhythm information can be useful in some cases to detect this situation, as explained in section VI.
The proposed method is intended to provide a real-time resume of the ECG signal as a dynamic set of clusters. Whenever a new beat is detected, the set of clusters is updated. Let denote a time series where is the set of clusters that represents those QRS morphologies which appeared from the beginning of the recording up to the beat , is the set of beats assigned to cluster and the number of clusters, both at beat .
The clustering follows a data-driven flow triggered by the beat detector as shown in Fig. 1. Each new beat is compared to the set of clusters using concordance and dissimilarity measures in order to find the best matching and giving preference to those in its temporal context. A detailed explanation of this process follows in the remaining subsections: first, a beat characterization and a template-based representation of the clusters; next, the alignment technique and the measures used to compare beats and clusters; afterwards, the criteria for cluster selection and, finally, the process of updating the set of clusters.
Iv-a QRS characterization
We adopt a strategy inspired by dominant point detection  to characterize the QRS morphology through its constituent waves. Let denote a time series which represents the multilead ECG signal, where is the number of leads and is the vector of samples at time for all leads. When the th beat is detected with fiducial mark at time , a fixed-length subsequence of samples ( before and after ) is selected from each lead to represent its QRS complex (see Fig. 2):
where . We set and , which are wide enough to capture the longest QRS of abnormal beats  (V, F and f beat types, typically).
Most of this section describes operations over one lead, so for the sake of notation simplicity the superscript will be obviated unless multiple leads are involved.
We define the curvature at as:
where . The terms and denote the intervals used for calculating the curvature, and they are given by:
where . The term is the maximum physiologically meaningful width of a QRS wave between its peak location and its left or right end, so that it is the upper limit of and . The term is the minimum height for a signal deflection to be considered physiologically relevant and, therefore, to be excluded from the calculation of curvature.
We define the dominance region of a sample as:
where for any fixed and for any fixed .
We define the set of dominant points of as:
We define the set of relevant points of as:
where is the minimum height for a signal deflection to be considered a relevant QRS wave, with . If , then it is redefined as: .
The limits of can be located at any point in the edge of a wave. Since we are interested in capturing its full extent, the support region of a relevant point is defined as:
where and are the sample numbers nearest to and where slope sign changes:
In consequence, adjacent support regions can now overlap.
A relevant point is said to be in a concave wave if and . Otherwise, it is said to be in a convex wave. The wave height is defined as .
Finally, the th beat is represented by the QRS signal segment and the set of its relevant points and support regions and denoted by:
There is not a consensus in the literature about the limits for width and height of QRS complex or individual QRS waves. The AAMI standard  recommends a minimum amplitude of and duration of ms for a QRS wave to be detected, and for peak-to-peak QRS amplitude, with a minimum duration of . On the other hand, the AHA  and CSE  report lower amplitude for QRS waves (down to and ms), based on measures over averaged beats with increased Signal-to-Noise ratio. These limits were not established for physiological reasons, but for signal noise level or instrumentation limitations. Nothing is stated about maximum QRS width beyond a reference to case-based duration values (e.g. the CSE study  reports a maximum QRS width of ms). In our case, due to the Signal-to-Noise ratio present in the ambulatory signals, the value of is set to 150V in order to avoid the detection of small waves caused by noise and the value of is set to ms to accept QRS waves with a maximum width of ms. The value of is set to following the AAMI standard  and will be useful to detect noise contaminated QRS complexes.
In order to perform the comparison between a new beat and the current set of clusters , each cluster is represented by a template:
where is derived from the QRS of the beats assigned to as will be explained in section IV-E.
Iv-B QRS temporal alignment
In order to compare a beat with a cluster template , a temporal alignment of and is performed using Dynamic Time Warping (DTW) . The aim of this alignment process is twofold. First, to correct any temporal misalignment due to a misplaced fiducial mark. And second, to reduce the contribution of the height and width variability of the QRS waves to the dissimilarity measure.
DTW was previously used for this purpose in , providing a relation between and called warping path, with and . Each represents the alignment of the index of with the index of under three conditions: first, and ; second, and ; and third, . These conditions preserve the time-ordering of points and prevent some point being missed in the alignment.
Let denote the cost function associated to a warping path defined by , where is the local cost function associated with each element of . The optimal warping path is the one that minimizes .
DTW aligns the original signal samples, so any component from where introduces a horizontal segment into the aligned signals. Since this can lead to unacceptable distortion of the QRS, we adopt the Derivative Dynamic Time Warping (DDTW)  which uses the estimation of the derivative instead of the signal itself. The derivative is approximated by the first difference: and with and .
We imposed some additional conditions on the selected warping path so as to restrict the alterations of aligned signals. A global restriction is set in the search process by defining a warping window (named Sakoe-Chiba band ) to limit the temporal distance between aligned samples so . We set , which corresponds to a distance of 14ms with Hz, and is long enough to deal with small misalignments of beat marks. A local restriction is also used to limit the number of times the same sample can be aligned, setting a slope constraint: . We set to limit the variation in the amplitude of the aligned signals. Both conditions together allows the DDTW to cancel out wave differences up to three times in amplitude and up to samples in width.
Finally, after the optimal warping path is found, the aligned signals and are obtained with coordinates and for , where and . Fig. 3 shows the result of the alignment of the current QRS complex and a template.
Iv-C Template matching
In order to assign a beat to an existent cluster , we design a similarity calculation that only considers the difference between signals over the support region of each relevant point, thus limiting the comparison to the constituent waves of the QRS. Given a pair , we are interested in verifying whether contains a similar wave in the same location of the QRS. In order to do so, the interval of aligned with the interval of must be obtained (see Fig. 4). To this end, the warping path is used to map and from into obtaining the equivalent index for the relevant point, and for its support region where: , and , being . Afterwards, since and are already aligned by the application of DDTW, the same interval from is selected and mapped into using the warping path. The interval is given by and .
Once the aligned intervals are obtained we proceed to evaluate the concordance of both vectors. The segment is said to concord with at and denoted by if contains a deflection in with height likely to be considered a significant waveform, where being , and for in a convex wave (see Fig. 4a). Otherwise, and functions are interchanged.
We define the concordance ratio of with respect to at , denoted by , as:
if . Otherwise, .
We define the local dissimilarity of with respect to at , and denoted by , as a weighted relative area difference between and at the interval :
The terms and represent the areas under over the intervals and , respectively (see Fig. 4b). Computing the area at each side of independently allows us to minimize the effect of vertical misalignment or amplitude variation on each interval by subtracting their own median. Using the trapezoidal rule:
where and .
The terms and represent the areas below and intervals of , respectively (see Fig. 4c). They are estimated using the trapezoidal rule:
where and for in a convex wave. Otherwise, is replaced by .
We define the piecewise similarity of with respect to , denoted , as the sum of two bounded contributions from the set of concordant and non-concordant relevant points:
where the sigmoid functionkeeps the contribution of each point in the interval . The parameter determines the decrease rate of the contribution as the local dissimilarity increases and, as a consequence, the weight of a high dissimilarity value in the final piecewise dissimilarity. We limit the contributions of those points out of a range of admissible local dissimilarity. In order to set such range we consider the effect of amplitude and temporal variability of QRS waves. A temporal misalignment of one signal sample between the QRS waves of and can lead to local dissimilarities of around 20%. Then we set an interval of with a slightly greater upper bound and so the maximum contribution out of this interval is .
The previous measure is asymmetric since it depends on the relevant points and areas of one signal, so we define the similarity between and as:
thus obtaining a value which captures the concordance, similarity and morphological complexity of both signal segments.
This measure allows us to select the most similar template within a set, but we need a reference scale to evaluate the degree of matching. To this end, we define the normalized piecewise similarity as:
and the normalized similarity as:
where is the set of relevant points of .
Iv-D Cluster selection
The occurrence of different QRS morphologies in a segment of ECG signal is usually limited by a reduced set of activation points and conduction pathways. Thereby we expect that the majority of QRS complexes in an ECG recording share their morphology with some of the QRS complexes present in a short previous temporal interval. For that reason, the search for the cluster that best matches a beat is first performed in the set of clusters present in its temporal context (see Fig. 5). We define the temporal context as the set of beats previous to , and . The context length is set to beats, the number of beats displayed in the typical 10s ECG strip used by cardiologists for a heart rate of 80 beats/min. This context is long enough to include every QRS morphology present in multifocal arrhythmias. Throughout this section, the superscript is used to denote the lead.
The similarity measure is used to identify the most similar template for each lead as and then the best matching cluster is obtained by majority vote as . If multiple clusters are selected, a second vote is performed to obtain a single one using the normalized similarity.
Beat is assigned to if the condition is fulfilled in all leads. Then . We set a value of which corresponds to the maximum contribution of a point with local dissimilarity outside the admissible interval.
When and are not similar enough, a new comparison is performed within the subset , obtaining the most similar cluster . If and are similar enough, the beat is assigned and . Otherwise, the beat is not assigned to any existing cluster and its most similar cluster is selected between and using the same voting criteria previously seen.
Iv-E Cluster set updating
In order to adaptively respond to the changing behavior of the ECG, clusters must be dynamically created, modified or merged whenever a new beat is detected:
Iv-E1 Cluster creation
If is not assigned to , a new cluster is created and its template is initialized for each lead using the beat representation: . Then the cluster set is updated as .
Iv-E2 Cluster modification
If is assigned to , then the cluster is updated to and the template is modified to for each lead. To this end, is calculated from and :
where . The term is the coefficient of the exponential update. Setting a value for implies a trade-off between plasticity and stability of the cluster template. We set so the last 16 beats assigned to the cluster provide 90% of the contributions to the current template. This allows the template to be adapted to the evolution of the QRS morphology. Afterwards, the set of relevant points and support regions is obtained from and assigned to the template: .
Iv-E3 Cluster merging
During the clustering process, different clusters can evolve to represent the same QRS morphology, so they should be merged. This situation is common when QRS complexes suffer from transient distortions in their morphology due to intrinsic variability, which makes them fall below the similarity threshold for their proper clusters. In this case, a new cluster is created which is subjected to an initial transient period during which the template for each lead can evolve getting closer to its most similar cluster.
In order to detect this situation, we define a relation which links each cluster with its most similar one among previous clusters. The relation is set for each new cluster as . As templates evolve with new assigned QRS complexes, the relation could have to be updated.
When is assigned to , the cluster is updated to and its most similar cluster may change. The relation is updated when multiple clusters in the same set than , be it either or , fulfill the assigment condition in all leads (see subsection IV-D). Since has been updated, getting more similar to , the best matching cluster for is selected within the remaining subset of clusters, where and the oldest of both clusters, or , is set as the new most similar cluster for the other. Then the clusters and are checked for possibly merging, since their templates have evolved to be similar enough to the last beat (see Fig. 5). The condition for merging those clusters will be analogous to the condition for beat assignment: , where , which corresponds to the maximum contribution of a point with local dissimilarity over 20%. The threshold value is increased with respect to since both signal segments are promediated templates with increased Signal-to-Noise ratio.
Afterwards, the special case of clusters within its transient period is considered. We establish the duration of this transitory state in terms of the number of assigned beats. Hence, if is assigned to and , the cluster is checked for merging with its closest one (see Fig. 5). We set as the minimum number of beats assigned to the cluster to confirm it represents an independent morphology and we consider enough for this purpose.
When two clusters and are merged, the cluster set, cluster template and relation are updated accordingly (we suppose that ):
is updated to .
is calculated by merging and :
where remains unmodified.
The template is modified to , where is the set of relevant points and support regions obtained of .
The relation is updated by removing the pair and replacing by in all the pairs where the former appears.
After the cluster gets merged, all the modified pairs of the relation are checked for merging. Afterwards, the cluster is also checked for merging with its closest one.
V Noise-cluster proliferation control
The dynamic creation of clusters entails the problem of identifying those QRS complexes contaminated by noise that could cause the proliferation of clusters. When noise appears in an ECG lead, the QRS can show different changes. In some cases, ECG segments with low Signal-to-Noise ratio present a high number of waves, and these can be detected by an abnormal number of dominant points. In other cases, domain knowledge is needed to discern if changes respond to a new QRS morphology or to a noisy version of a previous one. Some alterations are well known and described in literature, but others can be challenging even for an expert cardiologist, who compares every beat with those present in its temporal context in order to identify if it is related to a repetitive morphology change, a noisy interval or an isolated noisy beat. We follow a twofold, beat-based and context-based, approach to detect noisy QRS complexes. Noise can be present in one or more leads, and the influence of each lead in the clustering of a given beat will be analyzed in the following.
A state variable is defined to denote the existence of a noisy interval in lead containing the th beat. Such noisy interval begins when the first noisy beat is detected in that lead just after a sequence of previous noise-free beats. The beats contained within this interval can be either noisy or noise-free in and such condition will be represented as an attribute of the beat denoted by . During the noisy interval, the characterization of any new beat can represent not only QRS waves but also noise artifacts, so the cluster selection and assignment rules (subsection IV-D) are modified in lead : the dominant points of the beat are ignored, replacing by and by . Therefore, the condition for beat assignment is: