
Simultaneous 12-Lead Electrocardiogram Synthesis using a Single-Lead ECG Signal: Application to Handheld ECG Devices

by   Kahkashan Afrin, et al.

Recent introduction of wearable single-lead ECG devices of diverse configurations has caught the intrigue of the medical community. While these devices provide a highly affordable support tool for the caregivers for continuous monitoring and to detect acute conditions, such as arrhythmia, their utility for cardiac diagnostics remains limited. This is because clinical diagnosis of many cardiac pathologies is rooted in gleaning patterns from synchronous 12-lead ECG. If synchronous 12-lead signals of clinical quality can be synthesized from these single-lead devices, it can transform cardiac care by substantially reducing the costs and enhancing access to cardiac diagnostics. However, prior attempts to synthesize synchronous 12-lead ECG have not been successful. Vectorcardiography (VCG) analysis suggests that the cardiac axis synthesized from earlier attempts deviates significantly from that estimated from 12-lead and/or Frank lead measurements. This work is perhaps the first successful attempt to synthesize clinically equivalent synchronous 12-lead ECG from single-lead ECG. Our method employs a random forest machine learning model that uses a subject's historical 12-lead recordings to estimate the morphology, including the actual timing of various ECG events (relative to the measured single-lead ECG), for all 11 missing leads of the subject. Our method was validated on two benchmark datasets as well as paper ECG and AliveCor-Kardia data obtained from the Heart, Artery, and Vein Center of Fresno, California. Results suggest that this approach can synthesize synchronous ECG with accuracies ($R^2$) exceeding 90%. Synthesizing 12-lead ECG from a single-lead device can ultimately enable its wider application and improved point-of-care (POC) diagnostics.


1 Introduction

Cardiac disorders remain the leading cause of mortality, claiming more than 17.3 million lives annually [1]. An estimated 85.6 million people in the U.S. are living with cardiovascular diseases (CVD). Besides the elevated risk of mortality, CVD populations are confined to a much-degraded quality of life [2]. From a clinical standpoint, ECG tests remain the most common first step in the diagnosis of CVD. Synchronous 12-lead ECG is used to provide a non-invasive and fairly definitive diagnosis of cardiac disorders. The strategic electrode placement, the costly and bulky ECG equipment, and the need for trained personnel require patients to visit a primary care center. Since about 42.2 million of the CVD population in the U.S. are elderly people with several comorbidities [3, 4], frequent visits to primary care can be a challenging task. Untimely diagnosis usually results in significant myocardial damage, reduced survival rate, and even mortality [5]. Moreover, a fifth of heart attacks are silent, with the person becoming aware of the symptoms only weeks or months later. Hence, for patients at higher risk, timely and often continuous diagnosis is required.

Recent advances in remote health, POC, and wearable technologies have allowed for continuous, day-to-day remote monitoring. The electronic health data thus generated provide a novel opportunity for developing a data-driven, personalized decision support system for healthcare providers, further facilitating timely diagnostics to enhance the quality of life and reduce mortality risks among CVD populations. One such technology is single-lead handheld/wearable ECG devices [6, 7, 8]. These devices cost substantially less than current clinical ECG recorders. They have significantly simplified ECG recording, with much easier application, access, and feasibility of everyday use. Consequently, they have caught the imagination of the medical community for transforming clinical diagnostic practice. One such device, which we use in the current work, is the AliveCor-Kardia Mobile. It can gather ECG at a 300 Hz sampling rate, one channel at a time. These single-lead devices are beginning to be considered for detecting pathologies such as atrial fibrillation, with promising initial results [9].

However, a major technological barrier hampers the applicability of the current POC ECG technologies for clinical diagnostic applications: medical practitioners are trained to parse patterns from 12-lead ECG signals and to diagnose by correlating information from several leads. In fact, diagnosis of various forms of myocardial infarction and other acute cardiac conditions requires parsing patterns across two or more leads of synchronously acquired ECG [10] and/or through the use of vectorcardiogram (VCG) [11, 12, 13].

A significant amount of work has been done for reconstructing 12-lead ECG from fewer leads. These efforts include reconstructing the missing leads either from a different lead system such as VCG or EASI [14, 15] or from a reduced subset of standard ECG [16]. All of these approaches require at least two synchronously acquired leads to reconstruct the 12-lead ECG. Hence they are not applicable to the present context.

Attempts have been made to derive 12-lead ECG from single-lead portable devices [17]. These works are based on recording signals sequentially from the single-lead device, one lead at a time. Electrode placement configurations have been developed for recording every lead signal from these devices [18]. For example, Fig. 1 (a) shows a snippet of the simultaneous recording of three ECG signals, namely lead I, II, and III using a traditional ECG machine. In contrast, Fig. 1 (b) shows an example of an asynchronous ECG recording with a single-lead device. A preliminary effort has also been made to derive 12-lead ECG from single-lead by a weighted linear combination of asynchronous bipolar signals. A recent pilot study used sequential 12-lead ECG recording from AliveCor® heart monitor recorded by the trained research team for ST-elevation myocardial infarction (STEMI) diagnosis [19]. They demonstrated a potential for evaluation of acute ischemia using a single-lead device.

Figure 1: (a) Synchronous recording of ECG leads I, II, and III in a clinical setting (b) recording lead I by placing the handheld device between the left and right arms or fingers during a first time interval. The device setting is then switched so that the right arm and the left leg touch the electrodes to record lead II signals during a second interval, and subsequently switched to left arm-right leg during a third interval to record lead III signals. At most one lead signal is recorded at any given time.

However, the 12-lead signals derived from these earlier approaches are asynchronous. Their VCG analysis suggests that cardiac axis synthesized from these earlier attempts deviates considerably from that estimated from 12-lead and/or Frank lead measurements. Fig. 2 shows an example of reconstructed lead I ECG and 3D VCG derived using the inverse Dower transform [20] of ECG signals recorded asynchronously. As is evident from this figure, the VCG reconstructed from asynchronous signals deviates by more than 90°  from the measured signals. Consequently, every diagnostic method that employs VCG [13], as well as diagnostic algorithms that use timing across multiple leads would be ineffective and can lead to misdiagnosis of critical CVDs.

Figure 2: Comparison of ECG and VCG signals from measured (blue) signals versus those reconstructed from asynchronous (red) signals showing (a) significant differences in the timings of various ECG events of lead I, such as R peaks between the measured and the reconstructed signals. Note that the difference changes from beat to beat. (b) significant difference in the orientation of the QRS loop, and more than 90° deviation in the estimation of the cardiac axis.
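The axis-deviation comparison can be illustrated with a minimal Python sketch that computes the angle between two cardiac-axis estimates. It assumes, purely for illustration, that the axis is approximated by the VCG sample of largest magnitude within a beat; the function names and the toy loops are hypothetical and are not taken from this work.

```python
import math

def cardiac_axis(vcg):
    """Return the (x, y, z) VCG sample with the largest magnitude.

    Assumption: the cardiac axis is approximated by the maximal VCG vector.
    """
    return max(vcg, key=lambda v: v[0] ** 2 + v[1] ** 2 + v[2] ** 2)

def axis_deviation_deg(vcg_measured, vcg_reconstructed):
    """Angle in degrees between the axes of two VCG loops."""
    u = cardiac_axis(vcg_measured)
    v = cardiac_axis(vcg_reconstructed)
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("a VCG loop with only zero vectors has no axis")
    dot = sum(a * b for a, b in zip(u, v))
    # Clamp to [-1, 1] so rounding noise cannot break acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

# Toy loops (hypothetical values): the second loop's main vector is rotated.
measured = [(0.1, 0.0, 0.0), (1.0, 0.2, 0.0), (0.2, 0.1, 0.0)]
reconstructed = [(0.0, 0.1, 0.0), (-0.2, 1.0, 0.0), (0.0, 0.2, 0.1)]
deviation = axis_deviation_deg(measured, reconstructed)
```

Other axis definitions (for example, a mean QRS vector) would change `cardiac_axis` only; the angle computation stays the same.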

Hence, an automated and accurate synchronous reconstruction of the 12-lead ECG from a single-lead ECG device can be a game-changer in cardiac diagnostics and personalized healthcare, especially to substitute expensive instruments with more affordable wearable devices. To our knowledge, none of the earlier methods can synthesize synchronous 12-lead ECG from a single-lead portable or wearable ECG device.

This paper presents a fully automated approach to derive synchronous 12-lead ECG from single-lead devices with accuracies exceeding 90% (in terms of $R^2$). The key innovation here is in our method to synchronize various lead signals by predicting the inter-lead temporal lags, i.e., the difference in the timings of various ECG events across multiple leads, based on a non-parametric machine learning model. The present approach consists of two stages. The first stage consists of obtaining the characteristic morphological and temporal information of the missing lead. The second stage consists of capturing the dynamic evolution, in terms of inter-lead temporal lags and amplitude variations, in the lead being currently recorded and incorporating it in the missing lead synthesis. By doing so, the entire matrix of missing leads can eventually be completed to derive a 12-lead standard ECG. The remainder of this paper is organized as follows: the methodological details of synthesizing 12-lead standard ECG from a single-lead ECG signal are presented in Section 2. Section 3 presents details of the four datasets used in this paper to validate the efficacy of the proposed synthesis methodology. Subsequently, Section 4 presents the synthesis results for these datasets. Lastly, we conclude with discussions in Section 5.

2 Methodology

Synthesizing a missing ECG lead essentially involves estimating three key aspects of the signal: (a) morphology of the waveform, (b) temporal information or timing of the fiducial points, and (c) the dynamic evolution of morphological and temporal features in terms of inter-lead temporal lags and amplitude variations. Morphology refers to the shape characteristics of an ECG wave, such as T-wave pattern, its skewness, elevation, etc., and the amplitudes of the various waves of a signal in every beat. Accurate morphological information across different leads is necessary to diagnose acute conditions such as arrhythmia. Temporal information refers to the timestamps of events, such as the onset, offset, and peaks of various waves such as the P-wave, T-wave, etc. within every beat of a signal. Accurate temporal information is necessary to estimate the lengths of various intervals, such as QT and RR, which are employed to diagnose acute conditions such as STEMI and to track the progression of a CVD. The morphology information for any missing lead at time $t$ is obtained from its corresponding historic lead recordings.

More precisely, when a subject is prescribed a handheld ECG system, the healthcare professional can upload their simultaneous 12-lead ECG recording (recorded most easily with any traditional ECG machine) to a cloud-based server. We refer to this simultaneous 12-lead ECG as “historical data”. Further, we know that at any time $t$, the handheld device is used to record a single lead only (referred to as the “current lead” from now on). Thus, the temporal information is obtained from the signal being currently recorded at time $t$. The rationale for doing so lies in the quasi-periodic nature of the heart rhythms, as a consequence of which the overall morphology information, i.e., fiducial shape, holds significant similarity within a lead (intra-lead morphological correlation). However, even within a lead, there are dynamic variations in both the horizontal scale, i.e., timing, and the vertical scale, i.e., voltage. Nonetheless, these dynamic variations are highly correlated between synchronously recorded leads (inter-lead temporal correlation). Fig. 3 (a) presents a segment of clinical 12-lead ECG showing this intra-lead morphology and inter-lead temporal alignments.

Figure 3: (a) Intra-lead morphology and inter-lead temporal alignment in the ECG signal that forms the basis for the primary synthesis of missing lead (b) representation of the inter-lead temporal lags in the R-peak of lead I, II, and III of a clinical ECG.

While this morphological and temporal information is sufficient for an initial synthesis of the missing leads, the synthesized signals obtained thus are far from accurate due to the error in temporal alignment as a result of inter-lead temporal lags. The inter-lead temporal lags refer to the differences in the time stamp of an event, e.g., the timing of R peak, as recorded across various leads for the same beat. From a physiological standpoint, these lags emerge because each lead signal is a different projection of the spatio-temporal evolution of the electrocardiac activity [13]. Consequently, the timings of the peak values and other events would be different across the various leads. Fig. 3 (b) shows the inter-lead temporal lags in the R peak of lead I, II, and III of a clinical 12-lead ECG recording.
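As a small illustration of the definition above, the per-beat inter-lead lag of an event can be computed as the difference between its timestamps in two synchronously recorded leads. The sketch below is only a hedged example: the function name and the R-peak times are hypothetical.

```python
def inter_lead_lags(event_times_ref, event_times_other):
    """Per-beat lag (seconds) of an event in one lead relative to a reference lead.

    Both inputs list the time of the same event (e.g., the R peak) for the
    same beats, in beat order, taken from synchronously recorded leads.
    """
    if len(event_times_ref) != len(event_times_other):
        raise ValueError("both leads must cover the same beats")
    return [t_other - t_ref
            for t_ref, t_other in zip(event_times_ref, event_times_other)]

# Hypothetical R-peak times (s) of three beats in leads I and II.
r_peaks_lead_I = [0.400, 1.210, 2.030]
r_peaks_lead_II = [0.412, 1.218, 2.045]
lags = inter_lead_lags(r_peaks_lead_I, r_peaks_lead_II)
```

In this toy example the lag changes from beat to beat, which is the behavior that motivates a learned, per-beat lag prediction rather than a constant offset.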

Since the event-timing information of the current lead is used for missing beats synthesis, unresolved inter-lead temporal lag results in bias (as illustrated in Fig. 2) in the timing of reconstructed signals. However, we are unaware of any work in the missing lead synthesis/reconstruction literature that has brought out this aspect.

In this method, once a beat using the single current lead is generated during the stage I, the timings of this synthesized beat are adjusted with the lag values for various events, learned using a machine learning procedure. Using just a single-lead for synchronous reconstruction of missing leads and the lag correction forms the key innovations of the present method. In the next two subsections, we describe the steps for synthesizing the missing leads using the morphology and temporal information from the historic and current leads, respectively, prediction of inter-lead lag, and obtaining the final synthesis after lag correction.

2.1 Obtaining the morphology and temporal information of the missing leads

For the proposed method, extracting the morphological and temporal information happens on a beat to beat basis (RR interval). Hence, the first step after signal pre-processing (noise and baseline wandering removal) is to obtain the fiducial point locations. We use a combination of the Pan-Tompkins and QRS detection algorithm to correctly obtain the fiducial point locations in the current and historical lead signals [21, 22, 23]. Along with robust determination of the fiducial point, the algorithm adapts to the different morphologies of the ECG signal which is crucial for accurate reconstruction in signals with T-wave alternans (TWA) and other abnormalities such as biphasic and inverted T-waves (accuracy shown in the result section).
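For readers unfamiliar with R-peak detection, the following is a heavily simplified, Pan-Tompkins-inspired sketch in Python (derivative, squaring, moving-window integration, fixed threshold). It is not the combination of algorithms used in this work [21, 22, 23]: the band-pass filter and adaptive thresholds are omitted, and the function name, parameter defaults, and synthetic signal are assumptions made for illustration.

```python
def detect_r_peaks(signal, fs, window_s=0.15, refractory_s=0.25, threshold_ratio=0.5):
    """Return sample indices of R peaks in a pre-processed single-lead signal."""
    if len(signal) < 2:
        return []
    # 1) Derivative emphasises the steep slopes of the QRS complex.
    diff = [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]
    # 2) Squaring makes all values positive and amplifies large slopes.
    squared = [d * d for d in diff]
    # 3) Moving-window integration (running mean over `window` samples).
    window = max(1, int(window_s * fs))
    integrated, running = [], 0.0
    for i, value in enumerate(squared):
        running += value
        if i >= window:
            running -= squared[i - window]
        integrated.append(running / window)
    # 4) Fixed threshold (assumption; the original algorithm adapts it).
    threshold = threshold_ratio * max(integrated)
    refractory = int(refractory_s * fs)
    peaks, i = [], 0
    while i < len(integrated):
        if integrated[i] > threshold:
            j = i
            while j < len(integrated) and integrated[j] > threshold:
                j += 1
            # Locate the maximum of the raw signal around the detected region.
            start, end = max(0, i - window), min(len(signal), j + 1)
            peak = max(range(start, end), key=lambda k: signal[k])
            if not peaks or peak - peaks[-1] > refractory:
                peaks.append(peak)
            i = j
        else:
            i += 1
    return peaks

# Synthetic test signal (hypothetical): three triangular "R waves" at 100 Hz.
fs = 100
ecg = [0.0] * 300
for centre in (50, 150, 250):
    ecg[centre - 1], ecg[centre], ecg[centre + 1] = 0.5, 1.0, 0.5
r_peaks = detect_r_peaks(ecg, fs)
```

Successive entries of `r_peaks` delimit the RR intervals on which the beat-by-beat processing operates.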

Figure 4: Every historic and current lead is divided into segments, each spanning an RR interval between two successive beats. Morphological information gathered from the historic lead and temporal information gathered from the current lead are used to reconstruct a missing lead.

Let $S_i^{l}$ represent the $i$-th RR interval of lead $l$. Let the length of this beat be $T_i^{l}$, with the first R peak of the RR interval occurring at time $t_i^{l}$ and the second occurring at time $t_{i+1}^{l}$. Finally, let there be $n$ beats recorded before the handheld device is switched to a different position and recording for another lead, $m$, starts. For example, as shown in Fig. 4, after recording $n$ beats, the single-lead device is switched from one location to another. Specifically, this figure shows $l$ to be lead I and $m$ to be lead II. Now, when the $i$-th RR interval (ignoring the missed beats during lead switch) for lead $l$ is recorded, there is no corresponding recording for lead $m$. The synthesis of the $i$-th RR interval for lead $m$, $\hat{S}_i^{m}$, can be given as:

$$\hat{S}_i^{m} = \mathcal{A}\left(H_j^{m}\right) \tag{1}$$

where $\mathcal{A}(H_j^{m})$ represents an affine transformation of the historic RR interval $H_j^{m}$ of lead $m$ that has maximum similarity to the current RR interval of lead $l$, $S_i^{l}$. Here, the similarity between RR intervals is measured using dynamic time warping [24], which allows comparing time-dependent RR intervals with different lengths. The transformation in (1) consists of two essential steps: (i) the energy (voltage) of the synthesized beat is matched with the current beat's energy and (ii) due to the temporal correlation explained above, the time duration of the synthesized RR interval of lead $m$ is considered synchronized with the RR time duration of the current lead $l$, i.e., $\hat{T}_i^{m} = T_i^{l}$. Once each RR interval of the missing lead is synthesized, they are concatenated to obtain a complete initial reconstruction of the missing lead.
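The primary synthesis step can be sketched in Python as follows. This is a literal, hedged reading of the description above rather than the authors' implementation: the historic beat of the missing lead closest to the current beat under dynamic time warping is selected, stretched to the current beat's number of samples, and rescaled so that its energy equals that of the current beat. All function names and the toy beats are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]

def resample(beat, length):
    """Linearly interpolate `beat` onto `length` evenly spaced samples."""
    if length == 1 or len(beat) == 1:
        return [beat[0]] * length
    out = []
    for k in range(length):
        pos = k * (len(beat) - 1) / (length - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(beat) - 1)
        frac = pos - lo
        out.append(beat[lo] * (1 - frac) + beat[hi] * frac)
    return out

def energy(beat):
    """Sum of squared samples."""
    return sum(v * v for v in beat)

def synthesize_beat(current_beat, historic_missing_beats):
    """Return (index of chosen historic beat, synthesized beat of the missing lead)."""
    # Pick the historic RR interval with the smallest DTW distance.
    j = min(range(len(historic_missing_beats)),
            key=lambda k: dtw_distance(historic_missing_beats[k], current_beat))
    # Step (ii): same duration (number of samples) as the current RR interval.
    template = resample(historic_missing_beats[j], len(current_beat))
    # Step (i): match the energy of the current beat (literal reading).
    e_template = energy(template)
    gain = math.sqrt(energy(current_beat) / e_template) if e_template > 0 else 0.0
    return j, [gain * v for v in template]

# Toy beats (hypothetical values).
historic_missing = [[0.0, -1.0, 0.0, 0.0], [0.0, 0.5, 2.0, 0.5, 0.0, 0.0]]
current = [0.0, 1.0, 4.0, 1.0, 0.0]
chosen, synthesized = synthesize_beat(current, historic_missing)
```

The time stretch and the amplitude gain together form the affine transformation; the remaining timing error caused by inter-lead lags is not handled at this stage.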

However, due to the inter-lead event lags, assuming that the synthesized RR interval has exactly the same duration as the current lead's RR interval results in significant reconstruction error between the synthesized and the original measured signals (as shown in Fig. 5). Fig. 5 (a) shows this error in a missing beat reconstructed using the proposed method without correcting the temporal lag. In order to compare how other methods perform in the presence of the temporal lag, we used the methodology proposed in a recent work on ECG reconstruction using single-lead devices [17]. The missing beat was reconstructed using a weighted linear combination of asynchronous bipolar leads (we used equal weights), and its reconstruction accuracy is shown in Fig. 5 (b). Given that weighted linear combination is a popular method in the ECG reconstruction literature [25], we also linearly combined synchronous bipolar leads using equal weights to compare the reconstruction accuracy. The reconstructed beat, compared with the original beat, is shown in Fig. 5 (c).

Since this inter-lead event lag is an inherent characteristic of ECG leads, it results in a temporal bias irrespective of the reconstruction method used. Hence, after obtaining the primary reconstruction, we employ the lag correction method (detailed in the next subsection) to obtain the final reconstructed signal.

Figure 5: Representation of the temporal bias/misalignment as a result of the inter-lead temporal lag between the actual (blue) and synthesized (red) R beat (a) using the proposed method without lag correction, (b) a recently proposed method of weighted linear combination of asynchronous bipolar leads obtained from a handheld ECG device [17], and (c) a weighted linear combination of synchronous bipolar leads obtained using traditional ECG machines (similar method [26]).

2.2 Inter-lead lag prediction and correction

As detailed in the foregoing, there is an inherent temporal lag even between the synchronously recorded leads of ECG. This lag essentially depends on the person-to-person, lead-to-lead, as well as beat-to-beat changes in the spatiotemporal distribution of the cardiac electrical activity. Further, due to heart rate variability, the RR interval is time-varying, which makes modeling this lag a non-trivial task. In this work, we used a random forest (RF) regression model to predict this dynamic and non-linearly varying inter-lead event lag. Along with improved robustness and generalization, the performance of RF for various applications is comparable to most state-of-the-art methods [27, 28, 29]. As mentioned above, the lag varies from person to person and from lead to lead; hence, the main objective here is to use the historical data to build a personalized lag prediction model.

RF [30] is an ensemble of decision trees generated using $B$ bootstrapped samples from the training data, $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ is a $p$-dimensional feature vector extracted from the historical data corresponding to the current lead $l$ (assuming lead $m$ is missing). $y_i$ is the inter-lead lag between the RR interval lengths of the missing lead $m$ and the current lead $l$, i.e., $y_i = T_i^{m} - T_i^{l}$ with $T_i$ denoting the length of the $i$-th RR interval in the respective lead, and $N$ is the total number of training data samples. Since only the signal from the current lead is available during prediction, the training features are likewise extracted only from the historic recording of the current lead.

More precisely, the construction of each bootstrapped decision tree begins by randomly selecting $q$ out of $p$ predictors and recursively splitting the predictor space so as to minimize the prediction error of each region thus formed. The value of $q$ is usually taken as the square root of the total predictor size, i.e., $q = \sqrt{p}$ [31]. The recursive split continues until a termination criterion is reached (which in most cases is the minimum number of observations in the terminal node). Finally, the ensemble prediction is the average of the predictions of all the trees in the ensemble. This aggregated result of the ensemble is more robust than that of a single decision tree.
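The procedure just described can be illustrated with a deliberately tiny, pure-Python forest. This sketch is an assumption-laden toy and not the model used in this work: each tree is a depth-1 regression tree (a stump) grown on a bootstrap sample using a random subset of about $\sqrt{p}$ predictors, whereas a full random forest re-draws the subset at every split and grows deeper trees. All names and data are hypothetical.

```python
import random

def fit_stump(X, y, features):
    """Best single split over `features`, minimising squared error."""
    best = None
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[f] <= thr]
            right = [yi for row, yi in zip(X, y) if row[f] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - ml) ** 2 for v in left)
                   + sum((v - mr) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, ml, mr)
    if best is None:  # selected predictors are constant: predict the mean
        mean = sum(y) / len(y)
        return (None, 0.0, mean, mean)
    return best[1:]

def fit_forest(X, y, n_trees=50, seed=0):
    """Bagged stumps with random predictor subsets of size ~sqrt(p)."""
    rng = random.Random(seed)
    p = len(X[0])
    q = max(1, round(p ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]   # bootstrap sample
        feats = rng.sample(range(p), q)            # random predictor subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, x):
    """Average the predictions of all trees in the ensemble."""
    preds = [ml if f is None or x[f] <= thr else mr for f, thr, ml, mr in forest]
    return sum(preds) / len(preds)

# Toy data (hypothetical): the response depends only on the first predictor.
X_train = [[float(i), 0.0, 0.0, 0.0] for i in range(10)]
y_train = [2.0 * i for i in range(10)]
forest = fit_forest(X_train, y_train, n_trees=50, seed=1)
```

In practice a library implementation with deeper trees would be used; the sketch only makes the bootstrap, predictor-subset, and averaging steps explicit.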

During the training phase of the RF regression model for inter-lead lag prediction, the response vector $y$ is the inter-lead lag between the historical recordings of any missing lead $m$ and the current lead $l$. However, the input vector $x$ contains the features extracted just from the current lead's data. This is done because, during testing, only data from the current lead will be available. We utilized 60 seconds of data to obtain the predictors and response for training the model. The fiducial point locations, including their onset and offset times, are obtained in every RR interval. The input feature vector then consists of the temporal and amplitude features of all fiducial points (as shown in Table 1), extracted for each RR interval of the current lead's data. Among the 10 extracted features, R height, RR interval, T height, and ST duration were the most significant.

| Amplitude Features | Temporal Features |
| --- | --- |
| R height | RR interval (ms) |
| P height | QRS duration (ms) |
| T height | ST duration (ms) |
| S peak | PR duration (ms) |
| Q peak | T-wave duration (ms) |

Table 1: Predictors used in the RF regression model. For training the model, these predictors are extracted for every beat of the current lead’s historical data, and for testing they are extracted from every beat of the current lead.
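A hedged sketch of how the ten predictors of Table 1 could be assembled for one beat is given below. The exact interval definitions are not stated in the text, so the ones used here (QRS duration as Q peak to S peak, ST duration as S peak to T onset, PR duration as P onset to Q peak) are assumptions, as are the function name, dictionary keys, and example values.

```python
def beat_features(fiducial_times, amplitudes, next_r_time):
    """Build the 10 predictors of one beat.

    fiducial_times: event times in seconds (keys: P_on, Q_peak, R_peak,
                    S_peak, T_on, T_off).
    amplitudes:     signal values at the P, Q, R, S and T peaks.
    next_r_time:    time of the following R peak in seconds.
    """
    ms = 1000.0
    t = fiducial_times
    return {
        # Amplitude features
        "R_height": amplitudes["R"],
        "P_height": amplitudes["P"],
        "T_height": amplitudes["T"],
        "S_peak": amplitudes["S"],
        "Q_peak": amplitudes["Q"],
        # Temporal features (interval definitions are assumptions)
        "RR_interval_ms": (next_r_time - t["R_peak"]) * ms,
        "QRS_duration_ms": (t["S_peak"] - t["Q_peak"]) * ms,
        "ST_duration_ms": (t["T_on"] - t["S_peak"]) * ms,
        "PR_duration_ms": (t["Q_peak"] - t["P_on"]) * ms,
        "T_duration_ms": (t["T_off"] - t["T_on"]) * ms,
    }

# Hypothetical fiducial points of a single beat.
times = {"P_on": 0.10, "Q_peak": 0.24, "R_peak": 0.28,
         "S_peak": 0.32, "T_on": 0.44, "T_off": 0.62}
amps = {"P": 0.15, "Q": -0.10, "R": 1.20, "S": -0.25, "T": 0.30}
features = beat_features(times, amps, next_r_time=1.08)
```

Stacking one such feature vector per RR interval gives the input matrix for the lag prediction model.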

Once the predicted lag $\hat{y}_i$ is obtained, the RR interval length of the synthesized lead is corrected by adding the predicted lag to the RR interval length of the current lead, i.e., $\hat{T}_i^{m} = T_i^{l} + \hat{y}_i$. This constitutes the final key step. Fig. 6 shows the original ECG signal (ground truth) and the synthesized signal with and without the lag correction methodology. This figure shows the temporal misalignment between the actual signals and the signal synthesized without lag correction. In contrast, the synthesized signal obtained after the lag correction using RF regression has a significantly improved alignment with the actual signal. Once every RR interval is corrected for the lag error, the intervals are concatenated to give the final synthesized lead. In this work, we used only 60 seconds of data for training the RF regression model; a larger training dataset can further improve the lag prediction accuracy.
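The correction and concatenation steps can be sketched as follows. This is a minimal illustration under stated assumptions: beat lengths and predicted lags are expressed in samples, each synthesized beat is stretched by linear interpolation to its corrected length, and the function names and toy values are hypothetical.

```python
import math

def resample(beat, length):
    """Linearly interpolate `beat` onto `length` evenly spaced samples."""
    if length == 1 or len(beat) == 1:
        return [beat[0]] * length
    out = []
    for k in range(length):
        pos = k * (len(beat) - 1) / (length - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(beat) - 1)
        frac = pos - lo
        out.append(beat[lo] * (1 - frac) + beat[hi] * frac)
    return out

def apply_lag_correction(synth_beats, current_lengths, predicted_lags):
    """Stretch each synthesized beat to (current length + predicted lag)
    samples and concatenate the corrected beats into one signal."""
    signal = []
    for beat, n_current, lag in zip(synth_beats, current_lengths, predicted_lags):
        target = max(2, n_current + int(round(lag)))  # corrected RR length
        signal.extend(resample(beat, target))
    return signal

# Toy example (hypothetical values, all in samples).
beats = [[0.0, 1.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
corrected = apply_lag_correction(beats, current_lengths=[3, 4], predicted_lags=[1, -1])
```

A positive lag lengthens the synthesized RR interval relative to the current lead and a negative lag shortens it, which shifts all subsequent events in the concatenated signal.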

Figure 6: Performance of missing beat synthesis before (in red) and after (in green) the inter-lead temporal lag correction using random forest algorithm for a representative subject. The subfigure represents the synthesized signal for missing (a) lead III when V3 is the current lead (b) lead V6 when III is the current lead, and (c) lead V4 when V1 is the current lead.

The main steps of the lead synthesis methodology presented in this paper are summarized in Algorithm 1. Missing lead synthesis accuracy and the datasets used for validation are detailed in the following sections.

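Input & Preprocess: $H \leftarrow$ historic lead recordings, $l \leftarrow$ current lead (lead $m$ is missing)
Extract: fiducial points (Pons, Ppeaks, Poffs, Qpeaks, Rpeaks, Speaks, Tons, Tpeaks, Toffs) for leads $H$ and $l$
Extract: amplitude and temporal features for lead $l$ (see Table 1)
$n \leftarrow$ number of RR intervals in lead $l$
for $i = 1, \dots, n$ do
    Obtain: current RR interval $S_i^{l}$, its length $T_i^{l}$, and the most similar historic RR interval $H_j^{m}$ of the missing lead
    Assign morphology: $\hat{S}_i^{m} \leftarrow \mathcal{A}(H_j^{m})$
    Predict: lag $\hat{y}_i$ using the RF algorithm (see Section 2.2)
    Assign temporal: $\hat{T}_i^{m} \leftarrow T_i^{l} + \hat{y}_i$
end for
for all $i$ do
    Concatenate $\hat{S}_i^{m}$
end for
Algorithm 1 Missing lead synthesis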

3 Data and Experimental Protocol

We validated the methodology on two benchmark datasets from Physionet [32]. The first set was the PTB database [33]. The database consists of 80 recordings acquired from 54 healthy volunteers and 368 MI recordings from 148 patients. Each recording in the PTB database contains 15 synchronously recorded signals, namely, the conventional 12-lead ECGs and the three orthogonal Frank XYZ leads. Signals were sampled at 1 kHz with 16-bit resolution over a ±16.384 mV range. The second set of data from Physionet was the T-wave alternans (TWA) database [34]. This database consists of multichannel recordings of 100 subjects with varying degrees of TWA. The database includes patients with sudden cardiac death risk factors such as myocardial infarction, transient ischemia, etc. The ECG signals were sampled at 500 Hz with 16-bit resolution over a ±32 mV range. Although only 2-3 lead ECG recordings were present for some subjects, we used the data for subjects with all 12 synchronous recordings. Further, subjects with a higher level of TWA were chosen.

For the handheld device, we used the data from AliveCor-Kardia recorded at the Heart, Artery, and Vein Center of Fresno. This dataset consisted of 101 ECG recordings from the Kardia device from AliveCor®, taken from healthy subjects as well as subjects with acute CVDs. A clinical staff member gathered multiple ECG signals following the 12-lead ECG extraction protocol specified for the Kardia device and saved them as PDF files (note that the signals are asynchronous). Signals were recorded at a sampling frequency of 300 Hz and 16-bit A/D resolution. Thereafter, paper ECG signals were collected from each of the subjects using a Burdick-Mortara machine. The paper ECGs were scanned and subsequently digitized using im2graph software (v.1.20). AliveCor® PDF files were directly used for digitization using the im2graph software. The signals were preprocessed to remove scanning-induced distortions and make the data files compatible with the Physionet database. As a final step, the fiducial points were extracted as described in Section 2.1. The datasets from the first two sources were employed for direct validation and benchmarking of the performance of the proposed approach. However, due to the unavailability of ground truth, the Kardia and paper ECG datasets were essentially employed for visual comparison of the morphological and temporal patterns of the reconstructed ECG.

4 Results

In this section, we present the results for synthesizing the missing lead signals for the ECG datasets described in Section 3. For Physionet datasets with all 12-lead signals available, we assumed that the handheld ECG device can be used to record any one of the 12 leads at a given time, and for that time period the remaining 11 signals will be missing. For example, if lead I was used as the current lead for a given time duration, we can synthesize the rest of the leads using the proposed methodology. Likewise, when the device is later switched from the lead I location to the lead II location, all the leads missing at that time can be synthesized, and so on. Hence, one can consider this analogous to filling missing elements of a matrix, as shown in Fig. 7. In this figure, the column labeled “I” shows the correlation coefficient accuracy values for synthesizing all other leads when lead I is the current lead. On the other hand, the row labeled “I” in this figure represents the accuracies of reconstructing lead I using each of the other leads as the current lead. Naturally, the diagonal elements have a value of 1. As is evident from this figure, the reconstruction accuracies are high for every missing lead synthesis.

Figure 7: Accuracy matrix for the synthesis of missing 11 leads given any one of the 12 leads was recorded using the handheld device. Correlation coefficient accuracies are shown for a representative healthy subject in the PTB database with an average accuracy of

Further, Table 2 presents the average synthesis accuracies (across the entire matrix) in terms of $R^2$ and correlation coefficient values and their respective standard deviations. These accuracies are presented for the two online datasets obtained from Physionet and are categorized for subjects with different health conditions. Due to the subtle physiological beat-to-beat variations in the TWA dataset, previous works were unable to reconstruct the signals [35]. However, our method provides a reasonably high accuracy even for the TWA dataset. These high accuracy values are a result of inter-lead lag correction. Fig. 8 summarizes the improvement in accuracy after the lag correction, with the blue boxplots representing the synthesis accuracies without the lag correction and the red boxplots representing the synthesis accuracies with lag correction.

| Dataset | Average $R^2$ (std) | Average correlation coefficient (std) |
| --- | --- | --- |
| PTB-Healthy | 0.91 (0.06) | 0.96 (0.02) |
| PTB-MI | 0.95 (0.03) | 0.97 (0.02) |
| TWA | 0.93 (0.02) | 0.96 (0.03) |

Table 2: Average $R^2$ and correlation coefficient and their standard deviation (std) for the prediction across all the leads of representative cases of Physionet PTB and TWA databases.
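For reference, the two accuracy measures can be computed as in the sketch below. The text does not spell out the formulas, so this assumes $R^2$ is the usual coefficient of determination and the correlation coefficient is Pearson's; the function names and sample values are hypothetical.

```python
import math

def r_squared(actual, synthesized):
    """Coefficient of determination of `synthesized` with respect to `actual`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - s) ** 2 for a, s in zip(actual, synthesized))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical measured and synthesized samples of one lead.
measured = [0.0, 1.0, 2.0, 3.0]
synth = [0.1, 0.9, 2.1, 2.9]
r2 = r_squared(measured, synth)
rho = pearson(measured, synth)
```

Evaluating both measures for every (current lead, missing lead) pair fills an accuracy matrix of the kind shown in Fig. 7.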

Although Table 2 shows high overall synthesis accuracy, we are interested in synthesizing missing signals using only the leads that can be recorded most conveniently with a handheld ECG device, such as lead I and the precordial leads. Since lead I is provided by most ECG devices, we selected lead I as the current lead. As is evident from Table 3, lead I can be used very successfully to synthesize the rest of the missing leads. Additionally, using the precordial leads (which can also be recorded with ease) as the current lead also resulted in a high accuracy of missing lead synthesis.

Figure 8: Boxplot demonstrating the accuracy improvement (in terms of $R^2$ and correlation coefficient) for reconstructing all missing leads using lead III of a subject in PTB-TWA database “with” and “without” the proposed inter-lead lag correction. The boxplot is bounded by the 25th and 75th percentiles, representing the inter-quartile range (IQR). The middle horizontal line represents the median accuracy value.

| Subject | Average $R^2$ (std) | Average correlation coefficient (std) |
| --- | --- | --- |
| Healthy Cases | 0.93 (0.06) | 0.97 (0.03) |
| Unhealthy Cases | 0.92 (0.06) | 0.96 (0.03) |

Table 3: Average accuracy for synthesizing missing 11 leads using lead I as the current lead for healthy and unhealthy cases in the PTB database

Fig. 9 presents the synthesized lead signals for the AliveCor-Kardia device. Here, the signals shown in blue are the currently recorded signals and the ones shown in red are synthesized using the proposed method. A visual comparison between the actual and synthesized signals for all three leads corroborates the efficacy of the proposed method. Further, we applied the proposed method to the digitized paper ECG database. Only lead I was utilized to reconstruct all other missing leads (Fig. 10). There was no visual difference between the actual and reconstructed signals.

Figure 9: Obtaining asynchronous single-lead recordings (blue) for lead I, II, and III using the AliveCor-Kardia and synthesizing the missing leads (red). Whenever the handheld device was switched from one location to another, the current lead was used to synthesize the other two missing leads.
Figure 10: Representation of missing lead synthesis using the digitized paper ECG database. For an unhealthy subject, all missing ECG leads were synthesized assuming lead I as the current lead.

Finally, Fig. 11 demonstrates the main contribution of this work: (i) the accurate synchronous reconstruction of all the missing leads (and thus any derived signal such as VCG) using a single-lead handheld device (Fig. 11 (a)) and (ii) obtaining accurate dynamic temporal information by the inter-lead lag prediction methodology (Fig. 11 (b)). The reconstructed ECG, as well as VCG, has excellent alignment with their respective measured signals. To our knowledge, this study is the first to do an accurate and synchronous reconstruction of missing leads using a single-lead ECG device.

Figure 11: Comparison of measured ECG signals (blue) versus those reconstructed using the proposed method “with” and “without” gap correction for a representative subject from the TWA database. (a) The accuracy of synthesizing lead III using lead I increased to after gap correction. (b) The VCG thus reconstructed is closely aligned with the original VCG obtained using the inverse Dower transform.

5 Discussion and Future Work

In this paper, we presented what is potentially the first method for synthesizing synchronous 12-lead ECG using single-lead handheld ECG devices. Because multiple signals recorded from a single-lead device are asynchronous, we showed that traditional methods for 12-lead reconstruction from a subset of available leads yield highly unreliable results. Furthermore, we demonstrated the bias in accuracy that arises from the inter-lead lag, which has not been previously addressed in the literature. Finally, we developed a personalized analytics approach to predict this lag and thereby improve the accuracy of synthesizing the missing signals. For a subject with TWA, the accuracy increased by around 50% (in ) and 20% (for ) after the gap correction. On average across all the datasets, and increased by 25% and 13%, respectively.
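To illustrate why an uncorrected inter-lead lag biases the accuracy, the following is a minimal sketch using a toy Gaussian "beat" rather than real ECG. It is not the personalized lag prediction described in this paper: the lag is found here by a brute-force search over shifts, and all names (`r_squared`, `shift`, `best_lag`, `toy_beat`) are hypothetical helpers introduced only for this example.

```python
import math

def r_squared(measured, synthesized):
    """Coefficient of determination of `synthesized` against `measured`."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - s) ** 2 for m, s in zip(measured, synthesized))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def shift(signal, lag):
    """Delay `signal` by `lag` samples (a negative lag advances it).
    Samples shifted in from outside the record repeat the edge value."""
    n = len(signal)
    return [signal[min(max(i - lag, 0), n - 1)] for i in range(n)]

def best_lag(measured, synthesized, max_lag):
    """Shift in [-max_lag, max_lag] that maximizes R^2 (brute-force search,
    an assumption of this sketch, not the paper's predictor)."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: r_squared(measured, shift(synthesized, lag)))

def toy_beat(n, center, width=4.0):
    """Gaussian bump standing in for a single QRS complex."""
    return [math.exp(-((i - center) / width) ** 2) for i in range(n)]

measured = toy_beat(200, center=100)
synthesized = toy_beat(200, center=106)  # same morphology, 6 samples late

r2_before = r_squared(measured, synthesized)
lag = best_lag(measured, synthesized, max_lag=20)
r2_after = r_squared(measured, shift(synthesized, lag))
print(lag, round(r2_before, 2), round(r2_after, 2))
```

Even though the two toy signals have identical morphology, the misaligned pair scores poorly, while the aligned pair scores close to 1. This is the sense in which an accuracy computed without lag correction understates the quality of the synthesized morphology.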

From a scientific standpoint, conventional wisdom as propounded in Dower’s seminal work suggests that signals synchronously acquired from an equivalent of three orthogonal leads are needed to reconstruct the 12-lead ECG. This viewpoint has been challenged in the literature, with some arguing that more leads are necessary and others suggesting that fewer than three suffice for accurate reconstruction. These efforts have raised a key question: how well can one reconstruct the 12 leads using the information gleaned from a single lead?

Recent advances in nonparametric machine learning models, such as random forests, allow us to extract almost all the information pertinent to 12-lead reconstruction from a single-lead signal without introducing undue biases in terms of model structure or preferences for particular ECG features. The present work is one of the initial efforts to extract this information for 12-lead reconstruction. The results strongly suggest that the information necessary for accurate reconstruction of the 12 leads can be gleaned from any one of several leads. They also demonstrate the potential to further enhance the accuracy of reconstruction by adapting the rapidly growing repertoire of advanced machine learning models.

Figure 12: Envisioned eventual goal of developing an integrated POC system using a single-lead handheld device. This system can be employed for acute CVD diagnosis and management using the synthesized 12-lead ECG.

The purpose of synthesizing the missing leads is to make the standard ECG available to cardiologists and primary care physicians, who are trained to parse 12-lead signals. Although parsing a single lead is much easier, a single lead does not give sufficient information for detecting conditions that lead to sudden cardiac death. POC handheld ECG devices have significantly improved POC diagnosis for acute cardiac patients, especially those with limited mobility and comorbidities.

We envision the synthesis of 12-lead ECG from a handheld ECG device as part of a larger POC system, as shown in Figure 12. When a handheld ECG system is prescribed to a subject, their simultaneous 12-lead ECG recording can be uploaded to a secure cloud-based server. Every time the subject uses the handheld device to record a single-lead ECG, that “current” recording can be used to synthesize the 11 missing lead signals at that time. This generates the standard 12-lead ECG and updates the historical data. The 12-lead ECG can then be used for automated diagnosis and can be integrated into a user app to provide alerts for adverse events. The series of historical data stored on the cloud server can also be accessed by the healthcare provider to monitor the long-term health of the subject and provide a more robust diagnosis.
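The record-keeping part of this envisioned workflow can be sketched as follows. This is only an illustration under stated assumptions: `SubjectRecord`, `process_recording`, and the `Synthesizer` signature are hypothetical names, and `copy_last_recording` is a trivial placeholder that stands in for the synthesis model, not the random forest method of this paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Lead = List[float]
Ecg12 = Dict[str, Lead]

LEAD_NAMES = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]

# Hypothetical interface: (history, current lead name, current samples)
# -> dictionary holding the 11 missing leads.
Synthesizer = Callable[[List[Ecg12], str, Lead], Ecg12]

@dataclass
class SubjectRecord:
    """Server-side store of one subject's 12-lead recordings, oldest first."""
    history: List[Ecg12] = field(default_factory=list)

def process_recording(record: SubjectRecord, current_lead: str,
                      samples: Lead, synthesize: Synthesizer) -> Ecg12:
    """Synthesize the missing leads for a new single-lead recording and
    append the resulting 12-lead ECG to the subject's history."""
    if not record.history:
        raise ValueError("a baseline 12-lead recording is required first")
    missing = synthesize(record.history, current_lead, samples)
    ecg = {current_lead: samples, **missing}
    if set(ecg) != set(LEAD_NAMES):
        raise ValueError("synthesizer did not return exactly the 11 missing leads")
    record.history.append(ecg)
    return ecg

def copy_last_recording(history: List[Ecg12], current_lead: str,
                        samples: Lead) -> Ecg12:
    """Placeholder synthesizer: reuses the latest stored leads unchanged."""
    return {name: list(sig) for name, sig in history[-1].items()
            if name != current_lead}

# Baseline 12-lead recording uploaded when the device is prescribed.
baseline = {name: [0.0, 1.0, 0.0] for name in LEAD_NAMES}
record = SubjectRecord(history=[baseline])

# A later single-lead (lead I) recording from the handheld device.
ecg = process_recording(record, "I", [0.0, 0.9, 0.1], copy_last_recording)
print(len(ecg), len(record.history))
```

Passing the synthesizer in as a function keeps the storage logic independent of the model, so a different synthesis model could be substituted without changing how the history is updated.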

The proposed methodology is simple enough to be implemented in a real-time system. Nonetheless, it relies on accurate detection of fiducial points in each beat. Although our fiducial marker detection scheme was highly accurate, it can introduce delays in a real-time diagnosis system. Hence, one possible direction for future work is to incorporate a faster beat detection algorithm and thereby speed up the overall method. Further, an improved lag prediction algorithm can significantly improve the sensitivity of the synthesis. Ultimately, the aim is to use these at-home synthesized 12-lead ECGs for near-clinical-equivalent ECG screening and for diagnosing a broader set of cardiac abnormalities, not limited to arrhythmia detection.
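As a point of reference for what a lightweight beat detector looks like, here is a minimal single-pass sketch based on an amplitude threshold and a refractory period. It is not the fiducial marker detection scheme used in this work, and it is far simpler than published QRS detectors; the function name, the default parameters, and the toy signal are all assumptions made for illustration.

```python
import math

def detect_r_peaks(signal, fs, threshold_ratio=0.6, refractory_s=0.2):
    """Return indices of local maxima that exceed threshold_ratio * max(signal)
    and are at least refractory_s seconds apart."""
    threshold = threshold_ratio * max(signal)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if signal[i] >= threshold and is_local_max:
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
            elif signal[i] > signal[peaks[-1]]:
                # Within the refractory window, keep only the taller peak.
                peaks[-1] = i
    return peaks

# Toy signal sampled at 250 Hz: a narrow "R wave" at each beat center,
# followed 60 samples later by a smaller, wider "T wave".
fs = 250
beat_centers = [100, 300, 500]
ecg = [
    sum(math.exp(-((i - c) / 3.0) ** 2)
        + 0.3 * math.exp(-((i - c - 60) / 10.0) ** 2)
        for c in beat_centers)
    for i in range(600)
]

r_peaks = detect_r_peaks(ecg, fs)
rr_intervals_s = [(b - a) / fs for a, b in zip(r_peaks, r_peaks[1:])]
print(r_peaks, rr_intervals_s)
```

A detector of this kind needs only one pass over the samples, but it would require filtering and adaptive thresholds before it could cope with noise or baseline wander in real recordings.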


We would like to thank the U.S. National Science Foundation for supporting this work under Grants NSF PFI: AIR-TT 1543226 and NSF CMMI 1301439. We also sincerely thank the Heart, Artery, and Vein Center of Fresno, California for providing the Kardia-AliveCor® and the paper ECG data.


  • [1] A. S. Go, D. Mozaffarian, V. L. Roger, E. J. Benjamin, J. D. Berry, M. J. Blaha, S. Dai, E. S. Ford, C. S. Fox, S. Franco et al., “Heart disease and stroke statistics-2014 update,” Circulation, vol. 129, no. 3, 2014.
  • [2] J. Juenger, D. Schellberg, S. Kraemer, A. Haunstetter, C. Zugck, W. Herzog, and M. Haass, “Health related quality of life in patients with congestive heart failure: comparison with other chronic diseases and relation to functional variables,” Heart, vol. 87, no. 3, pp. 235–241, 2002.
  • [3] A. S. Go, D. Mozaffarian, V. L. Roger, E. J. Benjamin, J. D. Berry, W. B. Borden, D. M. Bravata, S. Dai, E. S. Ford, C. S. Fox et al., “Heart disease and stroke statistics—2013 update a report from the american heart association,” Circulation, pp. CIR–0b013e31 828 124ad, 2012.
  • [4] National Center for Health Statistics (US), “Health, united states, 2016: with chartbook on long-term trends in health,” 2017.
  • [5] E. J. Benjamin, M. J. Blaha, S. E. Chiuve, M. Cushman, S. R. Das, R. Deo, S. D. de Ferranti, J. Floyd, M. Fornage, C. Gillespie et al., “Heart disease and stroke statistics—2017 update: a report from the american heart association,” Circulation, vol. 135, no. 10, pp. e146–e603, 2017.
  • [6] AliveCor, “Kardiamobile: Peace of mind in your pocket,” Available at:, 2018, accessed on Jan 24, 2018.
  • [7] HeartCheck™, “Heart rhythm and ecg monitoring simplified,” Available at:, 2018, accessed on Jan 24, 2018.
  • [8] Zio®, “Zio by irhythm: Definitive arrhythmia detection, superior clinical accuracy,” Available at:, 2018, accessed on Jan 24, 2018.
  • [9] J. Orchard, N. Lowres, S. B. Freedman, L. Ladak, W. Lee, N. Zwar, D. Peiris, Y. Kamaladasa, J. Li, and L. Neubeck, “Screening for atrial fibrillation during influenza vaccinations by primary care nurses using a smartphone electrocardiograph (iecg): A feasibility study,” European journal of preventive cardiology, vol. 23, no. 2_suppl, pp. 13–20, 2016.
  • [10] G. S. Wagner, P. Macfarlane, H. Wellens, M. Josephson, A. Gorgels, D. M. Mirvis, O. Pahlm, B. Surawicz, P. Kligfield, R. Childers et al., “Aha/accf/hrs recommendations for the standardization and interpretation of the electrocardiogram,” Circulation, vol. 119, no. 10, pp. e262–e270, 2009.
  • [11] H. Yang, S. T. Bukkapatnam, T. Le, and R. Komanduri, “Identification of myocardial infarction (mi) using spatio-temporal heart dynamics,” Medical Engineering & Physics, vol. 34, no. 4, pp. 485–497, 2012.
  • [12] H. Yang, S. T. Bukkapatnam, and R. Komanduri, “Spatiotemporal representation of cardiac vectorcardiogram (vcg) signals,” Biomedical engineering online, vol. 11, no. 1, p. 16, 2012.
  • [13] T. Q. Le, S. T. Bukkapatnam, B. A. Benjamin, B. A. Wilkins, and R. Komanduri, “Topology and random-walk network representation of cardiac dynamics for localization of myocardial infarction,” IEEE Transactions on Biomedical Engineering, vol. 60, no. 8, pp. 2325–2331, 2013.
  • [14] G. E. Dower, A. Yakush, S. B. Nazzal, R. V. Jutzy, and C. E. Ruiz, “Deriving the 12-lead electrocardiogram from four (easi) electrodes,” Journal of electrocardiology, vol. 21, pp. S182–S187, 1988.
  • [15] D. Dawson, H. Yang, M. Malshe, S. T. Bukkapatnam, B. Benjamin, and R. Komanduri, “Linear affine transformations between 3-lead (frank xyz leads) vectorcardiogram and 12-lead electrocardiogram signals,” Journal of electrocardiology, vol. 42, no. 6, pp. 622–630, 2009.
  • [16]

    G. R. Tsouri and M. H. Ostertag, “Patient-specific 12-lead ecg reconstruction from sparse electrodes using independent component analysis,”

    IEEE journal of biomedical and health informatics, vol. 18, no. 2, pp. 476–482, 2014.
  • [17] C. Y. Huang, C. Lin, L. Y. Lin, Y. J. Lin, C. H. Wang, H. C. Chang, C. C. Liu, C. H. Tseng, M. T. Lo, and M. H. M. Ma, “Synthesizing the 12-lead electrocardiograms by the single-lead ecg system-reconstruction of temporal asynchronous bipolar ecg recordings,” Circulation, vol. 134, no. Suppl_1, p. A20750, 2016.
  • [18] J. W. Grier, “How to use 1-lead ecg recorders to obtain 12-lead resting ecgs and exercise (stress) ecgs,” Available at:, 2008, accessed on Jan 24, 2018.
  • [19] J. B. Muhlestein, V. Le, D. Albert, F. L. Moreno, J. L. Anderson, F. Yanowitz, R. B. Vranian, G. W. Barsness, C. F. Bethea, H. W. Severance et al., “Smartphone ecg for evaluation of stemi: results of the st leuis pilot study,” Journal of electrocardiology, vol. 48, no. 2, pp. 249–259, 2015.
  • [20] G. E. Dower, H. B. Machado, and J. Osborne, “On deriving the electrocardiogram from vectoradiographic leads.” Clinical cardiology, vol. 3, no. 2, pp. 87–95, 1980.
  • [21] W. Engelse and C. Zeelenberg, “A single scan algorithm for qrs-detection and feature extraction,” Computers in cardiology, vol. 6, no. 1979, pp. 37–42, 1979.
  • [22] J. Pan and W. J. Tompkins, “A real-time qrs detection algorithm,” IEEE transactions on biomedical engineering, no. 3, pp. 230–236, 1985.
  • [23] P. Laguna, R. Jané, and P. Caminal, “Automatic detection of wave boundaries in multilead ecg signals: Validation with the cse database,” Computers and biomedical research, vol. 27, no. 1, pp. 45–60, 1994.
  • [24] M. Müller, “Dynamic time warping,” Information retrieval for music and motion, pp. 69–84, 2007.
  • [25] D. D. Finlay, C. D. Nugent, J. G. Kellett, M. P. Donnelly, P. J. McCullagh, and N. D. Black, “Synthesising the 12-lead electrocardiogram: Trends and challenges,” European journal of internal medicine, vol. 18, no. 8, pp. 566–570, 2007.
  • [26] S. P. Nelwan, J. A. Kors, and S. H. Meij, “Minimal lead sets for reconstruction of 12-lead electrocardiograms,” Journal of electrocardiology, vol. 33, pp. 163–166, 2000.
  • [27] C. Strobl, J. Malley, and G. Tutz, “An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests.” Psychological methods, vol. 14, no. 4, p. 323, 2009.
  • [28] J. Arumugam, S. T. Bukkapatnam, K. R. Narayanan, and A. R. Srinivasa, “Random forests are able to identify differences in clotting dynamics from kinetic models of thrombin generation,” PloS one, vol. 11, no. 5, p. e0153776, 2016.
  • [29] K. Afrin, G. Illangovan, S. S. Srivatsa, and S. T. Bukkapatnam, “Balanced random survival forests for extremely unbalanced, right censored data,” arXiv preprint arXiv:1803.09177, 2018.
  • [30] L. Breiman, “Random forests,” Machine learning, vol. 45, no. 1, pp. 5–32, 2001.
  • [31] G. James, D. Witten, T. Hastie, and R. Tibshirani, An introduction to statistical learning.   Springer, 2013, vol. 112.
  • [32] A. L. Goldberger, L. A. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, “Physiobank, physiotoolkit, and physionet,” Circulation, vol. 101, no. 23, pp. e215–e220, 2000.
  • [33] R. Bousseljot, D. Kreiseler, and A. Schnabel, “Nutzung der ekg-signaldatenbank cardiodat der ptb über das internet,” Biomedizinische Technik/Biomedical Engineering, vol. 40, no. s1, pp. 317–318, 1995.
  • [34] G. B. Moody, H. Koch, and U. Steinhoff, “The physionet/computers in cardiology challenge 2006: Qt interval measurement,” in Computers in Cardiology, 2006.   IEEE, 2006, pp. 313–316.
  • [35] P. Langley, S. King, K. Wang, D. Zheng, R. Giovannini, M. Bojarnejad, and A. Murray, “Estimation of missing data in multi-channel physiological time-series by average substitution with timing from a reference channel,” in Computing in Cardiology, 2010.   IEEE, 2010, pp. 309–312.