Electrocardiogram Generation and Feature Extraction Using a Variational Autoencoder

02/01/2020 ∙ by V. V. Kuznetsov, et al. ∙ State University of Nizhni Novgorod

We propose a method for generating an electrocardiogram (ECG) signal for one cardiac cycle using a variational autoencoder. Using this method, we extracted a vector of 25 new features, which in many cases can be interpreted. The generated ECG has a quite natural appearance. The low value of the Maximum Mean Discrepancy metric, 0.00383, also indicates good quality of ECG generation. The extracted features will help to improve the quality of automatic diagnostics of cardiovascular diseases. In addition, generating new synthetic ECGs will help to address the lack of labeled ECGs for supervised learning.



I Introduction

The experience gained by the machine learning community shows that the quality of a decision rule largely depends on which features of the samples are used. The better the feature description, the more accurately the problem can be solved. Typically, the features are also required to be interpretable, since interpretability indicates that the features adequately reflect the real-world problem.

The traditional way to build a good feature description is to use expert knowledge. Specialists in a particular subject area propose various methods for constructing feature descriptions, which are then tested on practical problems. Another approach to constructing a good feature description is automatic feature extraction (also called dimensionality reduction).

There are many methods for automatic feature extraction, such as principal component analysis, independent component analysis, principal graphs and manifolds, kernel methods, autoencoders, embeddings, etc. Among the most powerful and promising approaches, we mention principal graphs and manifolds and methods based on deep learning [11, 2].

Here we examine a method for automatic feature extraction, the so-called variational autoencoder (VAE) [10, 7], for the problem of automatic electrocardiogram (ECG) processing.

The electrocardiogram is a record of the electrical activity of the heart, obtained with electrodes placed on the human body. Electrocardiography is one of the most important methods in cardiology. A schematic representation of the main part of the ECG is shown in Figure 1. One cardiac cycle (the activity of the heart from the beginning of one heartbeat to the beginning of the next) contains the P, T and U waves and the QRS complex, which consists of the Q, R and S peaks. The size, shape and location of these parts give valuable diagnostic information about the work of the heart and about the presence or absence of certain diseases.

Fig. 1: Schematic representation of main parts of the ECG signal for one cardiac cycle: P, T, U waves and QRS complex, consisting of Q, R and S peaks.

Recently, machine learning (especially deep learning) methods have been widely used for automatic ECG analysis; see the recent review [5]. The application tasks include ECG segmentation, disease detection, sleep staging, biometric human identification, denoising, and others [5]. A variety of classical and new methods are used. Among them are discriminant analysis, decision trees, support vector machines, fully connected and convolutional neural networks, recurrent neural networks, generative adversarial networks, autoencoders, etc. [13, 5].

From our point of view, the most interesting and fruitful directions in applying deep learning methods to ECG analysis are the generation of synthetic ECGs and the automatic extraction of new interpretable features. Several works address the problem of ECG generation [14, 6, 1]. The authors of those papers used different variations of generative adversarial networks (GANs) [3]. The best results for ECG generation were obtained in [6], where the quality is reported in terms of the Maximum Mean Discrepancy (MMD) metric.

Our approach to ECG generation is based on a VAE. We propose neural network architectures for an encoder and a decoder for generating synthetic ECGs and extracting new features. The generated synthetic ECGs look quite natural. The MMD equals 0.00383, which is worse than the value obtained in [6], but the two values are not directly comparable, since they were obtained on different training sets and for similar but different problems.

The main advantage of our work is that we propose a method for extracting new features. Our experiments show that these features are quite interpretable. This gives us reason to hope that using these features will help to improve the quality of automatic diagnostics of cardiovascular diseases. In addition, generating new synthetic ECGs will help to address the lack of labeled ECGs for supervised learning.

Fig. 2: Encoder architecture
Fig. 3: Decoder architecture

II Algorithm

II-A Preprocessing

Our original ECG is a -second -lead signal with a frequency of Hz. Each signal is cut into nine-second signals. Using the segmentation algorithms described in [12], we determine the beginnings and endings of all P and T waves and all R peaks. Next, we step forward and backward from each R peak by an equal distance. Thus, we obtain a set of cardiac cycles, each of length .
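The windowing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the R-peak indices are already available from a segmentation step, and the function name `extract_cycles` and the parameter `half_width` are hypothetical. Peaks whose window would not fit inside the record are skipped, which is also an assumption.

```python
import numpy as np

def extract_cycles(signal, r_peaks, half_width):
    """Cut fixed-length windows centred on each R peak.

    signal     : 1-D array holding one ECG lead
    r_peaks    : sample indices of the R peaks (from a segmentation step)
    half_width : samples taken on each side of the peak (hypothetical value)
    """
    signal = np.asarray(signal, dtype=float)
    cycles = []
    for r in r_peaks:
        start, stop = r - half_width, r + half_width
        # Skip peaks whose window would run past either end of the record.
        if start < 0 or stop > signal.size:
            continue
        cycles.append(signal[start:stop])
    if not cycles:
        return np.empty((0, 2 * half_width))
    # One row per cardiac cycle, each of length 2 * half_width.
    return np.stack(cycles)
```

Every returned row has the same length, so the rows can be fed directly to a network with a fixed-size input.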

II-B Neural network architecture. Encoder

A variational autoencoder [10, 7] consists of an encoder and a decoder. We propose the following architecture for them. The encoder consists of a convolutional block and a fully connected block. The architecture of the encoder is presented in Figure 2. The input vector of length is fed to the input of the encoder. Next, the network branches into a fully connected chain and a convolutional chain.

The convolutional chain (at the top of the diagram in Figure 2) consists of series-connected blocks, each comprising a convolution layer, a batch normalization layer, a ReLU activation function and a MaxPooling layer. Next, we have another convolution layer. At the output of this block we get


The fully connected chain of the encoder (at the bottom of the diagram in Figure 2) consists of fully connected (dense) layers, interleaved with batch normalization and ReLU activation functions. At the output of the last fully connected layer we have neurons.

The outputs of the convolutional and fully connected chains are concatenated, which gives us a vector of length . Using two fully connected layers, we get two 25-dimensional vectors, which are interpreted as a vector of means and a vector of logarithms of variances for 25 normal distributions (or for one 25-dimensional normal distribution with a diagonal covariance matrix). The output of the encoder is a vector of length 25 in which each component is sampled from those normal distributions with the specified means and variances.
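The sampling step at the encoder output can be illustrated with the reparameterization commonly used in variational autoencoders [10, 7]. The sketch below assumes this standard form (mean plus standard deviation times unit Gaussian noise); the function name `sample_latent` is hypothetical, and the text does not state how the sampling is implemented.

```python
import numpy as np

def sample_latent(mean, log_var, rng):
    """Draw z ~ N(mean, diag(exp(log_var))) component by component."""
    mean = np.asarray(mean, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    std = np.exp(0.5 * log_var)            # log-variance -> standard deviation
    eps = rng.standard_normal(mean.shape)  # independent unit Gaussian noise
    return mean + std * eps

# Example with 25 features, matching the size of the feature vector.
rng = np.random.default_rng(0)
z = sample_latent(np.zeros(25), np.full(25, -2.0), rng)
```

Predicting the logarithm of the variance rather than the variance itself keeps the variance positive for any real-valued network output.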

We interpret this 25-dimensional vector as a vector of new features sufficient to describe one cardiac cycle and to restore it with a small error.

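As the loss function, the Kullback–Leibler divergence

$$D_{\mathrm{KL}}(P \,\|\, Q) = \int_X p \log \frac{p}{q} \, d\mu \qquad (1)$$

is used. As a result, the new features follow an approximately normal distribution. In (1), $\mu$ is any measure on $X$ with respect to which $P$ and $Q$ are absolutely continuous, so that the densities $p = dP/d\mu$ and $q = dQ/d\mu$ exist; $P$ is the initial distribution, and $Q$ is the new distribution we have obtained.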
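For a diagonal Gaussian such as the one produced by the encoder, the Kullback–Leibler term has a well-known closed form. The expression below assumes that the reference distribution is the standard normal (the text does not state this explicitly, although the decoder is later tested on standard normal inputs). The symbols $m_j$ and $s_j$, introduced here for illustration, denote the $j$-th mean and the $j$-th logarithm of the variance.

```latex
D_{\mathrm{KL}}\bigl(\mathcal{N}(m, \operatorname{diag}(e^{s})) \,\|\, \mathcal{N}(0, I)\bigr)
  = \frac{1}{2} \sum_{j=1}^{25} \bigl( e^{s_j} + m_j^{2} - 1 - s_j \bigr)
```

Each summand is non-negative and vanishes only when $m_j = 0$ and $s_j = 0$, so minimizing this term pulls every feature toward a standard normal distribution.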

II-C Neural network architecture. Decoder

The architecture of the decoder is presented in Figure 3. As input, the decoder accepts the 25-dimensional vector of features. Then, as in the encoder, the network branches into a convolutional chain and a fully connected chain.

The fully connected chain (at the bottom of the diagram in Figure 3) consists of blocks, each of which contains a fully connected (dense) layer, a batch normalization layer and a ReLU activation function.

The convolutional chain (at the top of the diagram in Figure 3) performs a deconvolution. It consists of blocks, each comprising a convolutional layer, a batch normalization layer and a ReLU activation function, followed by an upsampling layer.

As a result of the convolutional and the fully connected chains, we get neurons from each. Next, we concatenate the two results, obtaining neurons. Using a dense layer, we get neurons, which represent the restored ECG.

As a loss function for the output of the decoder, we use the mean squared error.
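The two loss terms can be combined as in the sketch below. It is illustrative only: the text does not say how the reconstruction and Kullback–Leibler terms are weighted, so the `kl_weight` parameter and the function name `vae_loss` are assumptions, as is the standard normal reference distribution in the Kullback–Leibler term.

```python
import numpy as np

def vae_loss(x, x_rec, mean, log_var, kl_weight=1.0):
    """Reconstruction MSE plus a weighted KL term for one cardiac cycle."""
    x = np.asarray(x, dtype=float)
    x_rec = np.asarray(x_rec, dtype=float)
    mean = np.asarray(mean, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    # Mean squared error between the input cycle and its reconstruction.
    mse = np.mean((x - x_rec) ** 2)
    # Closed-form KL between N(mean, diag(exp(log_var))) and N(0, I).
    kl = 0.5 * np.sum(np.exp(log_var) + mean ** 2 - 1.0 - log_var)
    return mse + kl_weight * kl
```

With a perfect reconstruction and an encoder output equal to the standard normal (zero means, zero log-variances), both terms vanish and the loss is zero.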

III Experimental results

As a training set, we use -second ECG signals of frequency Hz [8, 9]. We process them as described above and train our network on the obtained cardiac cycles. Examples of those cardiac cycles are presented in Figure 4.

Fig. 4: Examples of cardiac cycles in the training sets.

After training the network, we can test the decoder by supplying random numbers (generated according to the standard normal distribution) to its input. Examples of the results are given in Figure 5. These synthetic ECGs look quite natural.

Fig. 5: Examples of generated cardiac cycles.

To evaluate our results, we also calculated the Maximum Mean Discrepancy (MMD) metric (see [6]) on the set of generated ECGs. The MMD value is 0.00383. This is worse than the best MMD value obtained in [6] with a GAN. However, the two values are not directly comparable, since they were obtained on different training sets and for similar but different problems. Unfortunately, the papers [14, 1] do not report values of similar metrics that are applicable to our problem.
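For orientation, a common way to estimate the MMD between a set of real and a set of generated cardiac cycles is sketched below. This is an assumption-laden illustration rather than the procedure used here or in [6]: the Gaussian kernel, the `bandwidth` value, the biased estimator, and the choice to return the squared MMD are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Pairwise Gaussian (RBF) kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))

def mmd_squared(real, generated, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of signals.

    real, generated : 2-D arrays with one signal per row
    """
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)
    k_rr = gaussian_kernel(real, real, bandwidth).mean()
    k_gg = gaussian_kernel(generated, generated, bandwidth).mean()
    k_rg = gaussian_kernel(real, generated, bandwidth).mean()
    return k_rr + k_gg - 2.0 * k_rg
```

The estimate is zero when the two sets coincide and grows as the two sample distributions move apart, so lower values indicate generated signals that are closer to the real ones.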

Fig. 6: Examples of ECGs generated while one parameter is varied. Each column corresponds to a set of fixed features and one varying feature.

Interesting results were obtained when generating ECGs with one varying feature. Some generated ECG signals are presented in Figure 6. For each test, 24 features were fixed while the remaining feature was varied. It was possible to find a parameter responsible, for example, for the height of the T wave, the depression of the ST segment, etc. Thus, in many cases the extracted features can be interpreted, which also confirms the high quality of the constructed feature description.
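The feature-variation experiment can be sketched as follows. The trained decoder is not available here, so the example uses a stand-in: a fixed random linear map, with a hypothetical output length of 200 samples, which serves only to make the sketch runnable. The function name `traverse_feature` is also hypothetical.

```python
import numpy as np

def traverse_feature(decoder, base, index, values):
    """Decode copies of `base` in which only feature `index` is changed."""
    base = np.asarray(base, dtype=float)
    batch = np.tile(base, (len(values), 1))  # one row per tested value
    batch[:, index] = values                 # all other features stay fixed
    return decoder(batch)

# Stand-in for the trained decoder (an assumption, not the real network):
# a fixed random linear map from 25 features to a 200-sample signal.
rng = np.random.default_rng(0)
signal_length = 200  # hypothetical output length
weights = rng.standard_normal((25, signal_length))

def toy_decoder(z):
    return z @ weights

signals = traverse_feature(
    toy_decoder, np.zeros(25), index=3, values=np.linspace(-3.0, 3.0, 7)
)
```

Plotting the rows of `signals` on one axis shows how the generated cycle changes with the selected feature, which is how a feature can be linked to, for example, the height of the T wave.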

IV Conclusions and further research

In this paper, we proposed a neural network (variational autoencoder) architecture that is used to generate an ECG corresponding to a single cardiac cycle. Our method generates synthetic ECGs with a natural appearance, which can be used to augment the training sets in supervised learning problems involving ECGs. In addition, our method allowed us to extract new features that characterize the ECG accurately. Experiments show that the extracted features can in many cases be interpreted.

We plan to use our approach to generate the entire ECG, not just one cardiac cycle. We will also use the extracted features to improve the quality of automatic diagnosis of cardiovascular diseases.