Classification of Arrhythmia by Using Deep Learning with 2-D ECG Spectral Image Representation

05/14/2020 ∙ by Amin Ullah, et al. ∙ IEEE 21

The electrocardiogram (ECG) is one of the most extensively employed signals used in the diagnosis and prediction of cardiovascular diseases (CVDs). The ECG signals can capture the heart's rhythmic irregularities, commonly known as arrhythmias. A careful study of ECG signals is crucial for precise diagnoses of patients' acute and chronic heart conditions. In this study, we propose a two-dimensional (2-D) convolutional neural network (CNN) model for the classification of ECG signals into eight classes; namely, normal beat, premature ventricular contraction beat, paced beat, right bundle branch block beat, left bundle branch block beat, atrial premature contraction beat, ventricular flutter wave beat, and ventricular escape beat. The one-dimensional ECG time series signals are transformed into 2-D spectrograms through short-time Fourier transform. The 2-D CNN model consisting of four convolutional layers and four pooling layers is designed for extracting robust features from the input spectrograms. Our proposed methodology is evaluated on a publicly available MIT-BIH arrhythmia dataset. We achieved a state-of-the-art average classification accuracy of 99.11%, which is better than those of recently reported results in classifying similar types of arrhythmias. The performance is significant in other indices as well, including sensitivity and specificity, which indicates the success of the proposed method.




1 Introduction

Cardiovascular diseases (CVDs) are the leading cause of human death, with over 17 million people known to lose their lives annually due to CVDs mc2019cardiovascular . According to the World Heart Federation, three-fourths of all CVD deaths occur in the middle- and low-income segments of society 1 . A classification model that identifies CVDs at an early stage could effectively reduce the mortality rate by enabling timely treatment mustaqeem2020modular . One of the common sources of CVDs is cardiac arrhythmia, where heartbeats deviate from their regular beating pattern. A normal heart rate varies with age, body size, activity, and emotions. When the heartbeat feels too fast or too slow, the condition is known as palpitations. An arrhythmia does not necessarily mean that the heart is beating too fast or too slow; it indicates that the heart is following an irregular beating pattern. This could mean that the heart is beating too fast (tachycardia, more than 100 beats per minute (bpm)) or too slow (bradycardia, less than 60 bpm), skipping a beat, or, in extreme cases, cardiac arrest. Other common types of abnormal heart rhythms include atrial fibrillation, atrial flutter, and ventricular fibrillation. These deviations can be classified into various subclasses that represent different types of cardiac arrhythmia, and an accurate classification of these types could help in the diagnosis and treatment of heart disease patients. Automated detection of such patterns is of great significance in clinical practice, since detecting the known characteristics of cardiac arrhythmia otherwise requires expert clinical knowledge.

Electrocardiogram (ECG) recordings are widely used for diagnosing and predicting cardiac arrhythmia and, in turn, heart diseases. Towards this end, clinical experts might need to examine ECG recordings over a long period of time to detect cardiac arrhythmia. The ECG is a one-dimensional (1-D) signal representing a time series, which can be analyzed using machine learning techniques for automated detection of certain abnormalities. Recently, deep learning techniques have been developed that provide significant performance in radiological image analysis Irmakci et al. (2020); anwar2018medical . Convolutional neural networks (CNNs) have recently been shown to work for multi-dimensional (1-D, 2-D, and in certain cases, 3-D) inputs but were initially developed for problems dealing with images represented as two-dimensional inputs gu2018recent . For time series data, 1-D CNNs have been proposed but are less versatile than 2-D CNNs. Hence, representing the time series data in a 2-D format could benefit certain machine learning tasks wu2018comparison ; zhao2019speech . For ECG signals, a 2-D transformation therefore has to be applied to make the time series suitable for deep learning methods that require 2-D images as input. The short-time Fourier transform (STFT) can convert a 1-D signal into a 2-D spectrogram and encapsulate the time and frequency information within a single matrix. The 2-D spectrogram is similar to hyper-spectral and multi-spectral images (MSI), which have diverse applications in remote sensing and clinical diagnosis, including spectral un-mixing, ground cover classification and matching, mineral exploration, medical image classification, change detection, synthetic material identification, target detection, activity recognition, and surveillance 1a ; 1b ; 1c ; 1d ; 1e ; 1f ; 1g . The 2-D matrix of spectrogram coefficients could be useful for extracting robust features for the representation of a cardiac ECG signal salem2018ecg . This representation could allow the application of CNN architectures (designed to operate on 2-D inputs) for the development of automated systems related to CVDs.

1.1 Related Works

The ECG signal reveals abnormal conditions and malfunctions by recording the bio-electric potential variation of the human heart. Accurately detecting the clinical condition presented by an ECG signal is a challenging task mustaqeem2017statistical . Cardiologists therefore need to accurately identify the right kind of abnormal heartbeat ECG wave before recommending a particular treatment. This might require observing and analyzing ECG recordings that continue for hours (e.g., for patients in critical care). To overcome the challenge of visual and manual interpretation of the ECG signal, computer-aided diagnostic systems have been developed to identify such signals automatically anwar2018arrhythmia . Most of the research in this field has incorporated different machine learning (ML) techniques for the efficient identification and accurate examination of ECG signals mustaqeem2018multiclass ; mustaqeem2017wrapper . ECG signal classification based on different approaches has been presented in the literature, including frequency analysis 2 , artificial neural networks (ANNs) 3 , heuristic-based methods 4 , statistical methods 5 , support vector machines (SVMs) mustaqeem2018multiclass , wavelet transform 7 , filter banks 8 , hidden Markov models 9 , and mixture-of-expert methods 10 . An artificial neural network based method obtained an average accuracy of 90.6% for the classification of ECG waves into six classes 15 . Meanwhile, a feed-forward neural network was used as a classifier for the detection of four types of arrhythmia classes and achieved an average accuracy of 96.95% 16 .

Machine learning is a subset of artificial intelligence used with high-end diagnostic tools 17 ; 21 ; 21a ; 21b for the prediction and diagnosis of different types of illnesses 22 . Deep learning, as a subset of ML, has many applications in the prediction and prevention of fatal sicknesses, particularly CVDs. Different techniques of deep learning used for the analysis of bioinformatics signals have been presented in 13 ; 17 ; 19 . A recurrent neural network (RNN) was used for feature extraction and achieved an average accuracy of 98.06% for detecting four types of arrhythmia. For the classification and extraction of features from a 1-D ECG signal, a 1-D convolutional neural network model was proposed 24 and yielded a classification accuracy of 96.72%. Another deeper 1-D CNN model was proposed for the classification of the ECG dataset 25 and obtained an average accuracy of 97.03%. In both instances, a large ECG dataset was used, but the ECG signals were represented as a 1-D time series. A nine-layer 2-D CNN model was applied for an automatic classification of five different heartbeat arrhythmia types, achieving an accuracy of 94.03% 28 .

1.2 Our Contributions

Conventional techniques might not achieve efficient results due to the inter-patient variability in ECG signals chen2019smart . Additionally, the efficiency and accuracy of traditional methods could be negatively affected by the increasing size of data 11 ; 12 ; 13 . The techniques presented in the literature have been applied to smaller datasets; however, for the purpose of generalization, the performance should be tested on larger datasets. There are methods reported that use 2-D ECG signals salem2018ecg ; Xiong et al. (2017); however, to the best of our knowledge, no clear details are given on how the 1-D ECG signal is converted to 2-D images for use with 2-D CNN models. Most methods have been tested on only a few types of arrhythmia and must be evaluated on all major types of arrhythmia. It should be noted that the performance of methods developed for 1-D ECG signals can be further improved. Towards this end, the major contributions of our proposed work are:

  1. Spectrograms (2-D images) are employed, which are generated from the 1-D ECG signal using STFT. In addition, data augmentation was used for the 2-D image representation of ECG signals.

  2. A state-of-the-art performance was achieved in ECG arrhythmia classification by using the proposed CNN-based method with 2-D spectrograms as input.

The rest of the paper is organized as follows. The proposed algorithm is presented in detail in Section 2. The experiments conducted for the validation of the proposed scheme are presented in Section 3. Classification results are presented in Section 4, and conclusions in Section 5.

2 Proposed Scheme

A schematic representation of the proposed scheme is presented in Figure 1. The method consists of five steps, i.e., signal pre-processing, generation of 2-D images (spectrograms), augmentation of data, extraction of features from the data (using the CNN model), and its classification based on the extracted features. The details of these steps are presented in the following subsections.

Figure 1: Complete procedure of electrocardiogram (ECG) signal classification.

2.1 Pre-Processing

The three primary forms of noise in the ECG signal are power line interference, baseline drift, and electromyographic noise 34 . This noise must be removed from the original ECG signal to obtain a denoised signal for further processing. We combined wavelet-based thresholding with the reconstruction algorithm of wavelet decomposition to remove noise from the original ECG signal 35 . The wavelet thresholding maps each wavelet coefficient, indexed by its scale and shift, to an estimated coefficient using a threshold and a parameter whose value can be set arbitrarily. The wavelet thresholding reduced the electromyographic noise and power line interference. Moreover, the reconstruction algorithm of wavelet decomposition was used to remove the baseline drift from the noisy ECG signal.
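As a rough illustration of thresholding in the wavelet domain, the sketch below applies the standard soft-threshold rule to a list of coefficients. This is not the threshold function used in this work, whose exact form and adjustable parameter are not given here; the function name `soft_threshold`, the threshold value, and the coefficient values are assumptions made purely for illustration.

```python
def soft_threshold(coeffs, threshold):
    """Standard soft thresholding (an assumed stand-in, not the paper's rule).

    Coefficients with magnitude at or below the threshold are set to zero;
    larger ones are shrunk towards zero by the threshold.
    """
    out = []
    for w in coeffs:
        shrunk = abs(w) - threshold
        if shrunk <= 0:
            out.append(0.0)
        else:
            out.append(shrunk if w > 0 else -shrunk)
    return out


# Hypothetical detail coefficients from one decomposition level.
coeffs = [0.05, -0.8, 1.5, -0.1, 0.4]
denoised = soft_threshold(coeffs, threshold=0.3)
```

Small coefficients, which are more likely to carry noise, are removed, while large coefficients that carry the beat morphology are kept with reduced magnitude.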

2.2 Generation of 2-D Images

While 1-D CNN can be used for time series signals, the flexibility of such models is limited due to the use of 1-D kernels. On the other hand, 3-D CNNs require a large amount of training data and computational resources. In comparison, 2-D CNNs are more versatile since they use 2-D kernels and, hence, could provide representative features for time series data. Hence, for certain applications where sufficient data is available and for 1-D signals that can be represented in a 2-D format, using a 2-D CNN could be beneficial. Herein, for generating 2-D images to be used with the 2-D CNN model, the ECG signal was transformed into a 2-D representation. The 2-D time-frequency spectrograms were generated using the short-time Fourier transform. The ECG signal represents non-stationary data where the instantaneous frequency varies with time. Hence, such changes cannot be fully represented by just using information in the frequency domain. The STFT is a method derived from the discrete Fourier transform to analyze instantaneous frequency as well as the instantaneous amplitude of a localized wave with time-varying characteristics. In the analysis of a non-stationary signal, it is assumed that the signal is approximately stationary within the span of a temporal window of finite support. The 1-D ECG signals were converted into 2-D spectrogram images by applying STFT as follows,


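$$X_{\mathrm{STFT}}[m, n] = \sum_{k=0}^{L-1} x[k]\, g[k - m]\, e^{-j 2 \pi n k / L},$$

where $L$ is the window length, $g[\cdot]$ is the window function, and $x[k]$ is the input ECG signal. The log values of $X_{\mathrm{STFT}}[m, n]$ are represented as spectrogram (256 × 256) images.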
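The conversion of a 1-D signal into a log-magnitude spectrogram can be sketched in plain Python as below. This is a minimal sketch under stated assumptions: the Hann window, the window length of 64, the hop of 32, and the synthetic 45 Hz test tone are all illustrative choices that the text does not specify, and the function name `stft_log_spectrogram` is hypothetical.

```python
import cmath
import math


def stft_log_spectrogram(signal, window_length, hop):
    """Return one list of log-magnitudes (per frequency bin) for each frame."""
    # Hann window: an assumption, the text does not name the window function.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / (window_length - 1))
              for k in range(window_length)]
    n_bins = window_length // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - window_length + 1, hop):
        segment = [signal[start + k] * window[k] for k in range(window_length)]
        frame = []
        for n in range(n_bins):
            coeff = sum(segment[k] * cmath.exp(-2j * math.pi * n * k / window_length)
                        for k in range(window_length))
            # Small offset avoids log(0) for empty bins.
            frame.append(math.log(abs(coeff) + 1e-10))
        frames.append(frame)
    return frames


fs = 360  # sampling rate of the MIT-BIH recordings, in samples per second
tone = [math.sin(2 * math.pi * 45 * t / fs) for t in range(720)]
spec = stft_log_spectrogram(tone, window_length=64, hop=32)
peak_bin = max(range(len(spec[0])), key=lambda n: spec[0][n])
```

With these settings the 45 Hz tone falls on frequency bin 45 × 64 / 360 = 8, which is where the first frame peaks. A production pipeline would normally use an FFT-based routine rather than this direct summation.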

2.3 Data Augmentation

Another significant advantage of using 2-D CNN models is the flexibility it provides in terms of data augmentation. For 1-D ECG signals, data augmentation could change the meaning of the data and hence is not beneficial. However, with 2-D spectrograms, the CNN model can learn the data variations, and augmentation helps in increasing the amount of data available for training. The ECG data is highly imbalanced, where most of the instances represent the normal class. In this scenario, data augmentation can help when those classes that are underrepresented are augmented. For arrhythmia classification using ECG signals, augmenting training data manually could degrade the performance. Moreover, classification algorithms such as SVM, fast Fourier neural network, and tree-based algorithms, assume that the classification of a single image based representation of an ECG signal is always the same 26 . The proposed CNN model works on 2-D images of ECG signals as input data, which allows changing the image size with operations such as cropping. Such augmentation methods would add to the training data and hence would allow better training of the CNN model. Another important issue that arises when using small data with CNN based architectures is overfitting. Data augmentation is a way to deal with overfitting and allows better training of a CNN model. For imbalanced data, data augmentation can help in maintaining a balance between different classes. We have used the cropping method for the augmentation of seven classes of ECG beats; namely, premature ventricular contraction beat (PVC), paced beat (PAB), right bundle branch block beat (RBB), left bundle branch block beat (LBB), atrial premature contraction beat (APC), ventricular flutter wave (VFW), and ventricular escape beat (VEB). These are common types of cardiac arrhythmias and are considered in studies we have used for comparison (refer to the Discussion section). 
While other augmentation methods, such as warping, are used in image processing applications, the aim here is to augment classes that are under-represented. Towards this end, nine different cropping operations (left top, center top, right top, left center, center, right center, left bottom, center bottom, right bottom) were applied. As a result of cropping, we obtain multiple ECG spectrograms of reduced size (200 × 200), which are then resized to 256 × 256 images (using linear interpolation) before being fed into the CNN. This resulted in an eight times increase in the training data, which benefited the training process.
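The cropping step can be sketched as follows for the nine listed positions, using nested lists in place of image arrays. The helper names `nine_crops` and `resize_bilinear`, the corner-aligned form of the interpolation, and the synthetic test image are assumptions for illustration; the text only states that crops of 200 × 200 are resized to 256 × 256 with linear interpolation.

```python
def nine_crops(image, crop_size):
    """Crops anchored on a 3 x 3 grid: top/center/bottom by left/center/right."""
    height, width = len(image), len(image[0])
    row_offsets = [0, (height - crop_size) // 2, height - crop_size]
    col_offsets = [0, (width - crop_size) // 2, width - crop_size]
    crops = []
    for r in row_offsets:
        for c in col_offsets:
            crops.append([row[c:c + crop_size] for row in image[r:r + crop_size]])
    return crops


def resize_bilinear(image, size):
    """Resize a square image to size x size with corner-aligned bilinear interpolation."""
    src = len(image)
    scale = (src - 1) / (size - 1)
    out = []
    for i in range(size):
        y = i * scale
        y0 = int(y)
        y1 = min(y0 + 1, src - 1)
        fy = y - y0
        row = []
        for j in range(size):
            x = j * scale
            x0 = int(x)
            x1 = min(x0 + 1, src - 1)
            fx = x - x0
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# Synthetic 256 x 256 "spectrogram" whose pixel value encodes its position.
image = [[r * 256 + c for c in range(256)] for r in range(256)]
crops = nine_crops(image, 200)
resized = resize_bilinear(crops[0], 256)
```

Each crop keeps most of the time-frequency content of the beat while shifting it within the frame, which is what lets under-represented classes be expanded without altering the signal itself.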

2.4 Deep Neural Network

In this study, a CNN-based model is proposed for the automatic classification of arrhythmia using the ECG signal in a supervised manner. The ECG data used in the study have corresponding labels (ground truth) identifying the type of arrhythmia present. These labels were assigned by expert cardiologists and are used for supervised training. For each heartbeat segment, the arrhythmia class label was transferred to the corresponding spectrogram image representation. The first CNN-based algorithm, introduced in 1989 36 , was developed for the recognition of handwritten zip codes. Since then, multiple CNN models have been proposed for the classification of images, among which AlexNet 24 has achieved significant performance for a variety of images. Existing feed-forward neural networks were not well suited to the automatic classification of 2-D images, since these methods do not take local spatial information into account. However, with the development of CNN architectures and the use of nonlinear filters, spatially adjacent pixels can be correlated to extract local features from the 2-D image. In the 2-D convolution algorithm, the downsampling layer is highly desirable for extracting and filtering the spatial vicinity of the 2-D ECG images. For these reasons, the ECG signal was transformed into a 2-D representation, and a 2-D CNN algorithm was used for classification. Consequently, high accuracy was obtained in the automatic classification of the ECG waves. The details of the proposed CNN model are presented in the section on deep neural network parameters.


Figure 2: The architecture of the proposed convolutional neural network (CNN) model.

3 Experiments

3.1 Dataset

The MIT-BIH arrhythmia dataset consists of 48 records, each with an approximate duration of 30 minutes, recorded from a two-channel ambulatory system and collected between 1975 and 1979 moody2001impact . Twenty-three recordings were selected at random from 4000 long-term Holter recordings obtained from a mixed population of inpatients (60%) and outpatients (40%). Twenty-five recordings were chosen from the same set, with a focus on complex ventricular, junctional, and supraventricular arrhythmias. The recordings were digitized at 360 samples per second for each channel, with a resolution of 11 bits over a range of 10 mV. At least two cardiologists annotated each record and resolved their disagreements to reach the computer-readable annotations. In total, approximately 110,000 annotations are documented in this database. The data is publicly available for download.

3.2 Deep Neural Network Parameters

The performance of the proposed CNN algorithm was compared with the AlexNet and VGGNet architectures 26 in terms of ECG arrhythmia classification. The normal beat (NOR) and seven types of cardiac arrhythmia (VFW, PVC, VEB, RBB, LBB, PAB, and APC) were selected from the MIT-BIH arrhythmia database. Although the data is annotated with eighteen different classes, some of the classes have extremely low representation. The selected eight types are more common (and hence have acceptable representation in the ground truth data) and are also used by the methods we evaluated for comparison. The architecture of the CNN model used in our experiments is shown in Figure 2. A detailed representation of the layers within the model is presented in Table 1. The model follows the CNN architecture with four 2-D convolutional layers, each followed by a pooling layer. A fully connected layer between the last pooling layer and the output layer represents the features learned by the CNN model. The output layer is a softmax layer with eight neurons that gives the final classification.

| Layer | Type | Filter Size | Stride | Kernels | Input Size | Parameters |
|---|---|---|---|---|---|---|
| Layer 1 | Conv2-D | 3 × 3 | 1 | 64 | 256 × 256 × 1 | 576 |
| Layer 2 | Pooling | 2 × 2 | 2 | - | 256 × 256 × 64 | - |
| Layer 3 | Conv2-D | 3 × 3 | 1 | 128 | 128 × 128 × 64 | 73,728 |
| Layer 4 | Pooling | 2 × 2 | 2 | - | 128 × 128 × 128 | - |
| Layer 5 | Conv2-D | 3 × 3 | 1 | 256 | 64 × 64 × 128 | 294,912 |
| Layer 6 | Pooling | 2 × 2 | 2 | - | 64 × 64 × 256 | - |
| Layer 7 | Conv2-D | 3 × 3 | 1 | 512 | 32 × 32 × 256 | 1,179,648 |
| Layer 8 | Pooling | 2 × 2 | 2 | - | 32 × 32 × 512 | - |
| Layer 9 | Fully Connected | - | - | 4096 | 16 × 16 × 512 | 2,097,152 |
| Layer 10 | Output Layer | - | - | 8 | 4096 | 32,776 |

Table 1: Details of the layers used in the proposed CNN model architecture.
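The convolutional and output-layer parameter counts in Table 1 can be checked with a few lines of arithmetic. The sketch below assumes that the convolutional counts exclude bias terms and that the output layer includes them, since those assumptions reproduce the tabulated values; the helper name `conv_weights` is hypothetical.

```python
def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in a 2-D convolution, excluding biases (assumed)."""
    return kernel_h * kernel_w * in_channels * out_channels


size, channels = 256, 1  # 256 x 256 single-channel spectrogram input
conv_params = []
for filters in (64, 128, 256, 512):
    conv_params.append(conv_weights(3, 3, channels, filters))
    channels = filters
    size //= 2  # each 2 x 2 pooling layer with stride 2 halves the spatial size

# 8-way output layer fed by 4096 features: weights plus one bias per class.
output_params = 4096 * 8 + 8
```

This yields 576, 73,728, 294,912, and 1,179,648 weights for the four convolutional layers, a 16 × 16 spatial grid after the last pooling layer, and 32,776 parameters for the output layer. The fully connected entry (2,097,152) equals 512 × 4096; how the 16 × 16 grid is reduced ahead of that layer is not stated, so it is not modelled here.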

3.3 Experimental Setup

The proposed CNN classifier was implemented in Python with the open source library TensorFlow 38 , which was developed by Google for deep learning. Substantial computational power and training time were needed to train the CNN model. The experimental setup consisted of an eighth-generation ASUS server with 32 GB internal RAM, a 500 GB external SSD in addition to the internal hard drive, and an NVIDIA 1080 GPU with 11 GB memory. The 2-D spectral images were divided such that 70% of the data was used for training and 30% for testing. A 5-fold cross validation was used during the training process. The train/test splits were generated such that there was no overlap between the two splits.
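A non-overlapping 70/30 split can be sketched with the standard library as below. The function name `split_indices`, the fixed seed, and the sample count of 1000 are assumptions for illustration; the text does not describe how the split was implemented.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices once and cut them into disjoint train/test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = int(round(n_samples * train_fraction))
    return indices[:cut], indices[cut:]


train_idx, test_idx = split_indices(1000)
```

One way to keep the two splits free of overlap when augmentation is used is to split the original beats first and crop only the training portion, so that no crop of a test beat appears in training. Whether the experiments followed this order is not stated.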

3.4 Cost Function

The cost function measures the error of the CNN model between the estimated value and the actual or desired value. An optimizer function was used to minimize this error. Different cost functions have been used in neural network theory. In our experiments, we used the cross-entropy function, which is given as,

$$C = -\frac{1}{n} \sum_{i=1}^{n} \sum_{c=1}^{N} y_{i,c} \ln a_{i,c},$$

where $C$ represents the cost that needs to be minimized, $n$ is the number of training points, $y$ is the expected or target value, $N$ is the total number of classes, $c$ is the class index, and $a$ is the actual value. A gradient descent algorithm was used as the optimizer function to reduce the error of the cost function; the choice of learning rate is discussed in Section 4.1. Specifically, the Adam optimizer was used in the experiments for training the proposed CNN model, and it reached the optimal point in fewer iterations.
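For concreteness, the sketch below evaluates the standard multi-class cross-entropy on one-hot targets and predicted class probabilities. It assumes the categorical form of the loss (natural logarithm, averaged over training points); the function name `cross_entropy`, the `eps` guard, and the example values are illustrative assumptions.

```python
import math


def cross_entropy(targets, predictions, eps=1e-12):
    """Mean categorical cross-entropy over a batch.

    targets: one-hot lists, one per sample.
    predictions: predicted class probabilities, one list per sample.
    eps guards against log(0).
    """
    total = 0.0
    for y, a in zip(targets, predictions):
        for y_c, a_c in zip(y, a):
            total -= y_c * math.log(a_c + eps)
    return total / len(targets)


# Two hypothetical samples with three classes each.
targets = [[1, 0, 0], [0, 1, 0]]
predictions = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
loss = cross_entropy(targets, predictions)
```

Only the probability assigned to the true class contributes for one-hot targets, so the loss falls towards zero as that probability approaches one.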

3.5 Evaluation Parameters

Four evaluation metrics were used in this study, including accuracy, precision, sensitivity, and specificity. The accuracy for the multi-class problem was evaluated as,

$$A = \frac{1}{N} \sum_{c=1}^{N} \frac{TP_c + TN_c}{TP_c + TN_c + FP_c + FN_c},$$

where $TP_c$ denotes the true positives, $FP_c$ represents the false positives, $TN_c$ represents the true negatives, and $FN_c$ represents the false negatives, $c$ represents the class index, and $N$ represents the total number of classes. The accuracy ($A$) represents the ratio of the correctly classified instances to the total number of instances. The precision ($P$) and sensitivity ($Sen$) were calculated as,

$$P = \frac{1}{N} \sum_{c=1}^{N} \frac{TP_c}{TP_c + FP_c}, \qquad Sen = \frac{1}{N} \sum_{c=1}^{N} \frac{TP_c}{TP_c + FN_c}.$$

The specificity ($Sp$), also known as the true negative rate, was calculated as,

$$Sp = \frac{1}{N} \sum_{c=1}^{N} \frac{TN_c}{TN_c + FP_c}.$$

The F1 score was calculated using the precision ($P$) and recall ($Sen$) as,

$$F1 = \frac{2 \, P \, Sen}{P + Sen}.$$
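These metrics can be computed from a confusion matrix by treating each class in turn as the positive class, as in the sketch below. Averaging the per-class values is an assumption suggested by the "average" figures reported in the results; the function name `classification_metrics` and the 3-class confusion matrix are hypothetical.

```python
def classification_metrics(confusion):
    """Per-class one-vs-rest metrics and their class averages.

    confusion[i][j] is the number of class-i samples predicted as class j.
    Assumes every class has at least one true and one predicted sample.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    per_class = []
    for c in range(n_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp
        tn = total - tp - fn - fp
        precision = tp / (tp + fp)
        sensitivity = tp / (tp + fn)
        per_class.append({
            "accuracy": (tp + tn) / total,
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": tn / (tn + fp),
            "f1": 2 * precision * sensitivity / (precision + sensitivity),
        })
    averages = {key: sum(m[key] for m in per_class) / n_classes
                for key in per_class[0]}
    return per_class, averages


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
confusion = [[50, 2, 0], [3, 45, 2], [1, 1, 48]]
per_class, averages = classification_metrics(confusion)
```

For the first class this gives 50 true positives, 2 false negatives, 4 false positives, and 96 true negatives, hence a specificity of 0.96.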


4 Classification Results and Discussion

4.1 Results

The two significant optimization parameters in the proposed 2-D CNN model are the learning rate and the batch size. To obtain the best accuracy in the automatic classification of arrhythmia using ECG signals, these two parameters must be selected carefully. The proposed model was evaluated in different experiments with various values of these parameters. For a small learning rate (i.e., less than 0.0005), convergence was very slow. When the learning rate was large (i.e., greater than 0.001), convergence was faster, but irregular changes were observed in the accuracy. Hence, we selected a learning rate of 0.001, as this value attained the best accuracy for the proposed model, as shown in Table 3.

| Learning Rate | Batch Size | Average Accuracy (%) |
|---|---|---|
| 0.001 | 2800 | 99.11 |
| 0.001 | 2000 | 98.96 |
| 0.001 | 1000 | 99.00 |
| 0.001 | 500 | 98.95 |
| 0.001 | 100 | 98.93 |

Table 2: Batch sizes and average accuracy for a learning rate of 0.001.

Similar to the learning rate, the batch size also greatly affected the behavior and accuracy of the model. When the batch size was 1000, the accuracy showed abnormally large fluctuations during convergence. When the batch size was set to 2000, the accuracy increased but did not reach a stable state. When the batch size was further increased to 2800, the accuracy of the proposed model was the highest and reached a stable state. The results are summarized in Table 2.

| Batch Size | Learning Rate | Average Accuracy (%) |
|---|---|---|
| 2800 | 0.001 | 99.11 |
| 2800 | 0.005 | 98.84 |
| 2800 | 0.100 | 98.89 |
| 2800 | 0.200 | 98.91 |

Table 3: Learning rate and average accuracy for a batch size of 2800.

A detailed performance comparison between the proposed 2-D CNN model and other CNN models (including VGGNet and AlexNet) is presented using confusion matrices for all eight classes. The diagonal elements show the correctly classified classes, whereas anything off diagonal represents an incorrect classification. For the 2-D ECG data used in experiments, results are presented for the VGGNet (Figure 3), AlexNet (Figure 4), and the proposed model (Figure 5). The average accuracy of these three models is presented by averaging the diagonal values.

Figure 3: Confusion matrix for VGGNet.
Figure 4: Confusion matrix for AlexNet.
Figure 5: Confusion matrix for the proposed 2-D CNN based classification model.

4.2 Discussion

Table 4 compares the performance of the proposed CNN algorithm with other methods for classifying arrhythmia from ECG signals. The terms 'native' and 'augmented' in Table 4 denote the training set without and with data augmentation, respectively. A direct comparison of our proposed CNN model with existing techniques may not be fully appropriate, however, because of variations in the training and testing datasets, the size of the ECG dataset used for experiments, the architecture of the CNN models used, and the number of arrhythmia types used for classification. It should be noted that various methods have used 1-D data directly for the classification of arrhythmia 42 ; 43 ; 44 ; 45 ; 46 ; 48 ; 49 . Among these methods, 1-D CNN models have been proposed with a lower classification accuracy (46: 96.40% and 49: 93.60%) than the proposed model. We also used 1-D ECG signals as input to the CNN model used in the experiments and achieved a classification accuracy of 97.80%. In recent years, 2-D CNN models have also been used, by converting the 1-D ECG signals to a 2-D representation, with noticeable performance salem2018ecg . Towards this end, the proposed model was based on a 2-D representation of the ECG data to efficiently apply 2-D CNN models and benefit from the flexibility of data augmentation in such methods.

The proposed 2-D CNN model attained better accuracy, sensitivity, and specificity (in eight-class classification) than the FFNN 39 model, which classified only four kinds of arrhythmia. The VGGNet model performed worse than the proposed model despite being a deeper network. One reason for this observation could be the deeper architecture of VGGNet combined with the limited training data. These results indicate that the proposed CNN model has state-of-the-art accuracy for the automatic classification of arrhythmia among the compared CNN-based algorithms. The varying performance among the compared CNN models is due to the differences in their architectures and in the number of convolution filters used. In the proposed CNN model, we employed four convolutional layers, four downsampling (pooling) layers, and one fully connected layer. In the AlexNet model, six convolutional layers, three downsampling layers, and two fully connected layers were used, while the VGGNet model entailed ten convolutional layers, four downsampling layers, and two fully connected layers. Adding a convolutional or a downsampling layer to a CNN architecture also increases the computational resources and the time needed for training and testing, which is the main reason for using a carefully selected CNN model. Since we have a limited amount of data, deeper networks (such as DenseNet or ResNet) would not be expected to perform well within the scope of this problem. The proposed model can be trained on other classes of arrhythmia, although we did not perform this analysis so that we could compare our work with published results that use a 2-D representation of ECG data.

| Model | Native/Augmented | Classes | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F1 Score |
|---|---|---|---|---|---|---|---|
| FFNN 39 | | 4 | 96.94 | 96.31 | 97.78 | - | - |
| PNN 40 | | 8 | 98.71 | - | 99.65 | - | - |
| SVM 41 | | 6 | 91.67 | 93.83 | 90.49 | - | - |
| RNN 42 | | 4 | 98.06 | 98.15 | 97.78 | - | - |
| LS-SVM 43 | | 3 | 95.82 | 86.16 | 99.17 | 97.01 | 0.91 |
| RFT 44 | | 3 | 92.16 | - | - | - | - |
| KNN 45 | | 17 | 97.00 | 96.60 | 95.80 | - | - |
| 1-D CNN 46 | | 5 | 96.40 | 68.80 | 99.50 | 79.20 | 0.73 |
| AlexNet 26 | Augmented | 8 | 98.85 | 97.08 | 99.62 | 98.59 | 0.97 |
| AlexNet 26 | Native | 8 | 98.81 | 96.81 | 99.68 | 98.63 | 0.97 |
| VGGNet 26 | Augmented | 8 | 98.63 | 96.93 | 99.37 | 97.86 | 0.97 |
| VGGNet 26 | Native | 8 | 98.77 | 97.26 | 99.43 | 98.08 | 0.97 |
| 2-D CNN 48 | | 5 | 97.42 | - | - | - | - |
| 1-D CNN 49 | | 7 | 93.60 | - | - | - | - |
| Proposed (1-D) | Native | 8 | 97.80 | - | - | - | - |
| Proposed (2-D) | Augmented | 8 | 99.11 | 97.91 | 99.61 | 98.58 | 0.98 |
| Proposed (2-D) | Native | 8 | 98.92 | 97.26 | 99.67 | 98.69 | 0.98 |

Table 4: Comparison of the proposed model with state-of-the-art ECG classification techniques.

We compared the proposed CNN-based model with recent techniques for the automatic classification of arrhythmia (Table 4), where the algorithm achieved 97.88% average sensitivity, 99.61% specificity, 99.11% average accuracy, and 98.59% positive predictive value (precision). These values indicate improved performance when compared with recent methods using 1-D and 2-D CNNs for the same arrhythmia classification task. The results also show that the proposed CNN algorithm achieves better accuracy both with and without augmented data. The proposed model attained the highest sensitivity among all the compared CNN algorithms. It is pertinent to note that detecting these cardiac arrhythmias is a labor-intensive task, where a clinical expert needs to carefully observe recordings that can last for hours. With such automated methods, the artificially intelligent system could augment the performance of clinical experts by detecting these patterns and directing the observer to look more closely at regions of greater significance. This would ultimately improve the clinical diagnosis and treatment of some of the major CVDs.

5 Conclusions

In this study, we proposed a 2-D CNN-based classification model for automatic classification of cardiac arrhythmias using ECG signals. An accurate taxonomy of ECG signals is extremely helpful in the prevention and diagnosis of CVDs. Deep CNN has proven useful in enhancing the accuracy of diagnosis algorithms in the fusion of medicine and modern machine learning technologies. The proposed CNN-based classification algorithm, using 2-D images, can classify eight kinds of arrhythmia, namely, NOR, VFW, PVC, VEB, RBB, LBB, PAB, and APC, and it achieved 97.91% average sensitivity, 99.61% specificity, 99.11% average accuracy, and 98.59% positive predictive value (precision). These results indicate that the prediction and classification of arrhythmia with 2-D ECG representation as spectrograms and the CNN model is a reliable operative technique in the diagnosis of CVDs. The proposed scheme can help experts diagnose CVDs by referring to the automated classification of ECG signals. The present research uses only a single-lead ECG signal. The effect of multiple lead ECG data to further improve experimental cases will be studied in future work.

Conceptualization, A.U and M.A; Methodology, A.U, R.M, M.A; Validation, A.U, R.M and M.B; Formal Analysis, A.U, M.A; Writing- Original Draft Preparation, A.U, R.M, M.B, M.A; Writing- Review & Editing, M.A, A.U; Supervision, M.A; Funding Acquisition, R.M

This research was funded by the Xiamen University Malaysia Research Fund (XMUMRF) (Grant No: XMUMRF/2019-C3/IECE/0007).

The authors thank Prof. Ulas Bagci (Center for Research in Computer Vision (CRCV) Laboratory, University of Central Florida (UCF), Orlando, Florida, USA) for his valuable advice. This work was supported by the ASRTD at University of Engineering and Technology, Taxila and Xiamen University Malaysia (XMUM). The authors declare that there is no conflict of interest regarding this publication.

References


  • (1) Mc Namara, K.; Alzubaidi, H.; Jackson, J.K. Cardiovascular disease as a leading cause of death: How are pharmacists getting involved? Integr. Pharm. Res. Pract. 2019, 8, 1.
  • (2) Lackland, D.T.; Weber, M.A. Global burden of cardiovascular disease and stroke: hypertension at the core. Can. J. Cardiol. 2015, 31, 569–571.
  • (3) Mustaqeem, A.; Anwar, S.M.; Majid, M. A modular cluster based collaborative recommender system for cardiac patients. Artif. Intell. Med. 2020, 102, 101761.
  • (4) Irmakci, I.; Anwar, S.M.; Torigian, D.A.; Bagci, U. Deep Learning for Musculoskeletal Image Analysis. arXiv Prepr. 2020, arXiv:2003.00541.
  • (5) Anwar, S.M.; Majid, M.; Qayyum, A.; Awais, M.; Alnowami, M.; Khan, M.K. Medical image analysis using convolutional neural networks: A review. J. Med. Syst. 2018, 42, 226.
  • (6) Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377.
  • (7) Wu, Y.; Yang, F.; Liu, Y.; Zha, X.; Yuan, S. A comparison of 1-D and 2-D deep convolutional neural networks in ECG classification. arXiv Prepr. 2018, arXiv:1810.07088.
  • (8) Zhao, J.; Mao, X.; Chen, L. Speech emotion recognition using deep 1D & 2-D CNN LSTM networks. Biomed. Signal Process. Control 2019, 47, 312–323.
  • (9) Ortega, S.; Fabelo, H.; Iakovidis, D.K.; Koulaouzidis, A.; Callico, G.M. Use of hyperspectral/multispectral imaging in gastroenterology. Shedding some–different–light into the dark. J. Clin. Med. 2019, 8, 36.
  • (10) Feng, Y.-Z.; Sun, D.-W. Application of Hyperspectral Imaging in Food Safety Inspection and Control: A Review. Crit. Rev. Food Sci. Nutr. 2012, 52, 1039–1058.
  • (11) Lorente, D.; Aleixos, N.; Gómez-Sanchis, J.; Cubero, S.; García-Navarrete, O.L.; Blasco, J. Recent Advances and Applications of Hyperspectral Imaging for Fruit and Vegetable Quality Assessment. Food Bioprocess Technol. 2011, 5, 1121–1142.
  • (12) Tatzer, P.; Wolf, M.; Panner, T. Industrial application for inline material sorting using hyperspectral imaging in the NIR range. Real-Time Imaging 2005, 11, 99–107.
  • (13) Kubik, M. Chapter 5 Hyperspectral Imaging: A New Technique for the Non-Invasive Study of Artworks. Phys. Tech. Study Art Archaeol. Cult. Herit. 2007, 2, 199–259.
  • (14) Hassan, H.; Bashir, A. K.; Abbasi, R.; Ahmad, W.; Luo, B. Single image defocus estimation by modified gaussian function. Trans. Emerg. Telecommun. Technol. 2019, 30, 3611.
  • (15) Ahmad, M.; Bashir, A.K.; Khan, A.M. Metric similarity regularizer to enhance pixel similarity performance for hyperspectral unmixing. Optik 2017, 140, 86–95.
  • (16) Salem, M.; Taheri, S.; Yuan, J.S. ECG arrhythmia classification using transfer learning from 2-dimensional deep CNN features. In Proceedings of the 2018 IEEE Biomedical Circuits and Systems Conference (BioCAS), Cleveland, Ohio, USA, 17–19 October 2018; IEEE: 2018; pp. 1–4.

  • (17) Mustaqeem, A.; Anwar, S.M.; Khan, A.R.; Majid, M. A statistical analysis based recommender model for heart disease patients. Int. J. Med. Inform. 2017, 108, 134–145.
  • (18) Anwar, S.M.; Gul, M.; Majid, M.; Alnowami, M. Arrhythmia Classification of ECG Signals Using Hybrid Features. Comput. Math. Methods Med. 2018.
  • (19) Mustaqeem, A.; Anwar, S.M.; Majid, M. Multiclass classification of cardiac arrhythmia using improved feature selection and SVM invariants. Comput. Math. Methods Med. 2018.
  • (20) Mustaqeem, A.; Anwar, S.M.; Majid, M.; Khan, A.R. Wrapper method for feature selection to classify cardiac arrhythmia. In Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju Island, Korea, 11–15 July 2017; IEEE: 2017; pp. 3656–3659.
  • (21) Minami, K. I.; Nakajima, H.; Toyoshima, T. Real-time discrimination of ventricular tachyarrhythmia with Fourier-transform neural network. IEEE Trans. Biomed. Eng. 1999, 46, 179–185.
  • (22) Coast, D.A.; Stern, R.M.; Cano, G.G.; Briller, S.A. An approach to cardiac arrhythmia analysis using hidden markov models. IEEE Trans. Biomed. Eng. 1990, 37, 826–836.
  • (23) Osowski, S.; Hoai, L. T.; Markiewicz, T. Support vector machine based expert system for reliable heartbeat recognition. IEEE Trans. Biomed. Eng. 2004, 51, 582–589.
  • (24) Willems, J. L.; Lesaffre, E. Comparison of multigroup logistic and linear discriminant ecg and vcg classification. J. Electrocardiol. 1987, 20, 83–92.
  • (25) Hu, Y.H.; Tompkins, W.J.; Urrusti, J.L.; Afonso, V.X. Applications of artificial neural networks for ECG signal detection and classification. J. Electrocardiol. 1993, 26, 66–73.
  • (26) Trahanias, P.; Skordalakis, E. Syntactic pattern recognition of the ECG. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 648–657.
  • (27) Inan, O.T.; Giovangrandi, L.; Kovacs, G.T. Robust neural-network-based classification of premature ventricular contractions using wavelet transform and timing interval features. IEEE Trans. Biomed. Eng. 2006, 53, 2507–2515.
  • (28) Hu, Y.H.; Palreddy, S.; Tompkins, W.J. A patient-adaptable ECG beat classifier using a mixture of experts approach. IEEE Trans. Biomed. Eng. 1997, 44, 891–900.
  • (29) Dehan, L.; Guanggui, X.U.; Yuhua, Z.; Hosseini, H.G. Novel ECG diagnosis model based on multi-stage artificial neural networks. Chin. J. Sci. Instrum. 2008, 29, 27.
  • (30) Ceylan, R.; Ozbay, Y. Comparison of FCM, PCA and WT techniques for classification ECG arrhythmias using artificial neural network. Expert Syst. Appl. 2007, 33, 286–295.
  • (31) Polat, K.; Günes, S. Breast cancer diagnosis using least square support vector machine. Digit. Signal Process. 2007, 17, 694–701.
  • (32) Dreiseitl, S.; Ohno-Machado, L.; Kittler, H.; Vinterbo, S.; Billhardt, H.; Binder, M. A comparison of machine learning methods for the diagnosis of pigmented skin lesions. J. Biomed. Inform. 2001, 34, 28–36.
  • (33) Shafiq, M.; Yu, X.; Bashir, A. K.; Chaudhry, H. N.; Wang, D. A machine learning approach for feature selection traffic classification using security analysis. J. Supercomput. 2018, 74, 4867–4892.
  • (34) Bashir, A.K.; Arul, R.; Basheer, S.; Raja, G.; Jayaraman, R.; Qureshi, N.M.F. An optimal multitier resource allocation of cloud RAN in 5G using machine learning. Trans. Emerg. Telecommun. Technol. 2019, 30, 3627.
  • (35) Kononenko, I. Machine learning for medical diagnosis: History, state of the art and perspective. Artif. Intell. Med. 2001, 23, 89–109.
  • (36) Ecar, A. Recommended practice for testing and reporting performance results of ventricular arrhythmia detection algorithms. Assoc. Adv. Med. Instrum. 1987, 69.
  • (37) Huertas-Fernandez, I.; Garcia-Gomez, F.J.; Garcia-Solis, D.; Benitez-Rivero, S.; Marin-Oyaga, V.A.; Jesus, S.; Mir, P. Machine learning models for the differential diagnosis of vascular parkinsonism and Parkinson’s disease using [123I]FP-CIT SPECT. Eur. J. Nucl. Med. Mol. Imaging 2015, 42, 112–119.
  • (38) Salvatore, C.; Cerasa, A.; Battista, P.; Gilardi, M.C.; Quattrone, A.; Castiglioni, I. Magnetic resonance imaging biomarkers for the early diagnosis of Alzheimer’s disease: A machine learning approach. Front. Neurosci. 2015, 9, 307.
  • (39) Kiranyaz, S.; Ince, T.; Gabbouj, M. Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Trans. Biomed. Eng. 2015, 63, 664–675.
  • (40) Rajpurkar, P.; Hannun, A.Y.; Haghpanahi, M.; Bourn, C.; Ng, A.Y. Cardiologist-level arrhythmia detection with convolutional neural networks. arXiv Prepr. 2017, arXiv:1707.01836.
  • (41) Acharya, U.R.; Oh, S.L.; Hagiwara, Y.; Tan, J.H.; Adam, M.; Gertych, A.; San Tan, R. A deep convolutional neural network model to classify heartbeats. Comput. Biol. Med. 2017, 89, 389–396.
  • (42) Chen, J.; Valehi, A.; Razi, A. Smart Heart Monitoring: Early Prediction of Heart Problems Through Predictive Analysis of ECG Signals. IEEE Access 2019, 7, 120831–120839.
  • (43) Lee, S.C. Using a translation-invariant neural network to diagnose heart arrhythmia. In Advances in Neural Information Processing Systems, Morgan Kaufmann, USA; 1990; pp. 240–247.
  • (44) De Chazal, P.; Reilly, R.B. A patient-adapting heartbeat classifier using ECG morphology and heartbeat interval features. IEEE Trans. Biomed. Eng. 2006, 53, 2535–2543.
  • (45) Xiong, Z.; Stiles, M.K.; Zhao, J. Robust ECG signal classification for detection of atrial fibrillation using a novel neural network. In Proceedings of the 2017 Computing in Cardiology (CinC), Rennes, France, 24–27 September 2017; IEEE: 2017; pp. 1–4.
  • (46) Clevert, D.A.; Unterthiner, T.; Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). arXiv Prepr. 2015, arXiv:1511.07289.
  • (47) Li, D.; Zhang, J.; Zhang, Q.; Wei, X. Classification of ECG signals based on 1D convolution neural network. In Proceedings of the 2017 IEEE 19th International Conference on e-Health Networking, Applications and Services (Healthcom), Dalian, China, 12–15 October 2017; IEEE: 2017; pp. 1–6.
  • (48) Jun, T.J.; Nguyen, H.M.; Kang, D.; Kim, D.; Kim, D.; Kim, Y.H. ECG arrhythmia classification using a 2-D convolutional neural network. arXiv Prepr. 2018, arXiv:1804.06812.
  • (49) Mohanty, M.D.; Mohanty, B.; Mohanty, M.N. R-peak detection using efficient technique for tachycardia detection. In Proceedings of the 2017 2nd International Conference on Man and Machine Interfacing (MAMI), Bhubaneswar, India, 21–23 December 2017; IEEE: 2017; pp. 1–5.
  • (50) Moody, G.B.; Mark, R.G. The impact of the MIT-BIH arrhythmia database. IEEE Eng. Med. Biol. Mag. 2001, 20, 45–50.
  • (51) Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv Prepr. 2016, arXiv:1603.04467.
  • (52) Übeyli, E.D. Combining recurrent neural networks with eigenvector methods for classification of ECG beats. Digit. Signal Process. 2009, 19, 320–329.
  • (53) Dutta, S.; Chatterjee, A.; Munshi, S. Correlation technique and least square support vector machine combine for frequency domain based ECG beat classification. Med. Eng. Phys. 2010, 32, 1161–1169.
  • (54) Kumar, R.G.; Kumaraswamy, Y.S. Investigating cardiac arrhythmia in ECG using random forest classification. Int. J. Comput. Appl. 2012, 37, 31–34.
  • (55) Park, J.; Lee, K.; Kang, K. Arrhythmia detection from heartbeat using k-nearest neighbor classifier. In Proceedings of the 2013 IEEE International Conference on Bioinformatics and Biomedicine, Shanghai, China, 18–21 December 2013; IEEE: 2013; pp. 15–22.
  • (56) Ince, T.; Kiranyaz, S.; Eren, L.; Askar, M.; Gabbouj, M. Real-time motor fault detection by 1-D convolutional neural networks. IEEE Trans. Ind. Electron. 2016, 63, 7067–7075.
  • (57) Izci, E.; Ozdemir, M.A.; Degirmenci, M.; Akan, A. Cardiac Arrhythmia Detection from 2D ECG Images by Using Deep Learning Technique. In Proceedings of the 2019 Medical Technologies Congress (TIPTEKNO), Selçuk, Turkey, 3–5 October 2019; IEEE: 2019; pp. 1–4.
  • (58) Rajkumar, A.; Ganesan, M.; Lavanya, R. Arrhythmia classification on ECG using Deep Learning. In Proceedings of the 2019 5th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 15–16 March 2019; IEEE: 2019; pp. 365–369.
  • (59) Guler, I.; Ubeylı, E.D. ECG beat classifier designed by combined neural network model. Pattern Recognit. 2005, 38, 199–208.
  • (60) Yu, S.N.; Chou, K.T. Integration of independent component analysis and neural networks for ECG beat classification. Expert Syst. Appl. 2008, 34, 2841–2846.
  • (61) Melgani, F.; Bazi, Y. Classification of electrocardiogram signals with support vector machines and particle swarm optimization. IEEE Trans. Inf. Technol. Biomed. 2008, 12, 667–677.