HAN-ECG: An Interpretable Atrial Fibrillation Detection Model Using Hierarchical Attention Networks

02/12/2020 ∙ by Sajad Mousavi, et al. ∙ Northern Arizona University

Atrial fibrillation (AF) is one of the most prevalent cardiac arrhythmias. It affects more than 3 million people in the U.S. and over 33 million people worldwide, and it is associated with a five-fold increased risk of stroke and mortality. As with other problems in the healthcare domain, artificial intelligence (AI)-based algorithms have been used to reliably detect AF from patients' physiological signals. Deep learning-based methods often achieve cardiologist-level performance in detecting this arrhythmia; however, they suffer from a lack of interpretability. In other words, these approaches are unable to explain the reasons behind their decisions. The lack of interpretability is a common obstacle to the wide application of machine learning-based approaches in healthcare, because it limits the trust of clinicians in such methods. To address this challenge, we propose HAN-ECG, an interpretable bidirectional-recurrent-neural-network-based approach for the AF detection task. HAN-ECG employs three levels of attention mechanism to provide a multi-resolution analysis of the patterns in the ECG that lead to AF. The first level, the wave level, computes the wave weights; the second level, the heartbeat level, calculates the heartbeat weights; and the third level, the window (i.e., multiple heartbeats) level, produces the window weights in triggering a class of interest. The patterns detected by this hierarchical attention model facilitate the interpretation of the neural network's decision process by identifying the patterns in the signal that contributed the most to the final prediction. Experimental results on two AF databases demonstrate that our proposed model performs significantly better than the existing algorithms. Visualization of these attention layers illustrates that our model bases its decisions on important waves and heartbeats that are clinically meaningful in the detection task.




1. Introduction

Atrial fibrillation (AF) is a common cardiac arrhythmia that can lead to various heart-related complications such as stroke, heart failure and atrial thrombosis (Furberg et al., 1994). Electrocardiography is a test that measures the electrical activity of the heart over a specific period of time. The test output is an electrocardiogram (ECG) signal, which is a plot of voltage against time. A common non-invasive way to diagnose AF is for a cardiologist or medical practitioner to visually inspect the recorded ECG signal. However, this process is time-consuming and subject to human error.

Therefore, several computer-aided methods have been developed for the automatic detection of atrial fibrillation and other heart arrhythmias. The existing ML-based methods include approaches based on hand-crafted features and approaches based on automatically extracted features (Ghaffari and Madani, 2019; Xia et al., 2018; Acharya et al., 2017; Asgari et al., 2015; Zhou et al., 2014; Lee et al., 2013; Lake and Moorman, 2010). Among them, the methods that extract features automatically have gained more attention because they can learn ECG signal representations efficiently and achieve state-of-the-art results.

Machine learning approaches, especially deep learning techniques, have achieved significant results in applications ranging from computer vision and reinforcement learning to unmanned aerial vehicle (UAV) navigation (Xu et al., 2015; Mousavi et al., 2016a, 2017b, b, 2017a, 2018; Shamsoshoara et al., 2019c, b, a), as well as in the healthcare domain (Rajpurkar et al., 2017; Mousavi et al., 2019a; Xia et al., 2018; Zaeri-Amirani et al., 2018). Deep learning models, with their capability for automatic feature extraction, provide strong performance in the AF detection task owing to their ability to detect complex patterns in ECG signals (Yıldırım et al., 2018; Rajpurkar et al., 2017). Nevertheless, they work as black boxes, which makes it hard to understand the reasons behind their decisions. Interpretability and transparency are key requirements for AI-based decision making in healthcare, so that the physicians who are held accountable for medical decisions can trust the recommendations of these algorithms. One way to make deep learning models interpretable is to incorporate an attention mechanism that learns the relationship between the input data samples and the given task (Ma et al., 2017).

To provide an interpretable, high-performance method for the automatic detection of atrial fibrillation, we propose in this study a deep learning model powered by hierarchical attention networks. The proposed method is composed of three parts, each of which contains stacked bidirectional recurrent neural networks (BiRNNs) followed by an attention model. The first part learns a wave-level representation of the ECG signal, the second part learns a heartbeat-level representation, and the third part learns a window-level representation (where a window contains multiple heartbeats). The representations learned at each level are interpretable and show which parts of the input signal trigger an AF event. The performance of the proposed model was evaluated on two datasets from the PhysioNet community: the MIT-BIH AFIB database and the PhysioNet Computing in Cardiology Challenge 2017 dataset. The experimental results demonstrate that the proposed method significantly outperforms the state-of-the-art algorithms for the atrial fibrillation detection task. In addition, we show the interpretability of the learned representations at all levels through visualizations. The main contributions of this study are summarized as follows:

  • We propose an end-to-end hierarchical attention model that achieves state-of-the-art performance while remaining interpretable.

  • The proposed model provides multi-level interpretability (i.e., window by window (multiple heartbeats), heartbeat by heartbeat and wave by wave).

  • We empirically demonstrate that the parts of the ECG signal that the model relies on when triggering the AF class are clinically meaningful.

  • The proposed approach can be used to recognize new potential patterns that trigger life-threatening arrhythmias.

The hierarchical attention model was first proposed in (Yang et al., 2016) in the context of the document classification task, as a novel hierarchical attention architecture that matches the hierarchical nature of a document: words make sentences and sentences make a document. ECG analysis deals with a similar notion of hierarchical input, since the ECG signal includes multiple levels of resolution (waves, beats and windows). The proposed hierarchical attention model can therefore mirror the physicians' decision-making process. For instance, to detect AF, physicians first look for important windows (sequences of consecutive heartbeats), then look at the important heartbeats within those windows, and finally focus on the waves of those heartbeats.

The rest of this paper is organized as follows. Section 2 gives a review of the related work. Section 3 provides a detailed description of the proposed approach. Section 4 presents the experimental setup and the databases used to evaluate the proposed model, and compares the performance of the proposed algorithm with the existing algorithms, followed by the interpretability analysis. Finally, Section 5 concludes the paper.

2. Related Work

Life-threatening arrhythmia classification and prediction are important research problems in machine learning for healthcare. Recent advances in deep learning algorithms have led to strong performance on machine learning-oriented healthcare problems. Deep convolutional neural networks have been used to improve the performance of the ECG heartbeat classification task (Kachuee et al., 2018; Acharya et al., 2017; Jun et al., 2018; Yıldırım et al., 2018). Recurrent neural networks (RNNs) and sequence-to-sequence models have been employed to perform automatic heartbeat annotation (Saadatnejad et al., 2019; Gao et al., 2019; Mousavi and Afghah, 2019). Deep learning models have also been utilized to detect false arrhythmia alarms. Lehman et al. (2018) applied supervised denoising autoencoders (SDAE) to ECG signals to classify ventricular tachycardia alarms. Mousavi et al. (2020) used attention-based convolutional and recurrent neural networks to suppress false arrhythmia alarms in the ICU.

Atrial fibrillation (AF) is one of the most common types of arrhythmia in patients with heart disease and is challenging to detect. Fujita and Cimr (2019), Ghaffari and Madani (2019) and Xia et al. (2018) used deep convolutional neural networks for the atrial fibrillation detection task and achieved good detection performance. Shashikumar et al. (2018) applied an attention mechanism to detect atrial fibrillation. The authors employed a deep recurrent neural network on 30-second ECG windows and also fed some time series covariates to the network. These covariates are hand-crafted features and include the standard deviation and sample entropy of the beat-to-beat interval time series. Although they used an attention mechanism in the architecture of their model, their proposed method was not an interpretable detection model: the single-level attention, applied to fixed-length 30-s ECG windows that contain several heartbeats, only improves the detection performance. Our previous work, ECGNET (Mousavi et al., 2019b), is an interpretable atrial fibrillation detection model that uses a deep visual attention mechanism to automatically extract features and focus on different parts of the heartbeats of the input ECG signal. ECGNET offers interpretable AF detection with single-level attention using the wavelet power spectrum as input, whereas this study proposes a hierarchical attention network that takes raw ECG signals as input.

Unlike the aforementioned AF detection models, the proposed model provides high-resolution interpretability (window by window, where a window contains multiple heartbeats, beat by beat and wave by wave) using hierarchical bidirectional recurrent neural networks and attention networks. The proposed model improves the detection performance and, at the same time, explains the reasons behind its decisions.

Figure 1. Illustration of an ECG signal; The red circles indicate R peaks; green, blue and black curves illustrate P, QRS and T-waves, respectively.
Figure 2. Architecture of hierarchical attention network (HAN) for Atrial Fibrillation Detection.
Figure 3. Schematic diagram of the attention mechanism.

3. Methodology

In this section, we first describe the pre-processing steps needed to prepare the data to be fed into the proposed model. Then, we explain the main components of the proposed method in detail.

3.1. Pre-processing

The input of the proposed method is a sequence of ECG heartbeats in which each heartbeat contains a sequence of building waves (P-wave, QRS complex, T-wave, etc.). To prepare this structure of ECG signals, we perform a few pre-processing steps on them as follows:

  1. Removing the baseline wander and power-line interference noise in the ECG signal. To this end, the ECG signal was passed through a band-pass filter with a filter order of 10 and a pass band of 0.5–50 Hz.

  2. Transforming the given ECG signal values to have zero mean and unit standard deviation (i.e., standardization).

  3. Detecting the R-peaks (i.e., the QRS complexes) of the given ECG signal using the Pan–Tompkins algorithm (Pan and Tompkins, 1985).

  4. Dividing the continuous ECG signal into a sequence of heartbeats, and splitting the heartbeats into distinct units named waves. The waves in the ECG signal are extracted based on the detected R-peaks using adaptive searching windows, and a heartbeat is defined from the onset of the current P-wave to the offset of the consecutive T-wave. Figure 1 depicts a segmented ECG signal annotated with the R-peaks and the P, QRS and T-waves.

After the above pre-processing steps, each ECG signal becomes a sequence of heartbeats in which each heartbeat, $b_i$, contains $T$ waves, where $w_{it}$ represents the $t$-th wave in the $i$-th heartbeat, $t \in [1, T]$.
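The standardization and beat-splitting steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the R-peak indices are taken as given (for example, from a Pan–Tompkins detector), and the helper `split_into_beats` cuts at midpoints between consecutive R-peaks as a simple stand-in for the adaptive searching windows described in the text. Both function names are hypothetical.

```python
from statistics import mean, pstdev

def standardize(signal):
    """Rescale samples to zero mean and unit standard deviation."""
    mu = mean(signal)
    sigma = pstdev(signal)
    if sigma == 0:
        raise ValueError("a constant signal cannot be standardized")
    return [(x - mu) / sigma for x in signal]

def split_into_beats(signal, r_peaks):
    """Cut the signal at midpoints between consecutive R-peaks.

    Assumption: midpoint boundaries replace the adaptive searching windows
    of the paper; the partial beats before the first boundary and after the
    last boundary are dropped.
    """
    bounds = [(a + b) // 2 for a, b in zip(r_peaks, r_peaks[1:])]
    return [signal[s:e] for s, e in zip(bounds, bounds[1:])]
```

Splitting each beat further into P, QRS and T waves would follow the same pattern, with boundaries placed relative to each R-peak.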

3.2. Model

The goal of the proposed method is to detect life-threatening atrial fibrillation arrhythmia in an explainable way. Figure 2 presents the network architecture of the proposed method. A sequence of waves of an ECG signal is fed into stacked bidirectional recurrent neural networks (BiRNNs) followed by an attention model. The stacked BiRNNs extract a vector representation for each input wave, and the attention model focuses on the waves that best represent a heartbeat. Next, the vector representations of the waves are integrated into a heartbeat vector. The heartbeat vectors are then passed to another set of stacked BiRNNs followed by another attention model. Similarly, this attention model puts more emphasis on the important heartbeats and produces the heartbeat context vectors. After that, a sequence of windows is computed, in which each window contains multiple heartbeat context vectors; the same procedure is applied to the windows, and a summarized vector that includes all information of the ECG signal is extracted. Finally, the summarized vector is used for the atrial fibrillation detection task. Overall, the model architecture is composed of three main parts: a wave encoder with a wave attention, a beat encoder with a beat attention, and a window encoder with a window attention. In the following sections, we explain each part of the proposed model in detail.

Bidirectional Recurrent Neural Networks

A bidirectional recurrent neural network (BiRNN) is more efficient than a standard RNN when the sequence is very long (Schuster and Paliwal, 1997). The reason is that standard RNNs are unidirectional, so they are restricted to using only the previous input states. The BiRNN, in contrast, can process data in both forward and backward directions, so the current state has access to previous and subsequent input information simultaneously. To improve the detection performance, BiRNNs were employed in the model to encode the sequences of wave and beat vectors.

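The BiRNN consists of forward and backward networks. The forward network takes in a sequence of waves/beats in regular order, from $t = 1$ to $t = T$, as input and computes the forward hidden state, $\overrightarrow{h}_t$; the backward network takes in the wave/beat sequence in reverse order, from $t = T$ to $t = 1$, and calculates the backward hidden state, $\overleftarrow{h}_t$. The output of the BiRNN is then a weighted sum over the concatenation of the forward hidden state, $\overrightarrow{h}_t$, and the backward one, $\overleftarrow{h}_t$. The BiRNN can be defined mathematically as follows:

$$\overrightarrow{h}_t = \tanh\left(\overrightarrow{W} x_t + \overrightarrow{V}\,\overrightarrow{h}_{t-1} + \overrightarrow{b}\right)$$
$$\overleftarrow{h}_t = \tanh\left(\overleftarrow{W} x_t + \overleftarrow{V}\,\overleftarrow{h}_{t+1} + \overleftarrow{b}\right)$$
$$y_t = U\left[\overrightarrow{h}_t; \overleftarrow{h}_t\right] + b_y$$

where ($\overrightarrow{h}_t$, $\overrightarrow{b}$) are the hidden state and the bias of the forward network, and ($\overleftarrow{h}_t$, $\overleftarrow{b}$) are the hidden state and the bias of the backward one. In addition, $x_t$ and $y_t$ are the input and the output of the BiRNN, respectively, and the matrices $\overrightarrow{W}$, $\overrightarrow{V}$, $\overleftarrow{W}$, $\overleftarrow{V}$, $U$ and the bias $b_y$ are learned weights.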
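To make the two passes concrete, the following minimal Python sketch runs a bidirectional recurrence with scalar inputs and scalar hidden states. It is not the stacked, vector-valued network used in the paper; the function name `birnn` and the scalar weights are illustrative assumptions, and the final output projection over the concatenated states is omitted.

```python
import math

def birnn(xs, w_f, v_f, b_f, w_b, v_b, b_b):
    """Return [(forward_state, backward_state)] for each time step.

    Scalar toy version: each state is tanh(w * x + v * previous_state + b).
    """
    fwd, h = [], 0.0
    for x in xs:                     # forward pass: first to last element
        h = math.tanh(w_f * x + v_f * h + b_f)
        fwd.append(h)
    bwd, h = [], 0.0
    for x in reversed(xs):           # backward pass: last to first element
        h = math.tanh(w_b * x + v_b * h + b_b)
        bwd.append(h)
    bwd.reverse()                    # realign backward states with time order
    return list(zip(fwd, bwd))
```

At every position the pair holds a summary of everything before that step and everything after it, which is what gives the current state access to both past and future inputs.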

Wave Encoder and Wave Attention

A sequence of waves, $w_{it}$, $t \in [1, T]$, for heartbeat $b_i$ is fed into a bidirectional recurrent neural network (BiRNN) to encode the wave sequence. The forward network of the BiRNN reads the heartbeat $b_i$ in normal time order of waves, from $w_{i1}$ to $w_{iT}$, and the backward network reads it in reverse time order, from $w_{iT}$ to $w_{i1}$. The BiRNN then outputs $Y_i = [y_{i1}, \dots, y_{iT}]$, a low-dimensional latent vector representation of the heartbeat $b_i$.

Just as not all words of a sentence are equally important to its meaning (Yang et al., 2016), not all waves of a heartbeat carry the same weight in representing the heartbeat. An attention mechanism can therefore extract the relevant waves that contribute more to the meaning of the heartbeat. The attention mechanism is a shallow neural network that takes the BiRNN output, $Y_i$, as input and computes a probability vector, $\alpha_i$, corresponding to the importance of each wave vector. Then, it calculates a wave context vector, $c_i$, which is a weighted sum over $Y_i$ with the weight vector $\alpha_i$ (as shown in Figure 3). Indeed,

$$\alpha_i = \mathrm{softmax}\left(v_w^{\top} \tanh\left(W_w Y_i + b_w\right)\right)$$
$$c_i = \sum_{t=1}^{T} \alpha_{it}\, y_{it}$$

where $W_w$, $b_w$ and $v_w$ are the parameters to be learned and $\mathrm{softmax}$ is a function that squeezes its input, a vector of real numbers, into values between 0 and 1.
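The attention step described above (score each encoder output, normalize the scores into a probability vector, take the weighted sum) can be sketched in plain Python as follows. The scoring form, a learned vector dotted with a tanh layer, is an assumption borrowed from the usual hierarchical attention formulation; the names `attend`, `W`, `b` and `v` are illustrative.

```python
import math

def softmax(scores):
    """Map real-valued scores to positive weights that sum to 1."""
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(hidden, W, b, v):
    """Return (attention weights, context vector) for a list of vectors."""
    scores = []
    for h in hidden:
        # One-layer scorer: u = tanh(W h + b), score = v . u
        u = [math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
             for row, b_i in zip(W, b)]
        scores.append(sum(v_i * u_i for v_i, u_i in zip(v, u)))
    alpha = softmax(scores)
    dim = len(hidden[0])
    # Context vector: attention-weighted sum of the input vectors.
    context = [sum(a * h[k] for a, h in zip(alpha, hidden)) for k in range(dim)]
    return alpha, context
```

The returned weights are what the visualizations later in the paper rely on: a larger weight marks a wave (or beat, or window) that contributed more to the context vector.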

Beat Encoder and Beat Attention

Similar to the wave encoder, the BiRNN of the beat encoder takes the sequence of wave context vectors, $c_1, \dots, c_N$, as input and produces the vectors $H = [h_1, \dots, h_N]$, which are latent representations of the input heartbeats. To emphasize the heartbeats that are more important in triggering the arrhythmia, another attention mechanism is used at the heartbeat level. Therefore,

$$\beta = \mathrm{softmax}\left(v_b^{\top} \tanh\left(W_b H + b_b\right)\right)$$
$$\tilde{c}_j = \beta_j\, h_j, \quad j = 1, \dots, N$$

where $W_b$, $b_b$ and $v_b$ are the parameters to be learned, $\beta$ is the attention weight vector of the heartbeats, and $\tilde{c}_1, \dots, \tilde{c}_N$ are the heartbeat context vectors, obtained as the element-wise product of the hidden states, $H$, and the importance of each heartbeat, $\beta$.

Window Encoder and Window Attention

In addition to the wave and heartbeat level encoding modules, we also consider a window level encoding module in which a window contains multiple heartbeats. The heartbeat context vectors, $\tilde{c}_1, \dots, \tilde{c}_N$, are converted to a sequence of windows, $d_1, \dots, d_M$, by sliding a fixed-length window with a fixed hop size, both measured in heartbeats, over the heartbeat context vectors (as shown in Figure 2). For example, for a window of $k$ heartbeats and a hop size of 1, the extracted sequence of windows becomes $[\tilde{c}_1 \oplus \dots \oplus \tilde{c}_k,\ \tilde{c}_2 \oplus \dots \oplus \tilde{c}_{k+1},\ \dots]$, where $\oplus$ is a simple concatenation operation.
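The windowing step can be illustrated with a short Python sketch: a window of a given size slides over the per-beat vectors with a given hop, and the vectors inside each window are concatenated. The function name `make_windows` is hypothetical, and dropping a trailing incomplete window is an assumption made for this example.

```python
def make_windows(beat_vectors, size, hop):
    """Concatenate `size` consecutive beat vectors, advancing by `hop` beats.

    Assumption: a trailing group with fewer than `size` beats is dropped.
    """
    windows = []
    for start in range(0, len(beat_vectors) - size + 1, hop):
        window = []
        for vec in beat_vectors[start:start + size]:
            window.extend(vec)       # simple concatenation of the vectors
        windows.append(window)
    return windows
```

With a hop equal to the window size the windows do not overlap, while a hop of 1 makes each beat appear in up to `size` windows.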

Analogous to the previous steps, a BiRNN is used to encode the windows, and again an attention mechanism is employed to measure the importance of the windows. Specifically,

$$\gamma = \mathrm{softmax}\left(v_d^{\top} \tanh\left(W_d G + b_d\right)\right)$$
$$s = \sum_{m=1}^{M} \gamma_m\, g_m$$

where $W_d$, $b_d$ and $v_d$ are the parameters to be learned, $\gamma$ is the attention weight vector of the windows, $G = [g_1, \dots, g_M]$ are the hidden states of the BiRNN, and $s$ is the window context vector that summarizes the information of all windows, each containing multiple heartbeats, of the input ECG signal.


We concatenate the window context vector, $s$, and the last hidden state, $g_M$, to combine the information of both vectors, and then feed the result into a shallow network followed by a softmax layer to produce a probability vector, $\hat{y}$, in which each element gives the probability that the input signal belongs to each class of interest (AF or non-AF). Specifically,

$$\hat{y} = \mathrm{softmax}\left(W_c\,[s; g_M] + b_c\right)$$

where $W_c$ and $b_c$ are the parameters to be learned.

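Finally, we use a cross entropy loss to calculate the training loss as follows:

$$L = -\sum_{n} y_n \cdot \log \hat{y}_n$$

where the sum runs over the training samples $n$, $\cdot$ is the vector dot product operator, $\hat{y}_n$ is the predicted probability vector and $y_n$ is the ground truth vector.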
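As a small worked example of the classification head and loss, the Python sketch below applies a softmax to two class scores and evaluates the cross entropy against a one-hot ground truth vector. The two-class setup (AF, non-AF) follows the text; the function names and the example scores are illustrative.

```python
import math

def softmax(scores):
    """Convert class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    """Negative dot product of a one-hot target with the log-probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# Equal scores give probabilities (0.5, 0.5) and a loss of log(2).
uncertain = cross_entropy(softmax([0.0, 0.0]), [1, 0])
# A confident, correct prediction gives a much smaller loss.
confident = cross_entropy(softmax([5.0, 0.0]), [1, 0])
```

The training loss is this quantity accumulated over the training samples.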


In real medical practice, artificial intelligence (AI)-based algorithms that both perform well and are interpretable are preferable. Machine learning algorithms that explain the reasons behind their decisions are therefore very important in medical applications. The proposed method has three levels of attention: the first level (i.e., the wave level) produces the wave weights, $\alpha$, representing the importance of the waves in a heartbeat; the second level (i.e., the heartbeat level) computes the heartbeat weights, $\beta$, showing how much each heartbeat influences the occurrence of an arrhythmia; and the third level (i.e., the window level) produces the window weights, $\gamma$, indicating the importance of the combinations of heartbeats. In Section 4, we provide visualized examples of ECG signals from the AF and non-AF classes in which the portions of the signals that the attention mechanism focuses on are highlighted.

4. Experiments

In this section, we describe the two atrial fibrillation datasets used for the quantitative and qualitative analyses of the proposed method. Then, we compare its performance against the existing algorithms for the atrial fibrillation detection task and show how explainable the proposed model is in detecting the atrial fibrillation arrhythmia.

4.1. Data Description

To evaluate the proposed method, we used two datasets including the MIT-BIH AFIB database (PhysioNet, 2000) and the PhysioNet Computing in Cardiology Challenge 2017 dataset (PhysioNet, 2001).

MIT-BIH AFIB Dataset: This dataset contains 23 long-term ECG recordings of human subjects with mostly atrial fibrillation arrhythmia. Each patient record in the MIT-BIH AFIB database includes two 10-hour ECG recordings (ECG1 and ECG2). The ECG signals are sampled at 250 Hz with 12-bit resolution over a range of ±10 millivolts. In this study, we divided each ECG signal into 5-s segments and labeled each segment based on a threshold parameter, $p$. To perform the segment labeling, we followed the approach reported in (Xia et al., 2018; Asgari et al., 2015). A 5-s segment is labeled as AF if the percentage of annotated AF beats in the segment is greater than or equal to $p$; otherwise, it is labeled as non-AF. We chose $p$ to be consistent with the previous research work. In our experiments, we used the ECG1 recordings and extracted a total of 167,422 5-s data segments, of which 66,939 were AF segments and 100,483 were non-AF segments. The data segments are clearly imbalanced. To deal with this problem and to compare our proposed method with the other existing algorithms, we randomly drew the same number of samples for the AF and non-AF classes (66,939 samples per class). However, we also tested the proposed method on the original imbalanced dataset.
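The segment labeling rule and the class balancing can be sketched as follows. This is a minimal Python illustration: the threshold is passed in as a fraction between 0 and 1 (no particular value is assumed here), beat annotations are represented as booleans, and balancing is done by random undersampling of the larger class. The function names `label_segment` and `balance` are hypothetical.

```python
import random

def label_segment(beat_is_af, threshold):
    """Label a segment 'AF' if its fraction of AF beats reaches the threshold."""
    fraction = sum(beat_is_af) / len(beat_is_af)
    return "AF" if fraction >= threshold else "non-AF"

def balance(segments, labels, seed=0):
    """Randomly undersample so that every class has the same number of segments."""
    rng = random.Random(seed)
    by_class = {}
    for seg, lab in zip(segments, labels):
        by_class.setdefault(lab, []).append(seg)
    n = min(len(group) for group in by_class.values())
    balanced = []
    for lab, group in by_class.items():
        balanced.extend((seg, lab) for seg in rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced
```

Undersampling discards data from the majority class, which is why evaluating on the original imbalanced set as well gives a useful second view of the model.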

PhysioNet Challenge AFIB Dataset: The goal of the challenge is to build models that classify a short single-lead ECG recording (30-60 s in length) as normal sinus rhythm, atrial fibrillation (AF), an alternative rhythm, or too noisy to classify. The training set includes 8,528 single-lead ECG recordings and the test set contains 3,658 ECG recordings. The test set is not yet publicly available; therefore, we use the training set for both the training and test phases. The ECG recordings were recorded by AliveCor devices, sampled at 300 Hz and filtered by a band-pass filter. In this study, we considered only two classes, normal sinus rhythm (N) and atrial fibrillation (AF), and discarded the remaining groups.

4.2. Experimental Setup

The proposed approach is based on hierarchical attention networks and employs three levels of attention. To show the contribution of each level, in our experiments we consider the model without the attention mechanism (denoted as RNN, containing just the BiRNNs) and the models with one (denoted as HAN-ECG1), two (denoted as HAN-ECG2) and three (denoted as HAN-ECG3) levels of the attention mechanism.

We applied a 10-fold cross-validation approach to evaluate the model. Indeed, we split the dataset into 10 folds. At each round of the cross validation, 9 folds were used for training the model and the remaining fold (1 fold) was used for evaluating the model. At the end, we combined all the evaluation results.
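A 10-fold split of this kind can be generated with a few lines of Python. The sketch below shuffles the sample indices once and yields one (training, evaluation) pair per fold; the function name `k_fold_indices` and the fixed seed are illustrative assumptions.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]    # k nearly equal parts
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

Each sample appears in exactly one evaluation fold, so the evaluation results of all folds can be combined into a single summary over the whole dataset.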

The models were trained for a maximum of 25 epochs with mini-batches of size 64. The Adam optimizer was used to minimize the loss with a specified learning rate. We also used regularization with a specified coefficient and dropout with a drop probability of 0.5 to reduce overfitting during training. The number of layers for the BiRNNs was set to 2. The window and hop sizes for the last attention layer were set to (2,2) and (5,5) for the MIT-BIH AFIB and AFDB17 databases, respectively. We used the Python programming language and the Google TensorFlow deep learning library to implement the proposed model. We ran the 10-fold cross-validation on a machine with 8 CPUs (Intel(R) Xeon(R) CPU @ 3.60 GHz), 32 GB of memory and Ubuntu 18.04. In all experiments, the best performance was reported.

4.3. Results

Quantitative analysis

Table 1 shows the performance of the proposed method with different numbers of attention levels against the state-of-the-art algorithms on the MIT-BIH AFIB database with 5-s ECG segments. The table shows that the proposed method with one, two and three hierarchies achieved significantly better performance than the other methods listed. In Table 1, we can also observe that the accuracy of the proposed method with two levels of attention is slightly higher than that with three levels. The reason might be that a 5-s ECG segment (as input) has approximately 6 heartbeats, almost all of which contain the AF arrhythmia. Therefore, windowing the heartbeats at level three makes no significant improvement in the model performance.

Row 5 of Table 1 (i.e., the method named HAN-ECG2f) shows the evaluation results of the proposed method when the input ECG signals are split into fixed-size portions (here 180 samples per portion, each treated as a heartbeat) and the portions are divided into fixed-size parts (here 6 parts per portion, each treated as a distinct wave). From Table 1, it is clear that using fixed-size heartbeats and waves as input results in the lowest performance among all variants of the proposed method. Hence, we can conclude that the pre-processing step in our methodology, described in Section 3.1, is necessary to obtain better performance. It can also be seen that the RNN method performs as well as the algorithm of Xia et al. (2018), which is a deep convolutional neural network with the stationary wavelet transform (SWT) coefficient time series as input. In addition, Figure 4 shows the confusion matrices, which summarize how well the proposed model performs over all folds.

(a) RNN
(b) HAN-ECG1
(c) HAN-ECG2
(d) HAN-ECG3
Figure 4. Confusion matrices achieved by all the proposed method variants on the MIT-BIH AFIB database.
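For readers who want to turn a binary confusion matrix such as those in Figure 4 into summary numbers, the following Python sketch computes three common metrics. Which metrics the tables in this paper report is not recoverable from the text here, so sensitivity, specificity and accuracy are an assumption, and the counts in the example are made up for illustration.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: AF segments classified as AF / as non-AF.
    fp/tn: non-AF segments classified as AF / as non-AF.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Illustrative counts only, not results from the paper.
example = binary_metrics(tp=90, fn=10, fp=5, tn=95)
```

On an imbalanced dataset, accuracy alone can hide poor detection of the minority class, which is why sensitivity and specificity are usually reported alongside it.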
Method | Database | Best Performance
HAN-ECG3 | AFDB | 99.08 99.86
HAN-ECG2 | AFDB | 98.78 98.83
Xia et al. (2018) | AFDB |
Asgari et al. (2015) | AFDB |
Lee et al. (2013) | AFDB |
Jiang et al. (2012) | AFDB |
Huang et al. (2011) | AFDB |
Babaeizadeh et al. (2009) | AFDB |
Dash et al. (2009) | AFDB |
Tateno and Glass (2001) | AFDB |
Table 1. Comparison of the performance of the proposed model against other algorithms on the MIT-BIH AFIB database with 5-s ECG segments.

The results reported in Table 1 for our proposed method and the other listed algorithms are based on balancing the dataset before training the models, with the same number of non-AF samples as AF samples selected at random. In addition, the 5-s data segments are drawn from the combined pool of segments extracted from all individuals. Therefore, the training and evaluation sets can include data segments from the same subjects, which is a data leakage problem. To obtain a more realistic evaluation, we considered another scenario in which the test and training data segments came from different individuals, and we left the dataset imbalanced. Table 2 presents the performance of the proposed AF detectors under this evaluation scenario on the MIT-BIH AFIB database. Since we could not find any research paper that followed this scenario, we report our results in Table 2 without comparison. From Table 2, we can again note that the models with more attention layers yield higher accuracy and better performance.
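The subject-wise evaluation scenario amounts to splitting by subject identifier rather than by segment. A minimal Python sketch, assuming each segment carries the identifier of the recording it came from (the function name and identifiers are hypothetical):

```python
def split_by_subject(segment_subjects, test_subjects):
    """Return (train_indices, test_indices) with no subject on both sides."""
    held_out = set(test_subjects)
    train_idx = [i for i, s in enumerate(segment_subjects) if s not in held_out]
    test_idx = [i for i, s in enumerate(segment_subjects) if s in held_out]
    return train_idx, test_idx

# Five segments from three subjects; subject "c" is held out for testing.
subjects = ["a", "a", "b", "c", "c"]
train_idx, test_idx = split_by_subject(subjects, ["c"])
```

Because all segments of a held-out subject land in the test set, the model is never evaluated on a person whose other segments it saw during training.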

Table 2. Performance of the proposed model for the AF classification task on the MIT-BIH AFIB database while the database is not balanced and the data segments for the training and test phases come from different ECG recordings.

Table 3 shows the experimental results on the PhysioNet Computing in Cardiology Challenge 2017 dataset. The overall performance of the proposed models with more than one attention layer (i.e., HAN-ECG2 and HAN-ECG3) is better than that of the other methods, demonstrating that hierarchical attention networks work better for the AF detection task. Since we considered a two-class problem (AF and normal classes) in this experiment, there was no prior work in the literature to compare against.

Table 3. Performance of the proposed model for the AF classification task on the PhysioNet Computing in Cardiology Challenge 2017 dataset (AFDB17).

Qualitative analysis

Understanding the cause of a model's decision is very important in healthcare applications. To validate that the decisions made by our model are interpretable, we visualize the hierarchical attention layers and show that the proposed method considers clinically important heartbeats and waves in detecting the AF arrhythmia. Figures 5 and 6 illustrate a few ECG signals from the AF and non-AF categories. The top plots of the figures show the original ECG signals, and the bottom plots present the heartbeats and waves that are informative for detecting the class of interest (AF or non-AF). In the figures, the red segments denote the heartbeat weights, with darker segments marking heartbeats that are more important to the network's decision; the blue strips and the yellow circles denote the locations of the important waves of the heartbeats, with darker blue marking more influence on the detection process.

There are two essential visual features in patients' ECG signals that practitioners use to identify atrial fibrillation: (i) the absence of P-waves, which are occasionally replaced by a series of small waves called fibrillation waves, and (ii) irregular R-R intervals, in which the heartbeat intervals are not rhythmic. Figure 5 visualizes the important regions of an ECG signal that contains AF arrhythmia. From the figure, we can see the importance of the heartbeats (through the intensity of the red segments), and that the proposed method pays attention to the irregularity of the R-R intervals and emphasizes the absence of P-waves, which are the clinical features used to recognize atrial fibrillation.

In addition, the proposed hierarchical attention mechanism considers the normal heartbeat rhythms for the detection of the non-AF class, as shown in Figure 6(a). When labeling an ECG signal as AF, Figure 6(b) again shows that the model focuses on the parts of the ECG signal in which the P-waves are absent (replaced with a series of low-amplitude oscillations). Since all the heartbeats of the 5-s data segments in Figures 6(a) and 6(b) are either normal heartbeats or atrial fibrillation heartbeats, all the heartbeats have approximately the same importance (i.e., the same intensity for the red segments). Generally, in all of the aforementioned figures, our proposed model considers the clinically meaningful waves and their corresponding heartbeats in its decision-making process.

Figure 5. Hierarchical attention visualisation of a subject with the AF arrhythmia from the PhysioNet Computing in Cardiology Challenge 2017 dataset.
(a) Subject without the AF arrhythmia
(b) Subject with the AF arrhythmia
Figure 6. Hierarchical attention visualisation of two subjects from the MIT-BIH AFIB database.

5. Conclusions

In this study, we proposed a hierarchical attention mechanism for the detection of atrial fibrillation using a single-lead ECG signal. The proposed approach contains three levels of attention: the wave, heartbeat and window levels. The attention mechanisms allow us to interpret the detection results at high resolution. The experimental results on two databases, the MIT-BIH Atrial Fibrillation (MIT-BIH AFIB) database and the PhysioNet Computing in Cardiology Challenge 2017 dataset (AFDB17), show that our method achieves state-of-the-art performance and outperforms the existing algorithms. Furthermore, the visualizations show that the three attention levels can assign different weights to various parts of the provided signal when assigning a label to the input ECG signal. Moreover, we demonstrated that the signal features highlighted by the model were clinically meaningful. A future research direction is to apply the proposed method to other ECG leads and other arrhythmias to extract new patterns that might be reasonable clinical features in the detection of an arrhythmia.

6. Acknowledgments

This study is based upon work supported by the National Science Foundation under Grant Number 1657260. Research reported in this publication was supported by the National Institute On Minority Health And Health Disparities of the National Institutes of Health under Award Number U54MD012388.

