1 Introduction
The electrocardiogram (ECG) is a routine test for clinical medicine. It plays a crucial role in the diagnosis of a broad range of anomalies in the human heart; from arrythmias to myocardial infarction.
The widely available digital ECG data has facilitated the development of algorithms for ECG processing and interpretation. In particular, the literature for computerized arrhythmia detection and classification is extensive. Useful review matterial LSC16 ; LMM18 can help with the introduction to state of the art techniques, which nonetheless keeps growing AFA17 ; HRH19 ; ANP18 ; LYM17 ; Pla18 .
A common first step in ECG modeling consists in reducing the dimensionality of the signal. This entails to represent the informational content of the record by means of significantly fewer parameters than the number of samples in the digital ECG. When the aim is to reproduce the original signal at low level distortion, the step is frequently realized through transformations such as the Wavelet Transform and the Discrete Cosine Transform. In the last few years alternative approaches, falling within the category of sparse representation of ECG signals, have been considered.
Within the sparse representation framework, an ECG record is represented as a linear combination of elementary components, called atoms, which are selected from a redundant set, called a dictionary. The success of the methods developed within this framework depends on both, the selection technique and the proposed dictionary. The selection techniques which are widely applied for sparse representation of general signals are either greedy pursuit strategies MZ92 ; PRK93 ; RNL02 , or strategies based on minimization of the 1norm as a cost function CDS98 . Suitable dictionaries depend on the class of signals being processed. These can be designed at hoc or be learned from training data. Sparse representation of ECG signals has been tackled by both these approaches, e.g. AGL15 learns dictionaries using some part of the records for ECG compression and RR18 uses Gabor dictionaries for structuring features for classification.
In a recent publication RNC19 we have shown that wavelet dictionaries, derived from known wavelet families, are suitable for representing an ECG record as a linear combination of fewer elementary components than those required by a wavelet basis. The model was shown to be successful for dimensionality reduction and lossy compression. As far as compression is concerned the method advanced in RNC19 produces compression results improving upon previously reported benchmarks LKL11 ; MZD15 ; PSS16 ; TZW18 for the MITBIH Arrhythmia data set without preprocessing. With regard to dimensionality reduction, wavelet dictionaries considerably improve upon the results achieved with the wavelet basis of the same family RNC19 ; LRN19 . This result motivated the present Communication. While in RNC19 the dictionaries have been used to demonstrate their suitability for dimensionality reduction of ECG signals at low level distortion, the details of their numerical construction were not given. This paper complements the previous work by presenting the algorithms for building dictionaries from the following mother wavelet prototypes:

ChuiWang linear spline wavelet CW92

ChuiWang quadratic spline wavelet CW92

ChuiWang cubic spline wavelet CW92

primal CDF97 wavelet CDF92

dual CDF97 wavelet CDF92

primal CDF53 wavelet CDF92

Daubechies wavelet with vanishing moments Daub88

Daubechies wavelet with vanishing moments Daub88

Daubechies wavelet with vanishing moments Daub88

symlet with vanishing moments Daub93

symlet with vanishing moments Daub93

symlet with vanishing moments Daub93

coiflet with 2 vanishing moments and support of length 6 which is the most regular Daub93

coiflet with 3 vanishing moments and support of length 8 Daub93
The method proposed in RNC19 for modelling a given ECG signal proceeds as follows. Assuming that the signal is given as an dimensional array, this array is partitioned into cells . Thus, each cell is an
dimensional vector, which is modeled by an atomic decomposition of the form
(1) 
For each cell , the atoms are selected from a dictionary through the greedy Optimized Orthogonal Matching Pursuit (OOMP) algorithm RNL02 ; LRN16 . The array is a vector whose components contain the indices of the selected atoms for decomposing the th cell in the signal partition. The OOMP method, for selecting these indices and computing the corresponding coefficients in (1), is fully implemented by the OOMP function included as a tool of the software.
Each of the proposed dictionaries consists of two components. One of the components contains a few elements, say , from a discrete cosine basis. This component of the dictionary allows for the fact that ECG signals are normally superimposed to a smooth background. It is given as a matrix . The other component is the waveletbased dictionary, which is given as a matrix . Thus, the whole dictionary is an matrix obtained by the horizontal concatenation of and . The next section is dedicated to the construction of .
The paper is organized as follows. Sec. 2 gives all the details for the construction of different wavelet prototypes and the concomitant wavelet dictionaries generated by those prototypes. Secs 3 and 4 deliver details and examples demonstrating the use of the MATLAB software for modelling ECG signals within the proposed framework.
The software has been made available on a dedicated webpage RNC19a . The implementation allows for straightforward further extension of the options for wavelet types.
2 Method
In this section we produce all the pseudocodes for the construction of wavelets dictionaries, which can be used to achieve the model of every segment in a signal partition. As already mentioned, each dictionary is obtained by taking the prototypes from a wavelet basis and translating them within a shorter step than that corresponding to the wavelet basis.
Throughout the paper we adopt the following notation. Boldface fonts are used to indicate Euclidean vectors and matrices. Standard mathematical fonts are used to indicate components, e.g., is a vector of components and is a matrix of elements . The symbol denotes the space of square integrable functions.
Wavelets are usually constructed starting from a multiresolution analysis, which is a sequence of closed subspaces of the space which are nested and their union is dense in , i.e.,
(2) 
We assume that there exists a function such that for functions
(3) 
form uniformly stable bases of the spaces , i.e., the bases are Riesz bases with bounds independent of the level , see e.g. Cohen2003 . The functions are called scaling functions and the function is called a generator of scaling functions. Next we present a method for the actual construction of the scaling functions.
2.1 Generation of scaling functions
We assume that has a compact support for some . From the nestedness of the multiresolution spaces , it follows that there exists a scaling filter such that
(4) 
If then, integrating (4), we obtain
(5) 
which implies that has to be normalized such that
(6) 
The scaling equation (4) enables computing values of the scaling function at points for , . First we compute values of at integer points. Since , we have for . Let us define a vector
(7) 
where the indicates the transpose operation. Substituting into (4), we obtain
We set for and and define a matrix by
(9) 
Then, (2.1) is equivalent to
(10) 
This means that
is an eigenvector corresponding to the eigenvalue
of the matrix . If the multiplicity of this eigenvalue is , then is given uniquely up to a multiplication by a constant. Our aim is to compute a vector such that(11) 
for a chosen level . From (7) and (11) we have
(12) 
We compute values of at points . Note that for even we already know these values. Using (4) and (12) we obtain
for . Similarly, we compute values of at points , and thus we continue until we determine values at points . More precisely, for we assume that we know values of at , and we compute the values
(14) 
Using (4) we obtain
Remark 1
Some scaling functions such as spline scaling functions are known in an explicit form and their values can be evaluated directly. However, an advantage of our approach is that it is more general and can be used for a large class of wavelet families.
2.2 Construction of wavelet generators from scaling functions
Let be complement spaces such that , where denotes a direct sum. Wavelet functions are constructed in the form:
(16) 
to be a basis for and such that
(17) 
called a wavelet basis, is a Riesz basis of the space .
Since there exists a vector such that
(18) 
The vector is called a wavelet filter. From (18) we have
(19) 
In Algorithm 1 we compute a vector such that
(20) 
in the following way. Due to (18) and (20), we have
(21)  
The sum in the last equation is computed as a cyclic sum. For
(22) 
we set and for we do
(23) 
if . Using the substitution
(24) 
for , we obtain
(25) 
Algorithm 1 computes vectors and for given scaling and wavelet filters. The filters corresponding to the wavelet families supported by the software are given in Appendix A (Algorithm 8).
scaling filter  
wavelet filter  
level (integer) that determines points 
2.3 Construction of wavelet bases and dictionaries
Hereafter we drop all normalization factors and normalize all the vectors once they have been constructed. Note that in (17) we used a translation parameter and since is a Riesz basis the functions from are linearly independent. Now, we choose a parameter such that for some integer . We define functions
(26) 
and
(27) 
which form a redundant dictionary ARN05 ; ARN08 ; RNX10 . Obviously, corresponds to a basis.
The left graph of Figure 1 shows two consecutive wavelet functions taken from a linear spline bases CW92 . The right graph of Figure 1 corresponds to two consecutive wavelet functions taken from the dictionary spanning the same space which corresponds to .
Algorithm 2 constructs a discrete dictionary, i.e., a matrix which contains values of functions from (26) and (27) at equidistant points for some chosen levels determined by the vector . Since Algorithm 1 enables us to construct values at points of the form , we evaluate functions (26) and (27) at the points
(28) 
where denotes the smallest integer number larger than .
For a chosen vector of levels , we define a vector of indices such that is the number of scaling functions at level , and is the number of wavelets at level for , where is the length of . We have
(29) 
Comparing the supports of these functions and the interval
(30) 
which contains the points from (28), we find that the number of inner scaling functions, i.e., scaling functions with the whole support in , is
(31) 
where the symbol denotes the largest integer number smaller than . The number of left boundary scaling functions, i.e., functions that have only a part of the support in the interior of and their support contains , is
(32) 
and similarly the number of right boundary scaling functions is
(33) 
Hence, we have
(34) 
Similarly, the number of wavelet functions on the level is
(35) 
The first columns of contain values of scaling functions (26), which restricted to are not identically zero, at points given in (28), i.e.,
(36) 
for , . The above equation can be recast:
where is defined by (11) for the level . Using the substitution , we obtain
(38) 
under the assumption that .
The other columns of contain values of wavelet functions (27) for levels at points (28), i.e.,
(39) 
for , , and . Similarly as above we obtain
(40) 
where and is defined by (20) for the level .
The following procedure WaveletDict computes a wavelet dictionary.
namef  name of a wavelet family, for available choices see Appendix A 
number of points  
vector of levels  
translation factor for some integer 
wavelet dictionary  
is the number of scaling functions at level , and for is the number of wavelets at level  
cell array such that if the th column of corresponds to values of a scaling function or a wavelet ; type=‘inner’ or ‘boundary’ characterizes type of a function; function=‘scaling’ or ‘wavelet’ indicates whether the column corresponds to the values of a scaling function or a wavelet 
The main procedure GenDict validates input parameters, generates dictionaries and normalizes their columns.
name  name of a wavelet family, for available choices see Appendix A 
pars  parameters in the form pars = 
number of points  
vector of levels  
translation factor for some integer 
wavelet dictionary  
is the number of scaling functions at level , and for is the number of wavelets at level  
cell array such that , if the th column of corresponds to values of scaling function or wavelet ; type=‘inner’ or ‘boundary’ characterizes type of a function; function=‘scaling’ or ‘wavelet’ indicates whether the column corresponds to the values of a scaling function or a wavelet 
Remark 2
It is worth remarking that the range of scales, say depends on length of the signal partition. For a signal segment of length a dictionary contains values of scaling functions and wavelets at points for some integer . For a signal segment of length a dictionary contains values of functions at points . Thus we have
(41) 
and
(42) 
Therefore, nonzero elements of vectors on the level in a dictionary for correspond to nonzero elements of vectors on the level in a dictionary for . This situation is illustrated in Figure 2, where vectors of values are displayed for and and for and . Note that the nonzero elements in these vectors are the same. Therefore, if for the signal segment of length the vector is used, then we recommend to use the vector for the signal segment of length , and similarly to use levels for the signal segment of length .
To build dictionaries for the wavelet family ‘Short3’ at levels 2 and 3, for translation parameter , and the number of points , use the procedure TestDict below.
The output is the matrix of size and the vector . This means that there are scaling functions at level 2, 27 wavelets at level 2, and 43 wavelets at level 3. The cell array characterizes functions corresponding to columns of . For example
(43) 
which means that th column of the matrix contains values of a wavelet function . This wavelet is a boundary wavelet, i.e., only a part of its support lies in the interval defined by (30). Some of the vectors from this dictionary corresponding to values of scaling functions are displayed in Figure 3 and some of the vectors corresponding to values of wavelets are displayed in Figure 4.
2.4 Construction of dictionaries for ECG modelling
As mentioned in Sec. 1, because ECG signals are usually superimposed to a baseline or smooth background, the full dictionary we use for ECG modelling is built as follows
(44) 
where is the output of Algorithm 5 and is a matrix containing a few low frequency components from a discrete cosine basis. Before normalization is given as
(45) 
where is a small number in comparison to . For the numerical examples of the next section we consider . Algorithm 5 computes .
the size of the Euclidean space the vectors should belong to  
number of frequencies to use starting from 0 
matrix whos columns are discrete cosine vectors 
2.5 Method for construction of the model
In this section we present the procedures for constructing the ECG signal model (c.f. Algorithm 6) and for calculating the assessment metrics. The quality of the signal approximation is assessed with respect to the defined as follows
(46) 
where is the original signal and is the signal reconstructed by concatenation of the approximated segments .
The local with respect to every segment in the signal partition is indicated as and calculated as
(47) 
For the signal approximation the OOMP method is stopped through a fixed value so as to achieve the same value of for all the segments in the records. Assuming that the target before quantization is we set .
The goal of the signal model is to approximate each segment in the signal partition using as few atoms as possible. Thus, for a fixed value of , the sparsity of the signal representation is assessed by the sparsity ratio (SR)
(48) 
where is the total length of the signal and with the number of atoms in the atomic decomposition (1) of each segment of length . The corresponding quantity evaluated for every cell in the partition is the local sparsity ratio
(49) 
This local quantity is relevant to the detection of nonstationary noise, significant distortion in ECG patterns, or changes of morphology in the heart beats.
Given an ECG signal the procedure described in Algorithm 6 constructs the signal approximation, , using the dictionaries introduced in the previous section.
signal  
number of points in each segment of the partition  
parameter to control the approximation error  
name of a wavelet family  
parameters as described in Algorithm 3  
number of components in the cosine subdictionary 
approximated signal  
Comments
There are no comments yet.