
A matrix-free high-order solver for the numerical solution of cardiac electrophysiology

We propose a matrix-free solver for the numerical solution of the cardiac electrophysiology model consisting of the monodomain nonlinear reaction-diffusion equation coupled with a system of ordinary differential equations for the ionic species. Our numerical approximation is based on the high-order Spectral Element Method (SEM) to achieve accurate numerical discretization while employing a much smaller number of Degrees of Freedom than first-order Finite Elements. We combine sum-factorization with vectorization, thus allowing for a very efficient use of high-order polynomials in a high performance computing framework. We validate the effectiveness of our matrix-free solver in a variety of applications and perform different electrophysiological simulations ranging from a simple slab of cardiac tissue to a realistic four-chamber heart geometry. We compare SEM to SEM with Numerical Integration (SEM-NI), showing that they provide comparable results in terms of accuracy and efficiency. In both cases, increasing the local polynomial degree p leads to better numerical results and smaller computational times than reducing the mesh size h. We also implement a matrix-free Geometric Multigrid preconditioner that entails better performance in terms of linear solver iterations than state-of-the-art matrix-based Algebraic Multigrid preconditioners. As a matter of fact, the matrix-free solver here proposed yields up to 50× speed-up with respect to a conventional matrix-based solver.


1 Introduction

Mathematical and numerical modeling of cardiac electrophysiology provides meaningful tools to address clinical problems in silico, ranging from the cellular to the organ scale Gerach_2021; Gray_2018; Piersanti_2022; Potse_2014; Strocchi_2020. For this reason, several mathematical models and methods have been designed to perform electrophysiological simulations Quarteroni_2019; Trayanova_2011. Among these, we consider the monodomain equation coupled with suitable ionic models, which describes the space–time evolution of the transmembrane potential and how chemical species move across ionic channels ColliFranzone_2014.

This set of coupled partial and ordinary differential equations admits solutions that resemble those of a wavefront propagation problem, i.e. featuring very steep gradients. Despite being extensively used Arevalo_2016; Bayer_2022; Gillette_2021; MendoncaCosta_2019, the Finite Element Method (FEM) with first-order polynomials does not seem to be the most suitable choice to properly capture the physical processes underlying cardiac electrophysiology quarteroni_1994. Indeed, in such cases a very fine mesh resolution is required to obtain fully converged numerical results Woodworth_2021, which entails an overwhelming computational burden.

High–order numerical methods come into play to tackle this specific issue: the Spectral Element Method (SEM) patera_1984; maday_1989; chqz07, high–order Discontinuous Galerkin (DG) Arnold-2001; Cockburn-1998, the Finite Volume Method (FVM) Leveque_2002, or Isogeometric Analysis (IGA) chb_iga_book feature small numerical dispersion and dissipation errors while providing convergent numerical solutions with fewer Degrees of Freedom (DOFs) Bucelli_2021; Cantwell_2014; Coudiere_2017; Hoermann-2018. However, the use of high–order polynomials in matrix–based solvers for complex scenarios has been hampered by several numerical challenges, mostly related to the stiffness of the discretized monodomain problem Vincent_2015.

In this context, we develop and implement a high–order matrix–free numerical solver that can be readily employed for CPU–based, massively parallel, large–scale numerical simulations. Since there is no need to assemble any matrix, all the floating point operations are associated with matrix–vector products that represent the most demanding computational kernels at each iteration of iterative solvers. Thanks to vectorization Arndt_2020, which enables algebraic operations on multiple mesh cells at the same time, and sum–factorization Orszag_1980; Melenk_2001, the higher the polynomial degree, the higher the computational advantages provided by the matrix–free solver Cantwell_2014; Kronbichler_2012. Moreover, the small memory occupation required by the matrix–free implementation allows for its exploitation in GPU–based cardiac solvers Xia_2015; kronbichler_multigrid; DelCorso_2022.

In this manner, we obtain very accurate and efficient numerical simulations for cardiac electrophysiology, even if the linear solver remains unpreconditioned. Additionally, we implement a matrix–free Geometric Multigrid (GMG) preconditioner that is optimal for the values of the mesh size h and of the polynomial degree p considered in this paper when continuous model properties (i.e., a single ionic model and a continuous set of conductivity coefficients) are employed throughout the computational domain.

We present different benchmark problems of increasing complexity for cardiac electrophysiology, ranging from the Niederer benchmark on a slab of cardiac tissue niederer2011 to a whole–heart numerical simulation. We focus on two high–order discretization methods, namely, we compare SEM to SEM with Numerical Integration (SEM–NI), following the notation introduced in chqz07. These two methods differ in the use of quadrature formulas: Legendre–Gauss formulas for SEM, Legendre–Gauss–Lobatto for SEM–NI. The numerical results of Section 5 show that the two methods feature a similar behaviour in terms of both accuracy and computational costs. In both cases, choosing a higher polynomial degree p leads to a considerably more favourable ratio between accuracy and computational cost than reducing the mesh size h. For instance, working with two discretizations with the same number of DOFs on the Niederer benchmark, the solution computed with p = 4 (local polynomials of degree 4 with respect to each spatial coordinate) and average mesh size h = 0.48 mm is more accurate than the one obtained with p = 1 and average mesh size h = 0.12 mm. Moreover, the former has been computed at a computational cost that is about 40% of the latter.

We also evaluate the performance of our matrix–free solver: a 50× speed–up is achieved with respect to the matrix–based solver. Furthermore, while the assembling and solving phases of the monodomain problem take more than 70% of the total computational time with the matrix–based implementation, this value drops to approximately 20% with the matrix–free solver.

The mathematical models and the numerical methods contained in this paper have been implemented in lifex (https://lifex.gitlab.io/), a high-performance C++ library developed within the iHEART project and based on the deal.II (https://www.dealii.org) Finite Element core dealII92.

The outline of the paper is as follows. We describe the monodomain model in Section 2. We address its space and time discretizations in Section 3. We propose the matrix–free solver for cardiac electrophysiology and the matrix–free GMG preconditioner in Section 4. Finally, the numerical results in Section 5 prove the high efficiency of our high–order SEM matrix–free solver against the low–order FEM matrix–based one.

2 Mathematical model

For the mathematical modeling of cardiac electrophysiology, we consider the monodomain equation coupled with suitable ionic models (Quarteroni_2019; ColliFranzone_2014)

(1)

The unknowns are: the transmembrane potential, the vector of the probability density functions of the gating variables, which represent the fraction of open ionic channels across the membrane of a single cardiomyocyte, and the vector of the concentrations of the ionic species. For the sake of simplifying the notation, in the following the membrane capacitance per unit area and the membrane surface–to–volume ratio are set equal to one.

The mathematical expressions of the functions describing the dynamics of the gating variables and of the ionic concentrations, as well as that of the ionic current, strictly depend on the choice of the ionic model. Here, the TTP06 tentusscher_2006 ionic model is adopted for the slab and ventricular geometries, while the CRN courtemanche1998 ionic model is employed for the atria. The action potential is triggered by an external applied current.
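For the reader's convenience, a standard form of the coupled monodomain–ionic system (1) (see, e.g., ColliFranzone_2014), with the membrane capacitance and surface–to–volume ratio normalized to one, is sketched below; the symbols u, w, z for the transmembrane potential, gating variables and ionic concentrations, D for the diffusion tensor and I_ion, I_app for the ionic and applied currents are our own notational choice and need not match the original ones.

    \partial_t u - \nabla \cdot ( D \nabla u ) + I_{\mathrm{ion}}(u, \mathbf{w}, \mathbf{z}) = I_{\mathrm{app}}(\mathbf{x}, t)  \quad \text{in } \Omega \times (0, T),
    \partial_t \mathbf{w} = \mathbf{F}(u, \mathbf{w})  \quad \text{in } \Omega \times (0, T),
    \partial_t \mathbf{z} = \mathbf{G}(u, \mathbf{w}, \mathbf{z})  \quad \text{in } \Omega \times (0, T),

supplemented by the homogeneous Neumann boundary condition described below and by suitable initial conditions; F and G collect the ionic-model right-hand sides for the gating variables and the ionic concentrations, respectively.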

The diffusion tensor is expressed as follows

(2)

where the three vector fields appearing in (2) express the fiber, the sheetlet and the sheet–normal (cross–fiber) directions, respectively, and are weighted by the corresponding longitudinal, transversal and normal conductivities Piersanti_2021.
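In the common notation of the cited literature (with f0, s0, n0 denoting the fiber, sheetlet and sheet–normal unit fields and sigma_l, sigma_t, sigma_n the corresponding conductivities; these symbols are our assumption), the tensor in (2) typically reads

    D = \sigma_{l} \, \mathbf{f}_0 \otimes \mathbf{f}_0 + \sigma_{t} \, \mathbf{s}_0 \otimes \mathbf{s}_0 + \sigma_{n} \, \mathbf{n}_0 \otimes \mathbf{n}_0 .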

In this paper, the computational domain is represented either by a slab of cardiac tissue or by the Zygote geometry Zygote. Homogeneous Neumann boundary conditions, involving the outward unit normal vector to the boundary, are prescribed on the whole boundary to impose the condition of an electrically isolated domain.

3 Space and time discretizations

In order to discretize in space the system (1), we adopt SEM patera_1984; maday_1989; chqz07; bernardi2001; karniadakis_sherwin, a high-order method that can be recast in the framework of the Galerkin method quarteroni_1994.

We consider a family of hexahedral conforming meshes satisfying standard assumptions of regularity and quasi-uniformity quarteroni_1994, and we denote by h the mesh size.

At each time, we look for the discrete solution in the space of globally continuous functions that are, on each mesh element, tensor products of univariate polynomial functions of local degree p with respect to each coordinate; this defines the local finite element space on each cell and, correspondingly, the global finite dimensional space.

When using SEM, the univariate basis functions are of Lagrangian (i.e., nodal) type and their support nodes are the Legendre–Gauss–Lobatto quadrature nodes (see, e.g., (chqz06, Ch. 2)), suitably mapped from the reference interval to the local 1D elements.

One of the main features of SEM is that, when the data are smooth enough, the induced approximation error decays more than algebraically fast with respect to the local polynomial degree: SEM is said to feature exponential, or spectral, convergence. At the same time, the convergence with respect to the mesh size behaves as in FEM. More precisely, if the exact solution of a linear second–order elliptic problem in a Lipschitz domain is sufficiently regular and we consider its SEM approximation, the following error estimate holds

(3)
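A classical statement of this estimate (see, e.g., chqz07) is reported below with our own symbols: u denotes the exact solution, u_h its SEM approximation, h the mesh size, p the polynomial degree and s the Sobolev regularity index of u.

    \| u - u_h \|_{H^1(\Omega)} \le C \, h^{\min(p, s)} \, p^{-s} \, \| u \|_{H^{s+1}(\Omega)},

with C independent of h and p. For a fixed mesh and an analytic solution, the right–hand side decays faster than any power of 1/p, which is the spectral convergence mentioned above.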

SEM can be considered as a special case of FEM (karniadakis_sherwin; szabo_babuska_1991; schwab_1998) with nodal basis functions and conforming hexahedral meshes.

Typically, when using SEM, the integrals appearing in the Galerkin formulation of the differential problem (1) are computed by composite Legendre–Gauss (LG) quadrature formulas (see chqz07; chqz06). In principle, one can choose LG formulas of the desired order of exactness to guarantee a highly accurate computation of all the integrals appearing in (1). However, a typical choice is to use LG formulas with p + 1 quadrature nodes along each direction, which guarantees that the entries of both the mass matrix and the stiffness matrix with constant coefficients are computed exactly, while keeping the computational costs moderate Kronbichler_2012; Fehn_2019.

A considerable reduction of the computational time spent in evaluating the integrals is obtained by using Legendre–Gauss–Lobatto (LGL) quadrature formulas (instead of LG ones), again with p + 1 nodes per direction, which now coincide with the support nodes of the Lagrangian basis functions.

This results in the so–called SEM–NI method (NI standing for Numerical Integration). Since the Lagrangian basis functions are mutually orthogonal with respect to the discrete inner product induced by the LGL formulas, the mass matrix of the SEM–NI method is diagonal, although no longer integrated exactly; this is a great strength of SEM–NI in solving time–dependent differential problems through explicit methods when the mass matrix is assembled. On the other hand, since the degree of exactness of LGL quadrature formulas is lower than that of LG formulas with the same number of nodes, the integrals associated with the nonlinear terms of the differential problem may introduce quadrature errors and aliasing effects that become more significant as the nonlinearities become stronger.
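For reference, the degrees of exactness behind the two choices are standard facts on Gaussian quadrature: with p + 1 nodes along each direction,

    LG  (Legendre–Gauss)         : exact for univariate polynomials of degree up to 2p + 1,
    LGL (Legendre–Gauss–Lobatto) : exact for univariate polynomials of degree up to 2p - 1.

The two degrees lost by LGL are what makes the SEM–NI mass matrix inexact (and diagonal) and what may trigger the aliasing effects mentioned above.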

We remark that, for p = 1, SEM coincides with FEM, while SEM–NI is in fact FEM with the integrals approximated by the trapezoidal quadrature rule quarteroni_1994.

We choose the same local polynomial degree p (and thus the same finite dimensional space) for approximating the transmembrane potential, the gating variables and the ionic concentrations at each time.

All the time derivatives appearing in Eq. (1) are approximated by the 2nd–order Backward Differentiation Formula (BDF2) over a discrete set of time steps with constant time step size.

Regardless of the quadrature formula (LG or LGL), the algebraic counterpart of the monodomain problem (1) reads as follows: given the discretized applied current and suitable initializations of the transmembrane potential, gating variable and ionic concentration DOFs, at each time step we find the updated DOF vectors by solving the following partitioned scheme:

(4)

The three arrays contain the SEM or SEM–NI DOFs of the transmembrane potential, of the gating variables and of the ionic concentrations, respectively, while the mass and stiffness matrices are those of the SEM or SEM–NI discretization. The entries of the ionic current vector are computed with Ionic Current Interpolation (ICI) Regazzoni_2022, i.e.,

(5)

where the Lagrange basis functions of the finite dimensional space appear. We remark that, when SEM–NI with LGL quadrature formulas is employed, ICI coincides with Lumped–ICI quarteroni2017, as the lumping of the SEM mass matrix yields precisely the SEM–NI mass matrix.
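A sketch of the ICI evaluation in (5), written with our own symbols (V_k, w_k, z_k for the nodal values of the discrete unknowns, phi_k for the Lagrange basis functions and M for the mass matrix): the ionic current is first interpolated onto the finite element space through its nodal values and then tested against each basis function,

    \left( \mathbf{I}_{\mathrm{ion}} \right)_j
      = \int_{\Omega} \Big( \sum_{k} I_{\mathrm{ion}}(V_k, \mathbf{w}_k, \mathbf{z}_k) \, \varphi_k(\mathbf{x}) \Big) \, \varphi_j(\mathbf{x}) \, d\mathbf{x}
      = \sum_{k} M_{jk} \, I_{\mathrm{ion}}(V_k, \mathbf{w}_k, \mathbf{z}_k).

When the mass matrix is the diagonal SEM–NI one, the sum collapses to a single term, which is the Lumped–ICI evaluation mentioned above.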

Treating the ionic current implicitly would yield the fully implicit BDF2 scheme; however, the ionic current is typically a strongly nonlinear function of its arguments. To overcome the drawbacks of this nonlinearity, we evaluate it on a second–order extrapolation of the solution at the previous time steps, so that the resulting semi–implicit scheme remains 2nd–order accurate in time (see, e.g., gervasio2006).

The ordinary differential equations in (4) are associated with the ionic model and provide both the gating variables and the ionic species, while the remaining equation in (4) is the discretization of the monodomain equation; its solution at the generic time step is obtained by solving the linear system

(6)

where

(7)

Solving the system (6) represents the most computationally demanding part of (4). We refer to Section 5, in particular to Table 8, for further details about this specific aspect.
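Although the precise expressions in (6)–(7) depend on the chosen extrapolation and on the quadrature formula, with the BDF2 discretization described above the system takes, up to the treatment of the ionic term, the sketched form below; the symbols M and K for the mass and stiffness matrices, Delta t for the time step and V^n for the vector of potential DOFs at the n-th time step are our own notation.

    A = \frac{3}{2 \Delta t} M + K, \qquad
    A \, V^{n+1} = \frac{1}{2 \Delta t} M \left( 4 V^{n} - V^{n-1} \right) - \mathbf{I}_{\mathrm{ion}}^{*} + \mathbf{I}_{\mathrm{app}}^{n+1},

where I_ion^* denotes the ICI evaluation of the ionic current on the extrapolated variables and I_app^{n+1} the discretized applied current.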

4 Matrix–free and matrix–based solvers

As in FEM, the matrix arising from either SEM or SEM–NI has a very sparse structure, thus iterative methods are the natural candidates to solve the linear system (6). Since the matrix is symmetric and positive definite, we have adopted the Conjugate Gradient (CG) method or its preconditioned version (PCG).

Excluding the preconditioner step, the most expensive part of one CG iteration is the evaluation of a matrix–vector product with a given vector.

Typically, in a conventional matrix–based solver, the matrix is assembled and stored in sparse format, then referenced whenever the matrix–vector product has to be evaluated, i.e. during each iteration of the CG method. The matrix–based solver aims at minimizing the number of floating point operations required for such an evaluation and is a winning strategy for low–order FEM discretizations, for which the bandwidth of the matrix is small.

When SEM (or SEM–NI) of local degree p is employed in the discretization process, each cell counts (p + 1)^3 DOFs. It follows that the number of nonzero entries per row of SEM (or SEM–NI) stiffness matrices grows like p^3 (further multiplied by the maximum number of cells sharing a node of the mesh) and widely exceeds that of low–order FEM stiffness matrices. The large bandwidth of the SEM matrix can worsen the time needed to access the matrix entries, thus deteriorating the efficiency of the iterative solver.

Moreover, in modern processors, access to the main memory has become the bottleneck in many solvers for partial differential equations: a matrix–vector product based on stored matrices spends far more time waiting for data to arrive from memory than actually doing the floating point operations. Thus, it proves to be more efficient to recompute matrix entries – or rather, the action of the differential operator represented by these entries on a known vector, cell by cell – than to look up matrix entries in memory, even if the former approach requires a significant number of additional floating point operations Kronbichler_2012.

This approach is referred to as matrix–free. In practice, shape function values and gradients are pre-computed for each basis function on the reference cell, at each quadrature node. Then, the Jacobian of the transformation between the reference and the real cell is cached, thus reducing the cost of each operator evaluation.
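A minimal sketch of such a cell–wise, quadrature–based operator application is reported below, written in the style of the deal.II matrix–free framework on which lifex relies; the class name, the template parameters and the restriction to a constant–coefficient Laplacian are our illustrative assumptions and do not reproduce the actual lifex code.

    #include <deal.II/lac/la_parallel_vector.h>
    #include <deal.II/matrix_free/fe_evaluation.h>
    #include <deal.II/matrix_free/matrix_free.h>

    using namespace dealii;

    template <int dim, int fe_degree>
    class LaplaceOperator
    {
    public:
      using Number     = double;
      using VectorType = LinearAlgebra::distributed::Vector<Number>;

      LaplaceOperator(const MatrixFree<dim, Number> &mf) : matrix_free(mf) {}

      // dst = K * src, evaluated cell by cell without ever assembling K.
      void vmult(VectorType &dst, const VectorType &src) const
      {
        dst = 0.;
        matrix_free.cell_loop(&LaplaceOperator::local_apply, this, dst, src);
      }

    private:
      void local_apply(const MatrixFree<dim, Number> &data,
                       VectorType                    &dst,
                       const VectorType              &src,
                       const std::pair<unsigned int, unsigned int> &cell_range) const
      {
        // FEEvaluation works on batches of cells packed into SIMD lanes
        // (VectorizedArray) and applies sum-factorization internally.
        FEEvaluation<dim, fe_degree> phi(data);
        for (unsigned int cell = cell_range.first; cell < cell_range.second; ++cell)
          {
            phi.reinit(cell);
            phi.read_dof_values(src);
            phi.evaluate(EvaluationFlags::gradients);
            for (unsigned int q = 0; q < phi.n_q_points; ++q)
              phi.submit_gradient(phi.get_gradient(q), q); // isotropic diffusion for brevity
            phi.integrate(EvaluationFlags::gradients);
            phi.distribute_local_to_global(dst);
          }
      }

      const MatrixFree<dim, Number> &matrix_free;
    };

An object of this kind only needs to expose vmult() to be plugged into the CG solver, exactly like an assembled sparse matrix would.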

A matrix–free solver can also benefit from vectorization, which brings an additional speed–up. In FEM solvers (and, similarly, in SEM ones), the cell–wise computations are typically exactly the same for all cells, hence a Single–Instruction, Multiple–Data (SIMD) paradigm can be used to process several values at once (see Figure 1): one CPU instruction operates simultaneously on data coming from multiple cells. Modern CPUs support SIMD instruction sets to different extents, i.e. a single CPU instruction can simultaneously process from two doubles (four floats) up to eight doubles (sixteen floats), depending on the underlying architecture cebrian2020scalability. Vectorization can be easily combined with a parallel framework zhong2022using, which in our case results in the scheme shown in Figure 2.

Figure 1: Comparison between scalar and vectorized operations.
(a) Original mesh.
(b) Parallel partitioning of mesh cells.
(c) Vectorization of cell operations on each parallel unit.
Figure 2: Vectorization in a parallel framework: when the original mesh is partitioned among multiple computational units, vectorization is applied at the level of each process. Here, different colors refer to different parallel units and light/dark variations represent different vectorized batches (in this example, vectorization acts on 8 cells).

Vectorization is typically not used explicitly in matrix–based Finite Element codes as it is beneficial only in arithmetic intensive operations, whereas additional computational power becomes useless when the workload bottleneck is the memory bandwidth.

When vectorization is available, it can be more convenient to repeat the arithmetic operations that compute the action of the matrix on a known vector every time it is needed, instead of accessing the stored matrix entries. Moreover, since the multivariate SEM Lagrange basis is of tensorial type, sum–factorization can be exploited to reduce the computational complexity of one evaluation of the matrix–vector product Orszag_1980; Melenk_2001; Kronbichler_2019: in a generic three–dimensional cell the product requires only O(p) floating point operations per degree of freedom instead of O(p^3), i.e. a cost per cell of order p^4 instead of p^6, and this plays in favor of repeating the computation rather than accessing the memory (see (Cantwell_2011, Sect. 2.3.1) and (chqz06, Sect. 4.5.1)).
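To illustrate the idea behind sum–factorization, the following self–contained toy code (our own sketch, not part of lifex) applies a tensor–product operator, e.g. the interpolation from the (p + 1)^3 nodal values of one cell to its tensor–product quadrature points, one direction at a time, so that the cost per cell is 3 (p + 1)^4 operations instead of the (p + 1)^6 of a naive application of the full cell matrix.

    #include <vector>

    // Index of node (i, j, k) in lexicographic ordering, n = p + 1 per direction.
    static int idx(int i, int j, int k, int n) { return i + n * (j + n * k); }

    // Apply the tensor-product operator A (x) A (x) A to the n^3 values in u,
    // where A is a 1D matrix of size n x n stored row-major.
    // Each of the three passes below costs n^4 multiply-adds.
    void sum_factorized_apply(const std::vector<double> &A,
                              std::vector<double>       &u,
                              const int                  n)
    {
      std::vector<double> v(u.size(), 0.0);

      // Contract along direction 0 (index i).
      for (int k = 0; k < n; ++k)
        for (int j = 0; j < n; ++j)
          for (int a = 0; a < n; ++a)
            {
              double sum = 0.0;
              for (int i = 0; i < n; ++i)
                sum += A[a * n + i] * u[idx(i, j, k, n)];
              v[idx(a, j, k, n)] = sum;
            }
      u.swap(v);

      // Contract along direction 1 (index j).
      for (int k = 0; k < n; ++k)
        for (int a = 0; a < n; ++a)
          for (int i = 0; i < n; ++i)
            {
              double sum = 0.0;
              for (int j = 0; j < n; ++j)
                sum += A[a * n + j] * u[idx(i, j, k, n)];
              v[idx(i, a, k, n)] = sum;
            }
      u.swap(v);

      // Contract along direction 2 (index k).
      for (int a = 0; a < n; ++a)
        for (int j = 0; j < n; ++j)
          for (int i = 0; i < n; ++i)
            {
              double sum = 0.0;
              for (int k = 0; k < n; ++k)
                sum += A[a * n + k] * u[idx(i, j, k, n)];
              v[idx(i, j, a, n)] = sum;
            }
      u.swap(v);
    }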

On these bases, a high–order matrix–free solver is more efficient both in terms of memory occupation (no system matrix is assembled and stored globally) and computational time Kronbichler_2012, as we will also show in Sect. 5.

To precondition the CG method we have chosen multigrid preconditioners. For the matrix–based solver, the Algebraic Multigrid (AMG) preconditioner janssen2011; Xu_2017 turns out to be a very efficient choice. Nevertheless, its implementation requires the explicit knowledge of the matrix entries; for this reason, this preconditioner cannot be used in a matrix–free context.

To overcome this drawback, in the matrix–free case we have adopted a Geometric Multigrid (GMG) preconditioner, more precisely the high–order multigrid preconditioner of Sundar-2015, which uses high–order interpolation and restriction among geometrically coarsened meshes. GMG methods are among the most efficient solvers for linear systems arising from the discretization of elliptic partial differential equations, offering a complexity that is optimal in the number of unknowns, and they are often used as very efficient preconditioners (see kronbichler_multigrid; janssen2011; Trottenberg; Clevenger2021 and the literature cited therein).

GMG turns out to be very efficient in a matrix–free context because all its computational kernels, including the Chebyshev smoother and the transfer operators between grid levels, are based on matrix–vector products over suitable collections of mesh cells Adams_2003. In our implementation, the GMG preconditioner exploits a hierarchy of meshes built by the recursive subdivision of each cell into 8 subcells, starting from a coarse mesh, as shown in Figure 3.

Figure 3: Schematization of multigrid methods. Starting from the fine mesh, the action of the restriction and prolongation operators is shown, down to the coarse mesh.

5 Numerical results

We present several numerical simulations of cardiac electrophysiology. First, we consider a benchmark problem on a slab of cardiac tissue niederer2011, in order to compare SEM against SEM–NI and matrix–free against matrix–based solvers in terms of computational efficiency and mathematical accuracy. Then, we employ the Zygote left ventricle geometry and analyze the sole impact of increasing the local polynomial degree p on the numerical solution. Finally, for the sake of completeness, we show the capability of our matrix–free solver by presenting a detailed Zygote four–chamber heart electrophysiological simulation in sinus rhythm.

For the time discretization, we use the BDF2 scheme with a fixed time step. The final time is chosen differently for the Niederer benchmark niederer2011 and for the left ventricle and whole–heart geometries.

Regarding the linear algebra back–end, we use the PCG solver with a stopping criterion based on the absolute residual with a prescribed tolerance. In the two test cases involving the slab and the left ventricle, we employ the GMG (AMG, resp.) preconditioner for the matrix–free (matrix–based, resp.) solver. On the other hand, no preconditioner is introduced in the four–chamber heart numerical simulation, due to the presence of different ionic models in the computational domain, namely the CRN model (courtemanche1998) for the atria and the TTP06 one (tentusscher_2006) for the ventricles.
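In terms of implementation, such a PCG solve can be set up with deal.II's iterative solver classes as sketched below; the function name, the iteration bound and the tolerance value shown here are our illustrative assumptions and do not correspond to the actual lifex settings.

    #include <deal.II/lac/la_parallel_vector.h>
    #include <deal.II/lac/solver_cg.h>
    #include <deal.II/lac/solver_control.h>

    using namespace dealii;
    using VectorType = LinearAlgebra::distributed::Vector<double>;

    // Solve A V = rhs with the (preconditioned) Conjugate Gradient method.
    // A only needs to provide vmult(), so both an assembled sparse matrix and
    // the matrix-free operator of Section 4 can be passed here.
    template <typename OperatorType, typename PreconditionerType>
    void solve_monodomain_step(const OperatorType       &A,
                               VectorType               &V,
                               const VectorType         &rhs,
                               const PreconditionerType &preconditioner)
    {
      SolverControl        control(/*max_iterations=*/1000, /*absolute_tolerance=*/1e-10);
      SolverCG<VectorType> cg(control);
      cg.solve(A, V, rhs, preconditioner);
    }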

Fiber generation is performed by means of the Laplace–Dirichlet rule–based methods proposed in Piersanti_2021; Africa_2022, while in Table 1 we report the parameters of the monodomain equation.

Variable            | Value | Unit
Conductivity tensor |       |
Applied current     |       |
Table 1: Parameters of the electrophysiological model. For the CRN and TTP06 ionic models, we adopt the parameters reported in the original papers tentusscher_2006; courtemanche1998 for epicardial cells.

The external current is applied within a small cuboid for the Niederer benchmark (as described in niederer2011), and within different spheres for the ventricle and whole–heart test cases (their locations can be deduced from the numerical results shown in Figures 4, 10 and 12).

All the numerical simulations were performed by using one cluster node endowed with 56 cores (two Intel Xeon Gold 6238R, 2.20 GHz), which is available at MOX, Dipartimento di Matematica, Politecnico di Milano.

5.1 Slab of cardiac tissue

Figure 4: Niederer benchmark. The geometry and an example of mesh (left) and the associated simulated activation map (right). The blue color represents the region where the cubic stimulus is initially applied; the red one is associated with the diagonally opposite vertex, which is activated last.

The computational domain with an example of mesh (left) and the associated numerical simulation (right) for the Niederer benchmark niederer2011 is depicted in Figure 4. An external stimulus of cubic shape is applied at one vertex, the electric signal propagates through the slab, and the diagonally opposite vertex is activated as the last point. The domain is discretized by means of a structured, uniform hexahedral mesh.

We present a systematic comparison between SEM and SEM–NI for several values of both the mesh size and the local polynomial degree , in order to understand which is the best formulation in terms of accuracy and computational cost. Moreover, we compare the efficiency of the matrix–free and matrix–based solvers for SEM.

In Figures 5 and 6 we show the action potential and the calcium concentration computed over time with SEM and SEM–NI, respectively. More precisely, the minimum, average, maximum and pointwise values are plotted, where minima, averages and maxima are evaluated over the set of mesh nodes. We notice that convergence is faster for increasing p than for vanishing h.

Figure 5: Niederer benchmark. Minimum (top), average (second), maximum (third) and pointwise values (bottom) of the action potential and intracellular calcium concentration over time for a slab of cardiac tissue. The pointwise values are taken at a random point within the computational domain away from the initial stimulus. We consider different combinations of the local polynomial degree p and the average mesh size h for SEM.
Figure 6: Niederer benchmark. Minimum (top), average (second), maximum (third) and pointwise values (bottom) of the action potential and intracellular calcium concentration over time for a slab of cardiac tissue. The pointwise values are taken at a random point within the computational domain away from the initial stimulus. We consider different combinations of the local polynomial degree p and the average mesh size h for SEM–NI.

At each node of the mesh, we also compute the activation time as the time instant when the approximation of the transmembrane potential attains its maximum time derivative, i.e.

(8)

In the formula above, the maximization spans the discrete set of time steps and the time derivative is approximated via the same scheme used for the time discretization of problem (1).
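With our own symbols (u_h for the discrete transmembrane potential, x for a mesh node and t^n for the discrete times), definition (8) can be sketched as

    t_{\mathrm{act}}(\mathbf{x}) = \underset{t^{n}}{\arg\max} \; \frac{\partial u_h}{\partial t}(\mathbf{x}, t^{n}),

where the time derivative is replaced by its BDF2 approximation.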

In Figure 7 we show the activation times along the slab diagonal, for different choices of the local polynomial degree (from p = 1 to p = 4) and of the mesh refinement. As the error accumulates along the diagonal, the inset plots show a zoom around the right endpoint. These results demonstrate that high polynomial degrees, even combined with a coarse mesh size, lead to faster convergence than the low–degree, small–h scenario.

(a) SEM.
(b) SEM–NI.
Figure 7: Niederer benchmark. Activation times computed along the diagonal of the slab (see Figure 4), with different choices of the local polynomial degree (p = 1 to 4) and of the mesh refinement.

To better investigate the comparison between SEM and SEM–NI, in Figure 8 we show the quantities

(9)
(10)
(11)
(12)

versus the total number of mesh points, for both the SEM and the SEM–NI solutions. Our reference solution has been computed with FEM on a very fine grid with average mesh size h = 0.06 mm, for a total of 11'401'089 mesh points. The pointwise values are taken at a random point within the computational domain away from the initial stimulus, and minima, averages and maxima are evaluated over the set of mesh nodes. The number of mesh points increases by reducing h in one family of curves (at fixed p) and by increasing p in the other (at fixed h), for both SEM and SEM–NI.

The numerical results confirm the typical behaviour of SEM and SEM–NI discretizations, i.e. the errors decrease faster by increasing p than by decreasing h. Moreover, we notice that SEM and SEM–NI errors behave quite similarly, with a slight advantage for SEM.

Figure 8: Niederer benchmark. Errors for action potential and intracellular calcium concentration versus the number of mesh points (dofs) used in a slab of cardiac tissue.

In Tables 2–5 we report the CPU time required by the linear solver for the whole numerical simulation (the times are cumulative over all time steps), for SEM and SEM–NI discretizations and for matrix–free and matrix–based solvers. Furthermore, in Figure 9 we plot the errors (9)–(12) versus the CPU time required to solve all the linear systems along the whole numerical simulation. For SEM–NI we only report the times relative to the matrix–free solver, while for SEM we report the times for both the matrix–free and matrix–based solvers. The same symbol (circle, square, diamond, and cross) refers to the numerical simulations carried out on meshes with the same number of DOFs. If we compare the errors and the CPU times of SEM with p = 4 and h = 0.48 mm with those of SEM with p = 1 and h = 0.12 mm (these two configurations share the same number of mesh nodes, 1'449'665), we notice that the errors of the former are at most about 1/3 – 1/2 of those of the latter and the ratio of the corresponding CPU times is about 40%. Thus, we conclude that SEM with p = 4 outperforms SEM with p = 1 (that is, FEM).

Mesh points | Cells  | Local space | Linear solver SEM [s] | Linear solver SEM–NI [s] | Assemble rhs SEM [s] | Assemble rhs SEM–NI [s]
25'025      | 21'888 | p = 1       | 10.769  | 8.031   | 0.784  | 0.766
187'425     | 21'888 | p = 2       | 14.694  | 12.383  | 4.710  | 4.645
618'529     | 21'888 | p = 3       | 37.733  | 36.419  | 14.867 | 14.542
1'449'665   | 21'888 | p = 4       | 91.380  | 90.370  | 33.899 | 32.920
Table 2: Niederer benchmark. Computational times for SEM and SEM–NI, with a fixed average mesh size h = 0.48 mm and p ranging from 1 to 4. Matrix–free solver.
Mesh points | Cells      | h [mm] | Linear solver SEM [s] | Linear solver SEM–NI [s] | Assemble rhs SEM [s] | Assemble rhs SEM–NI [s]
25'025      | 21'888     | 0.48   | 10.769   | 8.031    | 0.784   | 0.766
187'425     | 175'104    | 0.24   | 29.157   | 27.951   | 5.771   | 5.656
1'449'665   | 1'400'832  | 0.12   | 256.295  | 270.959  | 42.783  | 43.548
11'401'089  | 11'206'656 | 0.06   | 2137.329 | 2158.751 | 336.272 | 336.641
Table 3: Niederer benchmark. Computational times for SEM and SEM–NI, with an average mesh size h ranging from 0.48 mm to 0.06 mm and p = 1. Matrix–free solver.
Mesh points | Cells  | Local space | Linear solver matrix–free [s] | Assembly phase matrix–free [s] | Linear solver matrix–based [s] | Assembly phase matrix–based [s]
25'025      | 21'888 | p = 1       | 10.769 | 0.784  | 5.570    | 17.150
187'425     | 21'888 | p = 2       | 14.694 | 4.710  | 47.655   | 176.375
618'529     | 21'888 | p = 3       | 37.733 | 14.867 | 556.041  | 919.298
1'449'665   | 21'888 | p = 4       | 91.380 | 33.899 | 2796.981 | 3598.244
Table 4: Niederer benchmark. Computational times for matrix–free and matrix–based solvers, SEM with a fixed average mesh size h = 0.48 mm and p ranging from 1 to 4.
Mesh points | Cells      | h [mm] | Linear solver matrix–free [s] | Assembly phase matrix–free [s] | Linear solver matrix–based [s] | Assembly phase matrix–based [s]
25'025      | 21'888     | 0.48   | 10.769   | 0.784   | 5.570    | 17.150
187'425     | 175'104    | 0.24   | 29.157   | 5.771   | 11.331   | 141.349
1'449'665   | 1'400'832  | 0.12   | 256.295  | 42.783  | 138.514  | 1089.666
11'401'089  | 11'206'656 | 0.06   | 2137.329 | 336.272 | 1328.839 | 8857.580
Table 5: Niederer benchmark. Computational times for matrix–free and matrix–based solvers, SEM with an average mesh size h ranging from 0.48 mm to 0.06 mm (p = 1).
Figure 9: Niederer benchmark. Errors of action potential and intracellular calcium concentration versus CPU time required to solve all the linear systems in a slab of cardiac tissue. Equal symbols identify the same number of mesh points: 25’025, 187’425, 618’529, 1’449’665.

For the comparison between matrix–free and matrix–based solvers, we notice that the former is always faster, and the gain of the matrix–free over the matrix–based solver increases with the polynomial degree p. More precisely, the speed–up factors are shown in Table 6 when the mesh size is fixed and p varies, and in Table 7 when p = 1 and the mesh size varies.

Local space    p = 1    p = 2    p = 3    p = 4
Table 6: Niederer benchmark. Speed–up of the matrix–free solver over the matrix–based one when the average mesh size is fixed (h = 0.48 mm) and p ranges from 1 to 4.
h [mm]    0.48    0.24    0.12    0.06
Table 7: Niederer benchmark. Speed–up of the matrix–free solver over the matrix–based one when p = 1 and h ranges from 0.48 mm to 0.06 mm.

Moreover, from Table 8 we observe that, in a matrix–based electrophysiological simulation, most of the computational time is spent solving the linear system associated with the monodomain equation. By contrast, in the matrix–free solver most of the computational time is devoted to the ionic model. This means that the cost of solving the linear system has been highly optimized.

Solver            | Monodomain solver | Monodomain assembly | Ionic model solver
Matrix–based (, ) | 16.61 %           | 56.98 %             | 26.41 %
Matrix–free (, )  | 14.95 %           | 3.36 %              | 81.69 %
Table 8: Niederer benchmark. Matrix–free and matrix–based percentages for the assembling and solving phases. Note that in the matrix–free case we only need to assemble the right–hand side vector, as there is no matrix. Moreover, these percentages are computed without taking into account all other phases of the numerical simulation, such as mesh allocation, fiber generation and output of the results.

Finally, we compare the performance of the AMG and GMG preconditioners, used by the matrix–based and matrix–free solvers, respectively. In Tables 9 and 10 we show the average number of iterations required by the PCG method to solve the linear system (6). We notice that, for the values of h and p considered here, the GMG preconditioner appears optimal in the number of PCG iterations versus both h and p. As a matter of fact, the average number of iterations is about 1.8 for all configurations. On the contrary, the AMG preconditioner is optimal versus the mesh size h (even if it requires a slightly larger number of iterations than GMG), while it is not with respect to the polynomial degree p. Indeed, in this case the number of iterations grows roughly in proportion to the polynomial degree.

Mesh points | Cells  | Local space | Matrix–free (SEM), GMG preconditioner | Matrix–free (SEM–NI), GMG preconditioner | Matrix–based (SEM), AMG preconditioner
25'025      | 21'888 | p = 1       | 1.6362 | 1.8126 | 2.5777
187'425     | 21'888 | p = 2       | 1.8056 | 1.8581 | 5.9255
618'529     | 21'888 | p = 3       | 1.7906 | 1.8161 | 8.3363
1'449'665   | 21'888 | p = 4       | 1.7371 | 1.7826 | 10.5187
Table 9: Niederer benchmark. Average number of CG iterations for matrix–free (SEM, SEM–NI) and matrix–based (SEM) solvers, with a fixed average mesh size h = 0.48 mm and p ranging from 1 to 4.
Mesh points | Cells      | h [mm] | Matrix–free (SEM), GMG preconditioner | Matrix–free (SEM–NI), GMG preconditioner | Matrix–based (SEM), AMG preconditioner
25'025      | 21'888     | 0.48   | 1.6362 | 1.8126 | 2.5777
187'425     | 175'104    | 0.24   | 1.5757 | 1.6847 | 3.0535
1'449'665   | 1'400'832  | 0.12   | 1.4448 | 1.5937 | 2.9400
11'401'089  | 11'206'656 | 0.06   | 1.4468 | 1.4923 | 3.2094
Table 10: Niederer benchmark. Average number of CG iterations for matrix–free (SEM, SEM–NI) and matrix–based (SEM) solvers, with an average mesh size h ranging from 0.48 mm to 0.06 mm (p = 1).

All the numerical results shown in this section highlight how advantageous the matrix–free solver with SEM or SEM–NI is for cardiac electrophysiology simulations with respect to the matrix–based solver with low–order FEM.

Since the matrix–free implementation outperforms the matrix–based one, while SEM and SEM–NI provide comparable results in terms of accuracy and efficiency, we will employ the matrix–free solver with just the SEM formulation for the numerical simulations that we are going to present in the next sections.

5.2 Left ventricle

We report the results for the electrophysiological simulations performed with the Zygote left ventricle geometry. The settings of this test case are summarized at the beginning of Sect. 5. We consider a mesh with a fixed average mesh size and polynomial degree p ranging from 1 to 4.

In Table 11 we report the number of mesh nodes, the number of cells and the average number of iterations required by the PCG method to solve the linear system (6). As for the Niederer benchmark, the GMG preconditioner turns out to be optimal also for these numerical simulations. Indeed, the number of PCG iterations is about 2 along the whole time history for any polynomial degree between 1 and 4.

Mesh points | Cells   | Local space | PCG iterations, GMG preconditioner
159'149     | 139'684 | p = 1       | 2.0770
1'172'919   | 139'684 | p = 2       | 1.9628
3'879'415   | 139'684 | p = 3       | 1.9455
9'116'741   | 139'684 | p = 4       | 1.9440
Table 11: Zygote left ventricle. Average number of PCG iterations for the matrix–free solver with SEM discretization, with a fixed average mesh size and p ranging from 1 to 4.

In Figure 10 we depict the activation maps for different choices of the local polynomial degree (from p = 1 to p = 4). By looking at the isolines, we observe that the solutions obtained with the two highest polynomial degrees are nearly indistinguishable, which means that convergence is reached well before p = 4, even with such a relatively low mesh resolution. By contrast, it is a well–established result in the literature that first–order Finite Elements would reach convergence only for a much smaller mesh size, i.e. for a much higher number of DOFs (see, e.g., Woodworth_2021). Here, DOFs refers to the degrees of freedom of the transmembrane potential only, disregarding gating variables and ionic species; it therefore coincides with the number of mesh nodes.

The same conclusions hold when considering Figure 11, where we show the minimum, average, and maximum pointwise values of both the action potential and the intracellular calcium concentration over time.


Figure 10: Zygote left ventricle. Three views of the activation maps computed with polynomial degrees p = 1 to 4 and a fixed average mesh size.
Figure 11: Zygote left ventricle. Minimum (top), average (mid), maximum (bottom) pointwise values of the action potential and intracellular calcium concentration over time, with a fixed average mesh size and p ranging from 1 to 4.

5.3 Whole–heart

The aim of this section is to show that our matrix–free solver can be successfully applied even in a complex framework. For this purpose we perform a numerical simulation in sinus rhythm with the Zygote four–chamber heart. The settings of this test case can be found at the beginning of Sect. 5. We consider different ionic models, namely CRN and TTP06 for the atria and the ventricles, respectively. Furthermore, we model the valvular rings as non–conductive regions of the myocardium. The mesh comprises 355'664 cells and 10'355'058 nodes. We employ the matrix–free solver with SEM; this choice is motivated by the numerical results obtained for the Niederer benchmark (Sect. 5.1) and by the convergence test performed on the Zygote left ventricle geometry (Sect. 5.2).

We depict in Figure 12 the evolution of the transmembrane potential over time on the whole–heart geometry. The electric signal originates at the sinoatrial node in the right atrium and then propagates to the left atrium and to the ventricles through preferential conduction lines, such as Bachmann's bundle and the bundle of His Harrington_2011. The wavefront propagation appears very smooth, with small dissipation and dispersion throughout the heartbeat Bucelli_2021.

Figure 12: Zygote whole–heart. Time evolution of the transmembrane potential during one heartbeat.

6 Conclusions

We developed a matrix–free solver for cardiac electrophysiology tailored to the efficient use of high–order numerical methods. We employed the monodomain equation to model the propagation of the transmembrane potential and physiologically–based ionic models (CRN and TTP06) to describe the behaviour of the different chemical species at the cellular level.

We ran several electrophysiological simulations for three different test cases, namely a slab of cardiac tissue, the Zygote left ventricle and the Zygote whole–heart, to demonstrate the effectiveness and generality of our matrix–free solver in combination with the Spectral Element Method. SEM and SEM–NI provided comparable numerical results in terms of both accuracy and efficiency. Furthermore, we showed the importance of considering high–order discretizations for this class of mathematical problems, i.e. when sharp wavefronts are involved.

Our matrix–free solver outperforms state–of–the–art matrix–based solvers in terms of computational costs and memory requirements; thanks to both vectorization and sum–factorization, this holds even when the linear systems are solved without any preconditioner. Finally, the small memory footprint of the matrix–free implementation may allow for the development of GPU–based solvers of the cardiac function.

Acknowledgments

This research has been funded partly by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 740132, iHEART “An Integrated Heart Model for the simulation of the cardiac function”, P.I. Prof. A. Quarteroni) and partly by the Italian Ministry of University and Research (MIUR) within the PRIN (Research projects of relevant national interest 2017 “Modeling the heart across the scales: from cardiac cells to the whole organ” Grant Registration number 2017AXL54F).

Authors’ contribution

All authors contributed to the study conception and design of the work. PCA and MS developed the computer code, performed the numerical simulations and post-processed the numerical results. PG conceptualized the goals and the theoretical background of the research and took the lead in designing the experiments and interpreting the results. LD contributed to the analysis of results and coordinated the research activity planning. The initial draft of this manuscript was written by PCA, MS and PG. All authors read and approved this manuscript.

References