Direct Solution of FEM Models: Are Sparse Direct Solvers the Best Strategy?

by   Javad Moshfegh, et al.

A brief summary of direct solution approaches for finite element methods (FEM) in computational electromagnetics (CEM) is given along with an alternative direct solution based on domain decomposition (DD). Unlike recent trends in approximate/low-rank solvers, this work focuses on `numerically exact' solution methods as they are more reliable for complex `real-life' models. Preliminary studies on general three dimensional geometries with unstructured FEM meshes suggest that the proposed direct DD methodology offers significant memory advantages over highly optimized, high-performance sparse direct solver libraries, while maintaining approximately comparable or slightly slower serial serial execution speed but with significantly better parallel and GPU processing prospects.



There are no comments yet.


page 2


Direct Domain Decomposition Method (D3M) for Finite Element Electromagnetic Computations

An exact arithmetic, memory efficient direct solution method for finite ...

A distributed-memory hierarchical solver for general sparse linear systems

We present a parallel hierarchical solver for general sparse linear syst...

Parallel Direct Domain Decomposition Methods (D3M) for Finite Elements

A parallel direct solution approach based on domain decomposition method...

Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

We discuss an approach for solving sparse or dense banded linear systems...

State-of-The-Art Sparse Direct Solvers

In this chapter we will give an insight into modern sparse elimination m...

Approximating inverse FEM matrices on non-uniform meshes with H-matrice

We consider the approximation of the inverse of the finite element stiff...

High-performance sampling of generic Determinantal Point Processes

Determinantal Point Processes (DPPs) were introduced by Macchi as a mode...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

I Introduction

Finite Element Method (FEM) computational electromagnetic models result in very large but sparse linear systems that are often solved iteratively in lieu of direct factorization methods, [direct-methods-for-sparse], that use significantly more memory and have higher asymptotic computational complexity. Yet, when these models involve many independent excitations, or complex materials, or multi-scale or near resonance structures, even the most advanced preconditioned iterative methods can lose effectiveness and efficiency. This problematic behavior, coupled with the recent proliferation of computing cores and RAM memory has lead to a renewed interest in sparse matrix direct solution methods in certain FEM computations in CEM. For all other situations, sparse direct solvers remain indispensable, yet as a ‘hidden’ component of preconditioned iterative methods e.g. corse-grid correction in multi-grid solvers.

Efficient implementations of classic sparse Cholesky or LDL factorizations for symmetric systems, or sparse LU methods, make use of various data and temporal locality improving computational schemes such as the multi-frontal [MUMPS1] or the left/right looking super-nodal [Schenk2004] algorithms to reduce cache misses and significantly improve speed. When coupled with advanced fill-in reducing matrix reordering schemes such as the multilevel nested dissection [METIS] they can result in powerful solvers with impressive performance.

This paper will explore weather the conventional approach of using highly optimized high-performance sparse direct solver libraries is indeed the best available option for the direct solution of FEM problems?

Ii Approach

In our attempt to answer this question, an alternative, domain decomposition (DD) based, direct solution strategy (with numerical exact arithmetic

) for FEM models will be outlined and compared to state-of-the-art sparse matrix direct solvers. Unlike conventional approaches that are agnostic of the matrix physical model origins i.e. ‘black-box’ solvers, this direct DD strategy relies on a judicious re-formulation of the boundary value problem (BVP) as a decomposed BVP on a collection of domains. It is noted that the method is not restricted to Maxwell equations or electromagnetics, but is simply presented in the CEM context. The decomposed BVP and its subsequent mixed/hybrid variational formulation must be cast such that all domains are always regular while maintaining the self-adjoint nature of the overall problem. This will prove critical in achieving an efficient direct solution. Subsequent discretization with suitably chosen FEM basis in terms of primal degrees-of-freedom (DoFs) in domain volumes and dual DoF on domain interfaces leads to a large sparse arrowhead matrix that is then reduced in size by eliminating via many independent sparse direct factorizations the primal DoFs. This process is akin to the FETI reduction in iterative DDs, but with key appropriate modifications that will later facilitate a very efficient direct factorization. Formulation of the proposed approach can be found in

[moshfegh2016direct, moshfegh2017memory].

The resultant auxiliary matrix is symmetric but orders of magnitude smaller than the original one, and is block-wise sparse (sparse collection of dense blocks), something computational very desirable in direct matrix factorizations because the high degree of data locality enables the use the blazing fast level 3 BLAS [DONGARRA1990]. As such, this matrix is LDL factorized via a very fast symbolic factorization step that is attributed clique graph of the block-wise sparse matrix, followed by a custom block LDL with restricted Bunch-Kaufman pivoting [BUNCH1977]. Once the the auxiliary problem has been efficiently factorized, the solution of all or a subset of primal unknowns are recovered in an embarrassingly parallel manner via many smaller size sparse forward backward substitutions.

A conceptual overview of the proposed approach and its conventional counterpart are depicted in Fig. 1. Although the proposed approach involves more computational steps, much like field computations with auxiliary potentials in electromagnetics, solving the auxiliary problem is considerably easier than the conventional one-shot approach, but also attaining the auxiliary problem and recovering the full solution from it are relatively fast and embarrassingly parallel. In a sense out approach instead of striving to minimize fill-in to achieve better memory and speed, it attempts to maximize the concurrency in sparse matrix computations, as well as the total amount and order of subsequent dense matrix operations, thus promising very favorable parallel and GPU processing prospects.

Fig. 1: Conceptual overview and comparison of conventional direct solutions of FEM models vs. the proposed direct solution approach.

Iii Comparisons

Two low to moderate complexity unstructured FEM models are used as test cases to compare the proposed method with MUMPS 5.0.2 [MUMPS1], and PARDISO-MKL11.1 [MKL_PARDISO] state-of-the-art ‘black-box’ direct sparse matrix solvers with MeTiS 5.1.0 reordering for all methods. The first model involves a dielectric sphere of progressively larger size, (domain increases in three dimensions), and a monopole antenna array problem again of progressively increasing size but also port number (excitations). The memory, factorization time and forward/backward substitution times of those test cases are presented in Figs 2 and 3. In both cases, the purposed direct DD method shows a clear edge in memory requirements, which is a major improvement on a key direct method weakness . Both factorization and forward backward substitution are on par for all methods with the proposed method being slightly slower.

Fig. 2: Factorization memory and time comparison between the proposed direct DD approach and state-of-the-art direct solvers on progressively electrically larger dielectric spheres. (a) Memory; (b) CPU time (serial run); (c) forward/backward substitution time for 200 excitations (serial run).
Fig. 3: Factorization memory and time comparison between the proposed direct DD approach and state-of-the-art direct solvers on progressively larger finite monopole antenna arrays. (a) Memory; (b) factorization time (serial run); (c) forward/backward substitution time (serial run).

Iv Conclusions

For the moderate size FEM problems examined here it was found that when memory is at premium, ‘black-box’ direct solvers are not the best direct solution approach, the proposed direct DD method showed clear benefits as its memory appears to be comparable to preconditioned iterative solvers. When speed is important the highly optimized ‘back-box’ libraries maintain a slight edge over the proposed direct DD and possibly iterative solvers. It is believe that direct DD solvers will have very favorable parallel and GPU processing prospects.