Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations

by Martin Kronbichler, et al.

This work investigates a variant of the conjugate gradient (CG) method and embeds it into the context of high-order finite-element schemes with fast matrix-free operator evaluation and cheap preconditioners like the matrix diagonal. Relying on a data-dependency analysis and an appropriate enumeration of degrees of freedom, we interleave the vector updates and inner products in a CG iteration with the matrix-vector product, with only minor organizational overhead. As a result, around 90% of the vector entries of the three active vectors of the CG method are transferred from slow main memory (RAM) exactly once per iteration, with all additional accesses hitting fast cache memory. Node-level performance analyses and scaling studies on up to 147k cores show that, for large problem sizes that exceed the processor caches, the CG method with the proposed performance optimizations is around two times faster than a standard CG solver as well as optimized pipelined CG and s-step CG methods, and that it provides similar performance near the strong scaling limit.
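For orientation, the baseline that the abstract refers to can be sketched as a standard CG iteration with a matrix-free operator and a diagonal (Jacobi) preconditioner. The sketch below is not the authors' implementation. The 1D Laplacian stencil, the function names `apply_laplacian_1d` and `jacobi_cg`, and the tolerance are illustrative assumptions. It fuses the vector updates and inner products into a single pass over the data, which is only a simplified stand-in for the data-locality idea; the paper goes further and interleaves these operations with the operator evaluation itself.

```python
import math


def apply_laplacian_1d(p):
    """Matrix-free action of tridiag(-1, 2, -1); no matrix is stored.

    Hypothetical stand-in for a high-order finite-element operator.
    """
    n = len(p)
    out = [0.0] * n
    for i in range(n):
        left = p[i - 1] if i > 0 else 0.0
        right = p[i + 1] if i < n - 1 else 0.0
        out[i] = 2.0 * p[i] - left - right
    return out


def jacobi_cg(apply_op, diag, b, tol=1e-10, max_iter=1000):
    """Preconditioned CG with the matrix diagonal as preconditioner.

    Returns the approximate solution and the number of iterations used.
    """
    n = len(b)
    b_norm = math.sqrt(sum(v * v for v in b))
    x = [0.0] * n
    if b_norm == 0.0:
        return x, 0

    r = list(b)                                # residual for x = 0
    z = [r[i] / diag[i] for i in range(n)]     # preconditioned residual
    p = list(z)                                # search direction
    rz = sum(r[i] * z[i] for i in range(n))

    for it in range(1, max_iter + 1):
        v = apply_op(p)                        # matrix-free operator evaluation
        alpha = rz / sum(p[i] * v[i] for i in range(n))

        # Fused pass: update x and r, apply the diagonal preconditioner,
        # and accumulate both inner products while each entry is "hot".
        rz_new = 0.0
        rr = 0.0
        for i in range(n):
            x[i] += alpha * p[i]
            ri = r[i] - alpha * v[i]
            zi = ri / diag[i]
            r[i] = ri
            z[i] = zi
            rz_new += ri * zi
            rr += ri * ri

        if math.sqrt(rr) <= tol * b_norm:
            return x, it

        beta = rz_new / rz
        for i in range(n):
            p[i] = z[i] + beta * p[i]
        rz = rz_new

    return x, max_iter
```

A call such as `jacobi_cg(apply_laplacian_1d, [2.0] * n, b)` solves the model problem for a right-hand side `b` of length `n`. In this plain form, each iteration still streams the vectors through memory several times (operator, inner product, fused update, direction update), which is the traffic the proposed variant is designed to reduce.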




