Fast matrix-free evaluation of discontinuous Galerkin finite element operators

by Martin Kronbichler, et al.

We present an algorithmic framework for matrix-free evaluation of discontinuous Galerkin finite element operators based on sum factorization on quadrilateral and hexahedral meshes. We identify a set of kernels for fast quadrature on cells and faces targeting a wide class of weak forms originating from linear and nonlinear partial differential equations. Different algorithms and data structures for the implementation of operator evaluation are compared in an in-depth performance analysis. The sum factorization kernels are optimized by vectorization over several cells and faces and an even-odd decomposition of the one-dimensional compute kernels. In isolation, our implementation reaches up to 60% of arithmetic peak on Intel Haswell and Broadwell processors and up to 50% of arithmetic peak on Intel Knights Landing. The full operator evaluation reaches only about half that throughput due to memory bandwidth limitations from loading the input and output vectors, MPI ghost exchange, as well as handling variable coefficients and the geometry. Our performance analysis shows that the results are often within 10% of the available memory bandwidth for the proposed implementation, with the exception of the Cartesian mesh case, where the costs of gather operations and MPI communication are more substantial.
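The two kernel ideas named in the abstract, sum factorization and the even-odd decomposition of the one-dimensional kernels, can be illustrated with a small sketch. This is not the authors' implementation, which the abstract describes as vectorized over several cells and faces. It is a plain-Python illustration under stated assumptions: a 2D cell, equispaced Lagrange nodes, and symmetric midpoint evaluation points standing in for a real quadrature rule. All function names are hypothetical. The sketch evaluates v[q1][q0] = sum over i1, i0 of S[q1][i1] S[q0][i0] u[i1][i0] once naively and once as two one-dimensional sweeps, and then replaces the 1D sweep by a variant that exploits the symmetry S[q][i] = S[nq-1-q][n-1-i].

```python
def lagrange_matrix(n, nq):
    """S[q][i]: i-th Lagrange polynomial at point q (assumed symmetric nodes/points)."""
    nodes = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]       # equispaced (assumption)
    points = [-1.0 + (2.0 * q + 1.0) / nq for q in range(nq)]  # midpoints (assumption)
    S = [[1.0] * n for _ in range(nq)]
    for q, x in enumerate(points):
        for i in range(n):
            for j in range(n):
                if j != i:
                    S[q][i] *= (x - nodes[j]) / (nodes[i] - nodes[j])
    return S


def apply_1d(S, u):
    """Plain 1D kernel: v[q] = sum_i S[q][i] * u[i]."""
    return [sum(s * x for s, x in zip(row, u)) for row in S]


def evaluate_naive(S, u):
    """Full tensor-product sum, O(nq^2 * n^2) operations in 2D."""
    nq, n = len(S), len(S[0])
    return [[sum(S[q1][i1] * S[q0][i0] * u[i1][i0]
                 for i1 in range(n) for i0 in range(n))
             for q0 in range(nq)] for q1 in range(nq)]


def evaluate_sum_factorized(u, apply):
    """Two 1D sweeps; `apply` maps n coefficients to nq point values."""
    t = [apply(row) for row in u]                 # sweep in direction 0: t[i1][q0]
    cols = [apply(list(c)) for c in zip(*t)]      # sweep in direction 1: cols[q0][q1]
    return [list(r) for r in zip(*cols)]          # transpose back to v[q1][q0]


def even_odd_matrices(S):
    """Half-size matrices Se, So built once from a symmetric S."""
    nq, n = len(S), len(S[0])
    rows = (nq + 1) // 2
    Se = [[0.5 * (S[q][i] + S[q][n - 1 - i]) for i in range(n // 2)]
          for q in range(rows)]
    So = [[0.5 * (S[q][i] - S[q][n - 1 - i]) for i in range(n // 2)]
          for q in range(rows)]
    if n % 2 == 1:                                # middle column is purely even
        for q in range(rows):
            Se[q].append(S[q][n // 2])
    return Se, So


def apply_1d_even_odd(Se, So, nq, u):
    """1D kernel using even/odd parts of u; roughly half the multiplications."""
    n = len(u)
    e = [u[i] + u[n - 1 - i] for i in range(n // 2)]
    o = [u[i] - u[n - 1 - i] for i in range(n // 2)]
    if n % 2 == 1:
        e.append(u[n // 2])
    v = [0.0] * nq
    for q in range(len(Se)):
        ve = sum(a * b for a, b in zip(Se[q], e))
        vo = sum(a * b for a, b in zip(So[q], o))
        v[nq - 1 - q] = ve - vo                   # mirrored point
        v[q] = ve + vo                            # written last so the middle point is ve + vo
    return v


if __name__ == "__main__":
    n, nq = 4, 5
    S = lagrange_matrix(n, nq)
    Se, So = even_odd_matrices(S)
    u = [[0.1 * (3 * i1 + i0) ** 2 - 1.0 for i0 in range(n)] for i1 in range(n)]
    ref = evaluate_naive(S, u)
    fast = evaluate_sum_factorized(u, lambda x: apply_1d_even_odd(Se, So, nq, x))
    print(max(abs(a - b) for ra, rb in zip(ref, fast) for a, b in zip(ra, rb)))
```

In this sketch the sum-factorized path costs on the order of n·nq·(n + nq) multiplications per 2D cell instead of n²·nq², and the even-odd variant roughly halves the multiplications inside each 1D sweep at the price of a few extra additions. The same pattern extends to 3D with a third sweep.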




High-performance Implementation of Matrix-free High-order Discontinuous Galerkin Methods

Achieving a substantial part of peak performance on today's and future hi...

An efficient, matrix-free finite-element library for high-dimensional partial differential equations (02/19/2020)

This work presents the efficient, matrix-free finite-element library hyp...

Enclave Tasking for Discontinuous Galerkin Methods on Dynamically Adaptive Meshes

High-order Discontinuous Galerkin (DG) methods promise to be an excellen...

Algorithms and data structures for matrix-free finite element operators with MPI-parallel sparse multi-vectors

Traditional solution approaches for problems in quantum mechanics scale ...

Efficient Explicit Time Stepping of High Order Discontinuous Galerkin Schemes for Waves

This work presents algorithms for the efficient implementation of discon...

A Hermite-like basis for faster matrix-free evaluation of interior penalty discontinuous Galerkin operators

This work proposes a basis for improved throughput of matrix-free evalua...

Model-Based Performance Analysis of the HyTeG Finite Element Framework

In this work, we present how code generation techniques significantly im...
