Portable high-order finite element kernels I: Streaming Operations

09/23/2020
by Noel Chalmers, et al.
This paper is devoted to the development of highly efficient kernels for vector operations relevant to linear system solvers. In particular, we focus on the low-arithmetic-intensity operations (i.e., streaming operations) performed within the conjugate gradient iterative method, using the parameters specified in the CEED benchmark problems for high-order hexahedral finite elements. We propose a suite of new Benchmark Streaming tests, each targeting one of the distinct streaming operations that must be performed. We implement these tests using the OCCA abstraction framework to demonstrate the portability of these streaming operations across different GPU architectures, and we propose a simple performance model for such kernels that accurately captures data movement rates as well as kernel launch costs.
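As a rough illustration of what a model combining data movement rates and kernel launch costs might look like, the sketch below assumes the simplest possible form: a fixed launch cost plus the bytes moved divided by a sustained bandwidth. This form, the function names, the per-operation byte counts, and the numerical values of `t0` and `w` are all assumptions made for illustration; they are not taken from the paper.

```python
def bytes_moved(op, n, word=8):
    """Bytes read plus written by a length-n vector operation.

    The counts below are assumed for illustration: each vector touched
    costs n * word bytes, and scalar results are neglected.
    """
    vectors_touched = {
        "copy": 2,  # read x, write y
        "axpy": 3,  # read x and y, write y
        "dot": 2,   # read x and y
        "norm": 1,  # read x
    }
    return vectors_touched[op] * n * word


def predicted_time(nbytes, t0, w):
    """Assumed model: fixed launch cost t0 (s) plus nbytes streamed at w (bytes/s)."""
    return t0 + nbytes / w


def effective_bandwidth(nbytes, t0, w):
    """Bytes per second actually achieved once the launch cost is included."""
    return nbytes / predicted_time(nbytes, t0, w)


if __name__ == "__main__":
    t0 = 5e-6  # hypothetical launch cost in seconds
    w = 1e12   # hypothetical sustained bandwidth in bytes per second
    for n in (10**4, 10**6, 10**8):
        b = bytes_moved("axpy", n)
        gbs = effective_bandwidth(b, t0, w) / 1e9
        print(f"n = {n:>9}: {gbs:8.1f} GB/s effective")
```

Under these assumptions, small vectors are dominated by the launch cost and reach only a fraction of the peak rate, while large vectors approach `w`. That trade-off is the behavior such a model is meant to expose.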


