Towards Accelerating High-Order Stencils on Modern GPUs and Emerging Architectures with a Portable Framework

09/09/2023
by   Ryuichi Sai, et al.

PDE discretization schemes that yield stencil-like computing patterns are commonly used for seismic modeling, weather forecasting, and other scientific applications. Achieving HPC-level performance for stencil computations on a single architecture is challenging, and porting to other architectures without sacrificing performance requires significant effort, especially in this golden age of many distinctive architectures. To help developers achieve performance, portability, and productivity with stencil computations, we developed StencilPy. With StencilPy, developers write stencil computations in a high-level domain-specific language, which promotes productivity, while its backends generate efficient code for existing and emerging architectures, including NVIDIA, AMD, and Intel GPUs, A64FX, and STX. StencilPy delivers promising performance on par with hand-written code, maintains performance portability across architectures, and enhances productivity. Its modular design enables easy configuration, customization, and extension. A 25-point star-shaped stencil written in StencilPy is one-quarter the length of hand-crafted CUDA code and achieves similar performance on an NVIDIA H100 GPU.
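To make the 25-point star-shaped stencil concrete, here is a minimal reference sketch in plain NumPy. It is not StencilPy code and does not use StencilPy's API, which the abstract does not show. It assumes the usual reading of a 25-point star: a 3D stencil of radius 4 along each axis (1 center point plus 8 neighbors on each of 3 axes), weighted with the standard 8th-order central-difference coefficients for a second derivative, which gives a discrete Laplacian. The function name `star25_laplacian` and the choice to leave the boundary at zero are illustrative assumptions.

```python
import numpy as np

# Standard 8th-order central-difference weights for a second derivative.
# COEFFS[0] is the center weight; COEFFS[k] applies to both neighbors at distance k.
COEFFS = np.array([-205.0 / 72.0, 8.0 / 5.0, -1.0 / 5.0, 8.0 / 315.0, -1.0 / 560.0])
RADIUS = 4  # 1 center + 2 * 4 neighbors on each of 3 axes = 25 points


def star25_laplacian(u, h=1.0):
    """Apply a 25-point star stencil (8th-order Laplacian) to a 3D array.

    Hypothetical helper for illustration. Points closer than RADIUS to any
    face are left at zero because the stencil does not fit there.
    """
    u = np.asarray(u, dtype=float)
    r = RADIUS
    nz, ny, nx = u.shape
    if min(nz, ny, nx) < 2 * r + 1:
        raise ValueError("each dimension must be at least 2 * RADIUS + 1")

    # The center point contributes once per axis, hence the factor of 3.
    acc = 3.0 * COEFFS[0] * u[r:nz - r, r:ny - r, r:nx - r]
    for k in range(1, r + 1):
        acc += COEFFS[k] * (
            u[r - k:nz - r - k, r:ny - r, r:nx - r]    # z - k
            + u[r + k:nz - r + k, r:ny - r, r:nx - r]  # z + k
            + u[r:nz - r, r - k:ny - r - k, r:nx - r]  # y - k
            + u[r:nz - r, r + k:ny - r + k, r:nx - r]  # y + k
            + u[r:nz - r, r:ny - r, r - k:nx - r - k]  # x - k
            + u[r:nz - r, r:ny - r, r + k:nx - r + k]  # x + k
        )

    out = np.zeros_like(u)
    out[r:nz - r, r:ny - r, r:nx - r] = acc / h**2
    return out
```

A hand-written GPU kernel for the same stencil would typically add tiling, shared-memory or register reuse, and boundary handling on top of this arithmetic, which is the kind of detail a code-generating backend is meant to take over.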


