Applying the swept rule for explicit partial differential equation solutions on heterogeneous computing systems

11/14/2018
by Daniel J Magee, et al.

Applications that exploit the architectural details of high performance computing (HPC) systems have become increasingly valuable in academia and industry over the past two decades. The most important hardware development of the last decade in HPC has been the General Purpose Graphics Processing Unit (GPGPU), a class of massively parallel devices that now contributes the majority of computational power in the top 500 supercomputers. As these systems grow, small costs such as latency (the fixed cost of memory accesses) accumulate over the numerous iterations in a large simulation and become a significant barrier to performance. The swept time-space decomposition rule is a communication-avoiding technique for time-stepping stencil update formulas that attempts to sidestep a portion of the latency costs. This work extends the swept rule by targeting heterogeneous CPU/GPU architectures representative of current and future HPC systems. We compare our approach to a naive decomposition scheme with two test equations using an MPI+CUDA pattern on 40 processes over two nodes containing one GPU. We show that the swept rule produces a 4–18x speedup with the heat equation and a 1.5–3x speedup with the Euler equations using the same processors and work distribution. These results demonstrate the potential effectiveness of the swept rule for different equations and numerical schemes on massively parallel compute systems that incur substantial latency costs.
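To make the communication-avoiding idea concrete, here is a minimal one-dimensional sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' MPI+CUDA implementation. It shows only the first stage of a swept-style decomposition: a block of grid points advances several explicit heat-equation steps without any neighbor exchange, and its valid region shrinks by one point per side per step. The function names (`heat_step`, `global_steps`, `triangle_steps`), the FTCS scheme, the periodic reference domain, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def heat_step(u, r):
    """One explicit (FTCS) heat-equation step on the interior of u.

    The result is two points shorter than u, because the two edge
    points have no neighbor data available to update them.
    """
    return u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])

def global_steps(u, r, k):
    """Reference solution: k steps on the whole periodic domain."""
    for _ in range(k):
        u = u + r * (np.roll(u, -1) - 2.0 * u + np.roll(u, 1))
    return u

def triangle_steps(block, r, k):
    """Advance one block k steps with no communication.

    Each step trims one point from each side, so the surviving values
    form a triangle in space-time and remain exact.
    """
    for _ in range(k):
        block = heat_step(block, r)
    return block

# Hypothetical problem setup: 64 periodic grid points, 8 time steps.
n, k, r = 64, 8, 0.25          # r = alpha * dt / dx**2, stable for r <= 0.5
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
u0 = np.sin(x)

ref = global_steps(u0, r, k)

# One process owns points [start, start + width) and never exchanges halos.
start, width = 16, 32
tip = triangle_steps(u0[start:start + width], r, k)

# The surviving points map to global indices [start + k, start + width - k).
max_err = np.max(np.abs(tip - ref[start + k:start + width - k]))
print(tip.shape, max_err)
```

In a full swept scheme the points trimmed at the block edges would be filled in by later stages that use data passed between neighbors once per several time steps rather than once per step. This sketch omits that part and only verifies that the communication-free region matches the reference solution.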

research
11/14/2018

Applying the swept rule for solving explicit partial differential equations on heterogeneous computing systems

Applications that exploit the architectural details of high-performance ...
research
04/01/2021

The Two-Dimensional Swept Rule Applied on Heterogeneous Architectures

The partial differential equations describing compressible fluid flows c...
research
03/20/2018

Crossing the Architectural Barrier: Evaluating Representative Regions of Parallel HPC Applications

Exascale computing will get mankind closer to solving important social, ...
research
12/08/2016

An initial investigation of the performance of GPU-based swept time-space decomposition

Simulations of physical phenomena are essential to the expedient design ...
research
06/30/2020

Hierarchical Jacobi Iteration for Structured Matrices on GPUs using Shared Memory

High fidelity scientific simulations modeling physical phenomena typical...
research
05/09/2017

Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time-space decomposition

The expedient design of precision components in aerospace and other high...
research
01/17/2019

High performance scheduling of mixed-mode DAGs on heterogeneous multicores

Many HPC applications can be expressed as mixed-mode computations, in wh...
