An initial investigation of the performance of GPU-based swept time-space decomposition

12/08/2016
by Daniel Magee, et al.

Simulations of physical phenomena are essential to the expedient design of precision components in aerospace and other high-tech industries. These phenomena are often described by mathematical models involving partial differential equations (PDEs) without exact solutions. Modern design problems require simulations with a level of resolution that is difficult to achieve in a reasonable amount of time, even in effectively parallelized solvers. Though the scale of the problem relative to available computing power is the greatest impediment to accelerating these applications, significant performance gains can be achieved through careful attention to the details of memory accesses. Parallelized PDE solvers are subject to a trade-off in memory management: store the solution for each timestep in abundant global memory with high access costs, or in limited private memory with low access costs whose contents must be passed between nodes. The GPU implementation of swept time-space decomposition presented here mitigates this dilemma by using private (shared) memory, avoiding internode communication, and overwriting unnecessary values. It significantly improves the execution time of PDE solvers in one dimension, achieving speedups of 6x and 2x for large and small problem sizes, respectively, compared to naive GPU versions, and 7-300x compared to parallel CPU versions.
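The idea of advancing several timesteps from local data alone can be illustrated with a minimal Python sketch. It is not the authors' GPU implementation: the explicit three-point heat-equation update, the coefficient `r`, the periodic reference solver, and the function names `step`, `local_triangle`, and `global_reference` are all assumptions made for illustration. The sketch shows only that a block of cells held in private memory can take repeated steps without neighbor communication, losing one valid cell per edge per step.

```python
def step(u, r=0.25):
    """One explicit 3-point update; returns only cells that had both neighbors."""
    return [u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
            for i in range(1, len(u) - 1)]


def local_triangle(block, r=0.25):
    """Advance a block as far as its own data allows (assumed illustration).

    Level t holds the cells still computable after t steps; each step
    drops one cell from each edge, so the levels form a triangle.
    """
    levels = [list(block)]
    while len(levels[-1]) >= 3:
        levels.append(step(levels[-1], r))
    return levels


def global_reference(u, nsteps, r=0.25):
    """Conventional approach: step the whole periodic domain every timestep."""
    n = len(u)
    for _ in range(nsteps):
        u = [u[i] + r * (u[(i - 1) % n] - 2 * u[i] + u[(i + 1) % n])
             for i in range(n)]
    return u


if __name__ == "__main__":
    domain = [float((i * i) % 7) for i in range(16)]
    start, width = 4, 8                      # hypothetical block placement
    levels = local_triangle(domain[start:start + width])
    for t, level in enumerate(levels):
        ref = global_reference(domain, t)[start + t:start + width - t]
        agree = all(abs(a - b) < 1e-12 for a, b in zip(level, ref))
        print(f"step {t}: {len(level)} cells computed locally, matches global: {agree}")
```

In this sketch the locally computed triangle agrees with the globally stepped solution on the cells it covers, which is why no values need to be exchanged until the block runs out of valid neighbors.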


research
05/09/2017

Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time-space decomposition

The expedient design of precision components in aerospace and other high...
research
11/14/2018

Applying the swept rule for solving explicit partial differential equations on heterogeneous computing systems

Applications that exploit the architectural details of high-performance ...
research
11/14/2018

Applying the swept rule for explicit partial differential equation solutions on heterogeneous computing systems

Applications that exploit the architectural details of high performance ...
research
06/30/2020

Hierarchical Jacobi Iteration for Structured Matrices on GPUs using Shared Memory

High fidelity scientific simulations modeling physical phenomena typical...
research
10/07/2020

Fast Stencil-Code Computation on a Wafer-Scale Processor

The performance of CPU-based and GPU-based systems is often low for PDE ...
research
07/01/2019

Lossy Compression for Large Scale PDE Problems

Solvers for partial differential equations (PDE) are one of the cornerst...
research
04/14/2020

Scalability of High-Performance PDE Solvers

Performance tests and analyses are critical to effective HPC software de...
