An Improved Framework of GPU Computing for CFD Applications on Structured Grids using OpenACC

12/05/2020
by   Weicheng Xue, et al.
0

This paper is focused on improving multi-GPU performance of a research CFD code on structured grids. MPI and OpenACC directives are used to scale the code up to 16 GPUs. This paper shows that using 16 P100 GPUs and 16 V100 GPUs can be 30× and 70× faster than 16 Xeon CPU E5-2680v4 cores for three different test cases, respectively. A series of performance issues related to the scaling for the multi-block CFD code are addressed by applying various optimizations. Performance optimizations such as the pack/unpack message method, removing temporary arrays as arguments to procedure calls, allocating global memory for limiters and connected boundary data, reordering non-blocking MPI I_send/I_recv and Wait calls, reducing unnecessary implicit derived type member data movement between the host and the device and the use of GPUDirect can improve the compute utilization, memory throughput, and asynchronous progression in the multi-block CFD code using modern programming features.

READ FULL TEXT

page 6

page 8

page 9

page 24

page 34

research
06/04/2020

Multi-GPU Performance Optimization of a CFD Code using OpenACC on Different Platforms

This paper investigates the multi-GPU performance of a 3D buoyancy drive...
research
02/24/2022

Parthenon – a performance portable block-structured adaptive mesh refinement framework

On the path to exascale the landscape of computer device architectures a...
research
01/21/2021

Efficient MPI-based Communication for GPU-Accelerated Dask Applications

Dask is a popular parallel and distributed computing framework, which ri...
research
09/04/2017

From MPI to MPI+OpenACC: Conversion of a legacy FORTRAN PCG solver for the spherical Laplace equation

A real-world example of adding OpenACC to a legacy MPI FORTRAN Precondit...
research
06/09/2021

Benchmarking the Nvidia GPU Lineage: From Early K80 to Modern A100 with Asynchronous Memory Transfers

For many, Graphics Processing Units (GPUs) provides a source of reliable...
research
11/28/2017

Implementing implicit OpenMP data sharing on GPUs

OpenMP is a shared memory programming model which supports the offloadin...
research
03/24/2020

Gadget3 on GPUs with OpenACC

We present preliminary results of a GPU porting of all main Gadget3 modu...

Please sign up or login with your details

Forgot password? Click here to reset