Development and performance of a HemeLB GPU code for human-scale blood flow simulation

01/13/2022
by I. Zacharoudiou, et al.

In recent years, it has become increasingly common for high-performance computing (HPC) systems to have a heterogeneous architecture, typically in the form of GPU accelerators. In some machines these are isolated within a dedicated partition, whilst in others they are integral to all compute nodes, often with multiple GPUs per node, and provide the majority of the machine's compute performance. In light of this trend, it is becoming essential that codes deployed on HPC systems are updated to execute on accelerator hardware. In this paper we introduce a GPU implementation of the 3D blood flow simulation code HemeLB, developed using CUDA C++. We demonstrate that taking advantage of NVIDIA GPU hardware yields significant performance improvements over the equivalent CPU-only code on which it was built, whilst retaining the excellent strong scaling characteristics that the CPU version has repeatedly demonstrated. With HPC on the brink of the exascale era, we use HemeLB as a motivating example to discuss some of the challenges that many users will face when deploying their own applications on upcoming exascale machines.

