Development of an Equation-based Parallelization Method for Multiphase Particle-in-Cell Simulations

11/28/2022
by Mino Woo, et al.

Manufacturers have been developing new graphics processing unit (GPU) nodes with large-capacity, high-bandwidth memory and very high-bandwidth intra-node interconnects. This enables moving large amounts of data between GPUs on the same node at low cost. However, small-packet bandwidths and latencies have not improved, which makes global dot products expensive. These characteristics favor a new kind of problem decomposition called "equation decomposition" rather than traditional domain decomposition. In this approach, each GPU is assigned one equation set to solve in parallel, so that the frequent and expensive dot product synchronization points in traditional distributed linear solvers are eliminated. In exchange, the method involves infrequent movement of state variables over the high-bandwidth, intra-node interconnects. To test this theory, our flagship code Multiphase Flow with Interphase eXchanges (MFiX) was ported to TensorFlow. This new product is known as MFiX-AI and can produce nearly identical results to the original version of MFiX with significant acceleration in multiphase particle-in-cell (MP-PIC) simulations. The performance of a single node with 4 NVIDIA A100s connected over NVLINK 2.0 was shown to be competitive with 1000 CPU cores (25 nodes) on the JOULE 2.0 supercomputer, leading to energy savings of up to 90%, a benefit for small- to intermediate-sized problems. This benefit is expected to grow as GPU nodes become more powerful. Further, MFiX-AI is poised to accept native artificial intelligence/machine learning models for further acceleration and development.
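To make the idea of equation decomposition concrete, the following is a minimal Python sketch and not the MFiX-AI implementation, which is written in TensorFlow. It assumes three hypothetical equation sets (`u_momentum`, `v_momentum`, `pressure`), each a small symmetric positive-definite system, and uses one thread per equation set as a stand-in for one GPU per equation set. The conjugate-gradient solver and every name in the block are illustrative assumptions. The point it shows is that each worker owns its whole system, so every dot product is computed locally and workers only meet when the solved state is collected.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(a, b):
    """Local dot product: no cross-worker reduction is needed."""
    return sum(x * y for x, y in zip(a, b))


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A.

    Returns (x, n_dots), where n_dots counts the dot products that a
    domain-decomposed solver would have to reduce across all workers.
    """
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs_old = dot(r, r)
    n_dots = 1
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        n_dots += 2
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, n_dots


def solve_equation_sets(systems):
    """Assign each equation set to its own worker (hypothetical stand-in for a GPU)."""
    with ThreadPoolExecutor(max_workers=len(systems)) as pool:
        futures = {
            name: pool.submit(conjugate_gradient, A, b)
            for name, (A, b) in systems.items()
        }
        # The only synchronization point: gathering the solved state variables.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical equation sets; the names and values are illustrative only.
systems = {
    "u_momentum": ([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]),
    "v_momentum": ([[2.0, 0.0], [0.0, 5.0]], [2.0, 10.0]),
    "pressure": ([[3.0, 1.0], [1.0, 2.0]], [5.0, 5.0]),
}

results = solve_equation_sets(systems)
for name, (x, n_dots) in results.items():
    print(name, x, "local dot products:", n_dots)
```

In a domain-decomposed solver, each of the counted dot products would instead be a small-message global reduction across all devices, which is the cost the abstract identifies as the bottleneck.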


