On the performance of a highly-scalable Computational Fluid Dynamics code on AMD, ARM and Intel processors

10/12/2020
by Pablo Ouro, et al.

No area of computing is hungrier for performance than High Performance Computing (HPC), whose demands continue to be a major driver of processor performance, the adoption of accelerators, and advances in memory, storage, and networking technologies. A key feature of the Intel processor domination of the past decade has been the extensive adoption of GPUs as coprocessors, whilst more recent developments have seen a wider range of CPUs become available, including the novel ARM-based chips. This paper analyses the performance and scalability of a state-of-the-art Computational Fluid Dynamics (CFD) code on three HPC cluster systems equipped with AMD EPYC-Rome (EPYC, 4096 cores), ARM-based Marvell ThunderX2 (TX2, 8192 cores) and Intel Skylake (SKL, 8000 cores) processors. Three benchmark cases are designed with increasing computation-to-communication ratio and numerical complexity, namely lid-driven cavity flow, Taylor-Green vortex and a travelling solitary wave using the level-set method, adopted with 4th-order central differences or a 5th-order WENO scheme. Our results show that the EPYC cluster delivers the best code performance for all the setups under consideration. In the first two benchmarks, the SKL cluster demonstrates faster computing times than the TX2 system, whilst in the solitary wave simulations, the TX2 cluster achieves good scalability and similar performance to the EPYC system, both improving on that obtained with the SKL cluster. These results suggest that while the Intel SKL cores deliver the best strong scalability, the associated cluster performance is lower than that of the EPYC system. The TX2 cluster performance is promising considering its recent addition to the HPC portfolio.
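As a reading aid for the strong-scalability comparison above, the following is a minimal sketch of how speedup and parallel efficiency are commonly derived from wall-clock timings at increasing core counts. The function name `strong_scaling` and the timings are hypothetical placeholders, not measurements or tooling from the paper, and the paper's own metric definitions may differ.

```python
def strong_scaling(cores, times):
    """Return (cores, speedup, efficiency) tuples relative to the smallest run.

    Speedup is T(p0) / T(p); efficiency divides the speedup by the
    increase in core count p / p0, so ideal scaling gives 1.0.
    """
    base_cores, base_time = cores[0], times[0]
    rows = []
    for p, t in zip(cores, times):
        speedup = base_time / t
        efficiency = speedup / (p / base_cores)
        rows.append((p, speedup, efficiency))
    return rows


# Placeholder timings in seconds for a fixed problem size.
# These are illustrative values only, NOT results from the paper.
cores = [512, 1024, 2048]
times = [100.0, 52.0, 28.0]

for p, s, e in strong_scaling(cores, times):
    print(f"{p:5d} cores  speedup {s:5.2f}  efficiency {e:5.2f}")
```

An efficiency close to 1.0 as the core count grows indicates good strong scalability; a cluster can scale well in this sense and still have a longer absolute time to solution than another system, which is the distinction the abstract draws between the SKL and EPYC clusters.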

research
07/09/2020

Performance and energy consumption of HPC workloads on a cluster based on Arm ThunderX2 CPU

In this paper, we analyze the performance and energy consumption of an A...
research
08/09/2019

Performance of Devito on HPC-Optimised ARM Processors

We evaluate the performance of Devito, a domain specific language (DSL) ...
research
10/23/2020

Performance Evaluation of ParalleX Execution model on Arm-based Platforms

The HPC community shows a keen interest in creating diversity in the CPU...
research
04/12/2022

"Smarter" NICs for faster molecular dynamics: a case study

This work evaluates the benefits of using a "smart" network interface ca...
research
10/10/2017

SoAx: A generic C++ Structure of Arrays for handling Particles in HPC Codes

The numerical study of physical problems often require integrating the d...
research
07/16/2019

Coprocessors: failures and successes

The appearance and disappearance of coprocessors by integration into the...
