Implementing Performance Portability of High Performance Computing Programs in the New Golden Age of Chip Architecture

08/26/2023
by Weifeng Liu, et al.

Performance portability has long been an important goal of high-performance computing. With the failure of Moore's Law, it is no longer feasible to improve computer performance simply by adding more of the existing hardware. Architectural innovation has become imperative, and as a result high-performance computers with multiple architectures now coexist in production environments. For example, current high-performance computing nodes often use co-accelerators such as general-purpose GPUs and Intel Xeon Phis to accelerate general-purpose processors. With the flourishing of deep learning, dedicated neural-network acceleration chips are also emerging. The appearance of co-accelerators with different architectures, and their wide deployment in high-performance computers, challenges the performance portability of programs across machines with different architectures. This article surveys current performance portability techniques, including programming models, automatic parallelization of serial code, and automatic translation of parallel code. At the end, it also discusses how scientific computing function libraries can improve both the performance and the performance portability of a program. Different application scenarios require different implementation techniques to achieve performance portability. When program developers choose a performance portability solution for their programs, they are in effect balancing programming efficiency against optimization effects under various constraints.
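As a minimal illustration of the library-based approach mentioned above, the sketch below uses NumPy as a stand-in for a scientific computing function library: the application code is written once against a high-level interface, while the library build delegates the computation to whichever optimized BLAS is installed on each platform. (NumPy and the helper function here are illustrative choices, not something prescribed by the article.)

```python
import numpy as np


def matmul(a, b):
    # The same high-level call runs unchanged on any platform; the NumPy
    # build dispatches the actual work to the locally installed BLAS
    # (e.g. OpenBLAS on a laptop, MKL on an Intel cluster), so the program
    # inherits each machine's tuned implementation without source changes.
    return a @ b


a = np.arange(6.0).reshape(2, 3)   # 2x3 matrix [[0,1,2],[3,4,5]]
b = np.arange(6.0).reshape(3, 2)   # 3x2 matrix [[0,1],[2,3],[4,5]]
c = matmul(a, b)
print(c)  # 2x2 result: [[10. 13.] [28. 40.]]
```

The trade-off the abstract describes is visible even here: the developer gains portability and productivity by ceding low-level optimization decisions to the library vendor, at the cost of control over how each platform executes the kernel.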
