Evaluating the performance portability of SYCL across CPUs and GPUs on bandwidth-bound applications

09/18/2023
by Istvan Z Reguly, et al.

In this paper, we evaluate the portability of the SYCL programming model on some of the latest CPUs and GPUs from a wide range of vendors, utilizing the two main compilers: DPC++ and hipSYCL/OpenSYCL. Both compilers currently support GPUs from all three major vendors; we evaluate performance on the Intel(R) Data Center GPU Max 1100, the NVIDIA A100 GPU, and the AMD MI250X GPU. Support on CPUs is currently less established: DPC++ supports only x86 CPUs, through OpenCL, whereas OpenSYCL has an OpenMP backend capable of targeting all modern CPUs. We benchmark the Intel Xeon Platinum 8360Y Processor (Ice Lake), the AMD EPYC 9V33X (Genoa-X), and the Ampere Altra platforms. We study a range of primarily bandwidth-bound applications implemented using the OPS and OP2 DSLs, evaluate different formulations in SYCL, and contrast their performance with "native" programming approaches where available (CUDA/HIP/OpenMP). On GPU architectures, SYCL on average even slightly outperforms native approaches, while on CPUs it falls behind, highlighting a continued need for improving CPU performance. While SYCL does not solve all the challenges of performance portability (e.g. needing different algorithms on different hardware), it does provide a single programming model and ecosystem to target most current HPC architectures productively.

research
09/11/2023

Many Cores, Many Models: GPU Programming Model vs. Vendor Compatibility Overview

In recent history, GPUs became a key driver of compute performance in HP...
research
04/20/2018

CUDA Support in GNA Data Analysis Framework

Usage of GPUs as co-processors is a well-established approach to acceler...
research
07/01/2016

Using the pyMIC Offload Module in PyFR

PyFR is an open-source high-order accurate computational fluid dynamics ...
research
09/18/2023

Comparing Performance and Portability between CUDA and SYCL for Protein Database Search on NVIDIA, AMD, and Intel GPUs

The heterogeneous computing paradigm has led to the need for portable an...
research
01/26/2022

Unlocking Personalized Healthcare on Modern CPUs/GPUs: Three-way Gene Interaction Study

Developments in Genome-Wide Association Studies have led to the increasi...
research
02/27/2020

Vortex: OpenCL Compatible RISC-V GPGPU

The current challenges in technology scaling are pushing the semiconduct...
research
02/24/2021

GPU-aware Communication with UCX in Parallel Programming Models: Charm++, MPI, and Python

As an increasing number of leadership-class systems embrace GPU accelera...
