C-for-Metal: High Performance SIMD Programming on Intel GPUs

01/26/2021
by Guei-Yuan Lueh, et al.

The SIMT execution model is commonly used for general GPU development. CUDA and OpenCL developers write scalar code that is implicitly parallelized by the compiler and hardware. On Intel GPUs, however, this abstraction has profound performance implications: the underlying ISA is SIMD, and important hardware capabilities cannot be fully utilized. To close this performance gap, we introduce C-for-Metal (CM), an explicit SIMD programming framework designed to deliver close-to-the-metal performance on Intel GPUs. The CM programming language and its vector/matrix types provide an intuitive interface to the underlying hardware features, allowing fine-grained register management, SIMD size control, and cross-lane data sharing. Experimental results show that CM applications from different domains outperform the best-known SIMT-based OpenCL implementations, achieving up to 2.7x speedup on the latest Intel GPU.

research
04/28/2022

Black-Scholes Option Pricing on Intel CPUs and GPUs: Implementation on SYCL and Optimization Techniques

The Black-Scholes option pricing problem is one of the widely used finan...
research
10/29/2020

Systolic Computing on GPUs for Productive Performance

We propose a language and compiler to productively build high-performanc...
research
08/15/2018

libhclooc: Software Library Facilitating Out-of-core Implementations of Accelerator Kernels on Hybrid Computing Platforms

Hardware accelerators such as Graphics Processing Units (GPUs), Intel Xe...
research
01/26/2022

Unlocking Personalized Healthcare on Modern CPUs/GPUs: Three-way Gene Interaction Study

Developments in Genome-Wide Association Studies have led to the increasi...
research
12/10/2015

Grid: A next generation data parallel C++ QCD library

In this proceedings we discuss the motivation, implementation details, a...
research
09/14/2021

Measurement and Analysis of GPU-accelerated Applications with HPCToolkit

To address the challenge of performance analysis on the US DOE's forthco...
research
09/13/2021

Specifying and Testing GPU Workgroup Progress Models

As GPU availability has increased and programming support has matured, a...
