Towards Automatic Transformation of Legacy Scientific Code into OpenCL for Optimal Performance on FPGAs

12/21/2018
by   Wim Vanderbauwhede, et al.
0

There is a large body of legacy scientific code written in languages like Fortran that is not optimised to get the best performance out of heterogeneous acceleration devices like GPUs and FPGAs, and manually porting such code into parallel languages frameworks like OpenCL requires considerable effort. We are working towards developing a turn-key, self-optimising compiler for accelerating scientific applications, that can automatically transform legacy code into a solution for heterogeneous targets. In this paper we focus on FPGAs as the acceleration devices, and carry out our discussion in the context of the OpenCL programming framework. We show a route to automatic creation of kernels which are optimised for execution in a "streaming" fashion, which gives optimal performance on FPGAs. We use a 2D shallow-water model as an illustration; specifically we show how the use of channels to communicate directly between peer kernels and the use of on-chip memory to create stencil buffers can lead to significant performance improvements. Our results show better FPGA performance against a baseline CPU implementation, and better energy-efficiency against both CPU and GPU implementations.

READ FULL TEXT
research
11/13/2017

Domain-Specific Acceleration and Auto-Parallelization of Legacy Scientific Code in FORTRAN 77 using Source-to-Source Compilation

Massively parallel accelerators such as GPGPUs, manycores and FPGAs repr...
research
08/26/2019

AccD: A Compiler-based Framework for Accelerating Distance-related Algorithms on CPU-FPGA Platforms

As a promising solution to boost the performance of distance-related alg...
research
10/30/2020

Transparent Compiler and Runtime Specializations for Accelerating Managed Languages on FPGAs

In recent years, heterogeneous computing has emerged as the vital way to...
research
09/21/2023

AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP

Arbitrary-precision integer multiplication is the core kernel of many ap...
research
08/29/2022

Improving the Efficiency of OpenCL Kernels through Pipes

In an effort to lower the barrier to the adoption of FPGAs by a broader ...
research
05/15/2016

A Foray into Efficient Mapping of Algorithms to Hardware Platforms on Heterogeneous Systems

Heterogeneous computing can potentially offer significant performance an...
research
11/06/2017

Comparison of Parallelisation Approaches, Languages, and Compilers for Unstructured Mesh Algorithms on GPUs

Efficiently exploiting GPUs is increasingly essential in scientific comp...

Please sign up or login with your details

Forgot password? Click here to reset