Resource-Aware Just-in-Time OpenCL Compiler for Coarse-Grained FPGA Overlays

05/08/2017
by Abhishek Kumar Jain, et al.

FPGA vendors have recently started focusing on OpenCL for FPGAs because of its ability to exploit the parallelism inherent in heterogeneous computing platforms. OpenCL lets programs running on a host computer launch accelerator kernels that can be compiled at run-time for a specific architecture, which enables portability. However, prohibitive compilation times (specifically FPGA place-and-route times) are a major stumbling block when using OpenCL tools from FPGA vendors. Because compilation takes so long, these tools cannot make effective use of just-in-time (JIT) compilation or run-time performance scaling. Coarse-grained overlays offer a possible solution by virtue of their coarse granularity and fast compilation. In this paper, we present a methodology for run-time compilation of OpenCL kernels to a DSP-block-based coarse-grained overlay, rather than directly to the fine-grained FPGA fabric. The proposed methodology allows JIT compilation and on-demand, resource-aware kernel replication to better utilize available overlay resources, raising the abstraction level while significantly reducing compile times. We further demonstrate that this approach can even be used for run-time compilation of OpenCL kernels on the ARM processor of the embedded heterogeneous Zynq device.
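
To make the idea of resource-aware kernel replication concrete, the sketch below computes how many copies of a kernel could fit into an overlay. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not say which resources the compiler tracks, so the `Resources` fields (`functional_units`, `io_ports`) and the function `replication_factor` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Resource counts. The two fields are illustrative assumptions,
    not the resource model used in the paper."""
    functional_units: int
    io_ports: int


def replication_factor(overlay: Resources, kernel: Resources) -> int:
    """Return the largest number of kernel copies that fit in the overlay.

    Each resource type limits the copy count independently, so the
    result is the minimum of the per-resource quotients (0 if even a
    single copy does not fit).
    """
    if kernel.functional_units <= 0 or kernel.io_ports <= 0:
        raise ValueError("kernel must use at least one unit of each resource")
    return min(
        overlay.functional_units // kernel.functional_units,
        overlay.io_ports // kernel.io_ports,
    )


if __name__ == "__main__":
    overlay = Resources(functional_units=64, io_ports=32)
    kernel = Resources(functional_units=10, io_ports=4)
    print(replication_factor(overlay, kernel))
```

In this hypothetical setup the functional units are the binding constraint; a real compiler for such an overlay would likely also have to account for routing and placement, which this sketch ignores.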
