Studying the Potential of Automatic Optimizations in the Intel FPGA SDK for OpenCL

01/10/2022
by Adel Ejjeh, et al.

High-Level Synthesis (HLS) tools, like the Intel FPGA SDK for OpenCL, improve design productivity and enable efficient design space exploration guided by simple program directives (pragmas), but they may miss optimizations that are necessary for high performance. In this paper, we present a study of the tradeoffs in HLS optimizations and of the potential of a modern HLS tool to optimize an application automatically. We perform the study on a 5-stage camera ISP pipeline using the Intel FPGA SDK for OpenCL and an Arria 10 FPGA Dev Kit. We show that automatic optimizations in the HLS tool are valuable, achieving up to a 2.7X speedup over equivalent CPU execution. With further hand tuning, however, we can achieve up to a 36.5X speedup over the CPU. We draw several specific lessons about the effectiveness of automatic optimizations guided by simple directives and about the nature of the manual rewriting required for high performance.


