Best-Effort FPGA Programming: A Few Steps Can Go a Long Way

07/03/2018
by Jason Cong, et al.

FPGA-based heterogeneous architectures provide programmers with the ability to customize their hardware accelerators for flexible acceleration of many workloads. Nonetheless, such advantages come at the cost of sacrificing programmability. FPGA vendors and researchers attempt to improve programmability through high-level synthesis (HLS) technologies that can directly generate hardware circuits from high-level language descriptions. However, reading through recent publications on FPGA designs using HLS, one often gets the impression that FPGA programming is still hard, in that it leaves programmers to explore a very large design space with many possible combinations of HLS optimization strategies. In this paper, we make two important observations and contributions. First, we demonstrate a rather surprising result: FPGA programming can be made easy by following a simple best-effort guideline of five refinement steps using HLS. We show that for a broad class of accelerator benchmarks from MachSuite, the proposed best-effort guideline improves the FPGA accelerator performance by 42x to 29,030x. Compared to the baseline CPU performance, the FPGA accelerator performance improves from an average 292.5x slowdown to an average 34.4x speedup. Second, we show that the refinement steps in the best-effort guideline, consisting of explicit data caching, customized pipelining, processing element duplication, computation/communication overlapping, and scratchpad reorganization, correspond well to the best practice guidelines for multicore CPU programming. Although our best-effort guideline may not always lead to the optimal solution, it substantially simplifies the FPGA programming effort, and will greatly support the wide adoption of FPGA-based acceleration by the software programming community.
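
To give a feel for what the first three refinement steps can look like in HLS C, here is a minimal sketch. It is not taken from the paper: the kernel, the names `scale_add_kernel` and `compute_tile`, and the constants `TILE` and `PE` are hypothetical, and the pragmas assume a Xilinx Vivado HLS-style toolchain, which the abstract does not name. The sketch shows explicit data caching (bursting tiles into on-chip buffers), customized pipelining, and processing element duplication (unrolling, with the buffers partitioned so the duplicated units can read in parallel). Computation/communication overlapping and scratchpad reorganization are not shown. An ordinary C compiler ignores the pragmas, so the code can be checked functionally on a CPU.

```c
#include <assert.h>
#include <string.h>

#define TILE 256 /* hypothetical on-chip buffer size, in elements */
#define PE 4     /* hypothetical number of duplicated processing elements */

/* Computes out = 2*a + b over one tile that already sits in on-chip buffers. */
static void compute_tile(const int a[TILE], const int b[TILE], int out[TILE]) {
    for (int i = 0; i < TILE; i += PE) {
/* Customized pipelining: request a new outer iteration every cycle. */
#pragma HLS pipeline II=1
        for (int p = 0; p < PE; p++) {
/* PE duplication: fully unroll so PE elements are processed in parallel. */
#pragma HLS unroll
            out[i + p] = 2 * a[i + p] + b[i + p];
        }
    }
}

/* Top-level kernel; a, b, and out stand in for off-chip (DRAM) arrays. */
void scale_add_kernel(const int *a, const int *b, int *out, int n) {
    int buf_a[TILE], buf_b[TILE], buf_out[TILE];
/* Split each buffer into 4 banks (matching PE) so unrolled accesses do not conflict. */
#pragma HLS array_partition variable=buf_a cyclic factor=4
#pragma HLS array_partition variable=buf_b cyclic factor=4
#pragma HLS array_partition variable=buf_out cyclic factor=4

    assert(n % TILE == 0); /* simplifying assumption: no partial tiles */

    for (int t = 0; t < n; t += TILE) {
        /* Explicit data caching: copy a whole tile on chip before computing. */
        memcpy(buf_a, a + t, TILE * sizeof(int));
        memcpy(buf_b, b + t, TILE * sizeof(int));
        compute_tile(buf_a, buf_b, buf_out);
        /* Write the finished tile back off chip in one burst. */
        memcpy(out + t, buf_out, TILE * sizeof(int));
    }
}
```

In this sketch the load, compute, and store phases of each tile run back to back. Overlapping them, for example with double buffering, would correspond to the computation/communication overlapping step named in the abstract.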

research
07/30/2018

AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture

CPU-FPGA heterogeneous architectures are attracting ever-increasing atte...
research
02/25/2022

On The Design of a Light-weight FPGA Programming Framework for Graph Applications

FPGA accelerators designed for graph processing are gaining popularity. ...
research
04/06/2023

A computation of D(9) using FPGA Supercomputing

This preprint makes the claim of having computed the 9^th Dedekind Numbe...
research
08/20/2021

From Research to Proof-of-Concept: Analysis of a Deployment of FPGAs on a Commercial Search Engine

FPGAs are quickly becoming available in the cloud as a one more heteroge...
research
11/11/2016

Revisiting FPGA Acceleration of Molecular Dynamics Simulation with Dynamic Data Flow Behavior in High-Level Synthesis

Molecular dynamics (MD) simulation is one of the past decade's most impo...
research
02/09/2023

HERMES: qualification of High pErformance pRogrammable Microprocessor and dEvelopment of Software ecosystem

European efforts to boost competitiveness in the sector of space service...
research
11/20/2022

Best-Effort Communication Improves Performance and Scales Robustly on Conventional Hardware

Here, we test the performance and scalability of fully-asynchronous, bes...
