High-Performance High-Order Stencil Computation on FPGAs Using OpenCL

02/14/2020
by   Hamid Reza Zohouri, et al.
0

In this paper we evaluate the performance of FPGAs for high-order stencil computation using High-Level Synthesis. We show that despite the higher computation intensity and on-chip memory requirement of such stencils compared to first-order ones, our design technique with combined spatial and temporal blocking remains effective. This allows us to reach similar, or even higher, compute performance compared to first-order stencils. We use an OpenCL-based design that, apart from parameterizing performance knobs, also parameterizes the stencil radius. Furthermore, we show that our performance model exhibits the same accuracy as first-order stencils in predicting the performance of high-order ones. On an Intel Arria 10 GX 1150 device, for 2D and 3D star-shaped stencils, we achieve over 700 and 270 GFLOP/s of compute performance, respectively, up to a stencil radius of four. These results outperform the state-of-the-art YASK framework on a modern Xeon for 2D and 3D stencils, and outperform a modern Xeon Phi for 2D stencils, while achieving competitive performance in 3D. Furthermore, our FPGA design achieves better power efficiency in almost all cases.

READ FULL TEXT

page 1

page 6

research
02/01/2018

Combined Spatial and Temporal Blocking for High-Performance Stencil Computation on FPGAs Using OpenCL

Recent developments in High Level Synthesis tools have attracted softwar...
research
10/23/2018

High Performance Computing with FPGAs and OpenCL

In this work we evaluate the potential of FPGAs for accelerating HPC wor...
research
03/02/2021

Scalable communication for high-order stencil computations using CUDA-aware MPI

Modern compute nodes in high-performance computing provide a tremendous ...
research
01/06/2020

AN5D: Automated Stencil Framework for High-Degree Temporal Blocking on GPUs

Stencil computation is one of the most widely-used compute patterns in h...
research
02/18/2020

High-Order Paired-ASPP Networks for Semantic Segmenation

Current semantic segmentation models only exploit first-order statistics...
research
07/16/2020

Kronecker Attention Networks

Attention operators have been applied on both 1-D data like texts and hi...
research
03/10/2022

Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA

Temporal Graph Neural Networks (TGNNs) are powerful models to capture te...

Please sign up or login with your details

Forgot password? Click here to reset