Hardware Acceleration of HPC Computational Flow Dynamics using HBM-enabled FPGAs

01/05/2021
by Tom Hogervorst, et al.
Delft University of Technology

Scientific computing is at the core of many High-Performance Computing (HPC) applications, including computational flow dynamics. Because simulating ever larger computational models is of the utmost importance, hardware acceleration is receiving increased attention for its potential to maximize the performance of scientific computing. A Field-Programmable Gate Array (FPGA) is a reconfigurable hardware accelerator whose computational resources and memory storage can be fully customized to the requirements of an application during its lifetime. This makes it an ideal candidate for accelerating scientific computing applications, because the memory hierarchy can be fully customized, which is important in irregular applications such as the iterative linear solvers found in scientific libraries. In this paper, we study the potential of using FPGAs in HPC in light of the rapid advances in reconfigurable hardware, such as larger on-chip memories, a growing number of logic cells, and the integration of High-Bandwidth Memory (HBM) on board. To perform this study, we first propose a novel ILU0 preconditioner tightly integrated with a BiCGStab solver kernel, designed using a mixture of High-Level Synthesis and hand-coded Register-Transfer Level design. Second, we integrate the developed preconditioned iterative solver into Flow, a state-of-the-art open-source reservoir simulator from the Open Porous Media (OPM) project. Finally, we perform a thorough evaluation of the FPGA solver kernel both in standalone mode and integrated into the reservoir simulator; the evaluation includes all the on-chip URAM and BRAM, on-board High-Bandwidth Memory, and off-chip CPU memory data transfers required in a complex simulator such as OPM's Flow. We evaluate the performance on the Norne field, a real-world reservoir model that uses a grid with more than 10^5 cells and 3 unknowns per cell.
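For readers unfamiliar with the solver named in the abstract, the following is a minimal software sketch of a BiCGStab iteration preconditioned with ILU0 (an incomplete LU factorization that keeps only the nonzero pattern of the original matrix). It is not the authors' FPGA design: it uses dense scalar storage in plain Python for readability, whereas a reservoir simulator would use a blocked sparse format, and it has no breakdown handling. The function names (`ilu0`, `apply_ilu0`, `bicgstab`, `build_system`) and the small convection-diffusion test matrix are assumptions made for illustration only.

```python
import math


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ilu0(A):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of A."""
    n = len(A)
    LU = [row[:] for row in A]
    for i in range(1, n):
        for k in range(i):
            if A[i][k] != 0.0:
                LU[i][k] /= LU[k][k]           # multiplier, stored in the L part
                for j in range(k + 1, n):
                    if A[i][j] != 0.0:         # update only existing entries (no fill-in)
                        LU[i][j] -= LU[i][k] * LU[k][j]
    return LU


def apply_ilu0(LU, r):
    """Solve (L U) z = r with unit-diagonal L and upper-triangular U."""
    n = len(r)
    z = r[:]
    for i in range(n):                         # forward substitution
        for j in range(i):
            z[i] -= LU[i][j] * z[j]
    for i in reversed(range(n)):               # backward substitution
        for j in range(i + 1, n):
            z[i] -= LU[i][j] * z[j]
        z[i] /= LU[i][i]
    return z


def bicgstab(A, b, LU, tol=1e-10, max_iter=100):
    """Right-preconditioned BiCGStab; returns (x, iterations used)."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                                   # residual for the zero initial guess
    r_hat = r[:]
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    stop = tol * math.sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        y = apply_ilu0(LU, p)
        v = matvec(A, y)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if math.sqrt(dot(s, s)) < stop:
            return [xi + alpha * yi for xi, yi in zip(x, y)], it
        z = apply_ilu0(LU, s)
        t = matvec(A, z)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * yi + omega * zi for xi, yi, zi in zip(x, y, z)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if math.sqrt(dot(r, r)) < stop:
            return x, it
        rho = rho_new
    return x, max_iter


def build_system(m):
    """Hypothetical test problem: nonsymmetric 5-point stencil on an m x m grid."""
    n = m * m
    A = [[0.0] * n for _ in range(n)]
    for i in range(m):
        for j in range(m):
            row = i * m + j
            A[row][row] = 4.0
            if j > 0:
                A[row][row - 1] = -1.5
            if j < m - 1:
                A[row][row + 1] = -0.5
            if i > 0:
                A[row][row - m] = -1.0
            if i < m - 1:
                A[row][row + m] = -1.0
    return A, [1.0] * n


if __name__ == "__main__":
    A, b = build_system(3)
    x, iters = bicgstab(A, b, ilu0(A))
    print("iterations:", iters)
```

The two triangular solves in `apply_ilu0` carry row-to-row dependencies, and the matrix-vector products access memory irregularly. This is the kind of behavior the abstract has in mind when it argues that a customizable memory hierarchy matters for iterative linear solvers.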
