FSHMEM: Supporting Partitioned Global Address Space on FPGAs for Large-Scale Hardware Acceleration Infrastructure

07/11/2022
by Yashael Faith Arthanto, et al.

By providing highly efficient one-sided communication over a globally shared memory space, Partitioned Global Address Space (PGAS) has become one of the most promising parallel computing models in high-performance computing (HPC). Meanwhile, FPGAs are gaining attention as an alternative compute platform for HPC systems, offering custom computing and design flexibility. However, unlike the traditional message passing interface, PGAS has not yet been explored on FPGAs. This paper proposes FSHMEM, a software/hardware framework that enables the PGAS programming model on FPGAs. We implement the core functions of the GASNet specification on the FPGA for native PGAS integration in hardware, while its programming interface is designed to be highly compatible with legacy software. Our experiments show that FSHMEM achieves a peak bandwidth of 3813 MB/s, which is more than 95% of the theoretical maximum, outperforming prior works by 9.5×. It records latencies of 0.35 µs and 0.59 µs for remote write and read operations, respectively. Finally, we conduct a case study on two Intel D5005 FPGA nodes integrating Intel's deep learning accelerator. The two-node system programmed with FSHMEM achieves 1.94× and 1.98× speedups for matrix multiplication and convolution, respectively, showing its scalability potential for HPC infrastructure.
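To put the two-node speedups in context, the short sketch below converts them into parallel efficiency (speedup divided by node count). The speedup figures come from the abstract; the function name, the dictionary, and the choice of efficiency as a metric are illustrative assumptions, not something the paper reports.

```python
# Illustrative sketch: parallel efficiency implied by the reported speedups.
# The helper and variable names are hypothetical; only the speedup values
# (1.94x, 1.98x) and the node count (2) are taken from the abstract.

def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Return the fraction of ideal linear speedup that was achieved."""
    return speedup / nodes


NODES = 2
reported_speedup = {
    "matrix multiplication": 1.94,
    "convolution": 1.98,
}

efficiency = {
    op: parallel_efficiency(s, NODES) for op, s in reported_speedup.items()
}

for op, eff in efficiency.items():
    print(f"{op}: {eff:.0%} of ideal two-node scaling")
```

Under these assumptions, both workloads land within a few percent of ideal linear scaling on two nodes, which is consistent with the abstract's scalability claim. It says nothing about behavior at larger node counts.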


