GraVF-M: Graph Processing System Generation for Multi-FPGA Platforms

10/14/2019
by Nina Engelhardt, et al.

Due to the irregular nature of connections in most graph datasets, partitioning graph analysis algorithms across multiple computational nodes that do not share a common memory inevitably leads to large amounts of interconnect traffic. Previous research has shown that FPGAs can outcompete software-based graph processing in shared memory contexts, but it remains an open question whether this advantage can be maintained in distributed systems. In this work, we present GraVF-M, a framework designed to ease the implementation of FPGA-based graph processing accelerators for multi-FPGA platforms with distributed memory. Based on a lightweight description of the algorithm kernel, the framework automatically generates optimized RTL code for the whole multi-FPGA design. We exploit an aspect of the programming model to present a familiar message-passing paradigm to the user, while under the hood implementing a more efficient architecture that can reduce the necessary inter-FPGA network traffic by a factor equal to the average degree of the input graph. A performance model based on a theoretical analysis of the factors influencing performance serves to evaluate the efficiency of our implementation. With a throughput of up to 5.8 GTEPS (billions of traversed edges per second) on a 4-FPGA system, the designs generated by GraVF-M compare favorably to state-of-the-art frameworks from the literature and reach 94% of the projected performance limit of the system.
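The traffic reduction claimed above can be illustrated with a toy count. This is a minimal sketch, not the GraVF-M architecture itself: it assumes that the baseline sends one network message per edge, that the optimized scheme sends one update per vertex with outgoing edges, and that every edge crosses a partition boundary. The function name `traffic_counts` and the example edge list are hypothetical.

```python
def traffic_counts(edges):
    """Return (per_edge_messages, per_vertex_updates) for a directed edge list.

    Assumption for illustration: every edge crosses an FPGA boundary, so
    the baseline sends one message per edge, while the alternative sends
    a single update per source vertex.
    """
    sources = {src for src, _ in edges}
    per_edge = len(edges)        # one message per edge
    per_vertex = len(sources)    # one update per vertex that has out-edges
    return per_edge, per_vertex


# Hypothetical 4-vertex graph with 8 directed edges (average out-degree 2).
edges = [(0, 1), (0, 2), (0, 3), (1, 0), (1, 2), (2, 0), (2, 3), (3, 1)]
per_edge, per_vertex = traffic_counts(edges)
reduction = per_edge / per_vertex  # equals the average out-degree here
print(per_edge, per_vertex, reduction)
```

Under these assumptions the ratio of the two counts is the number of edges divided by the number of sending vertices, which is the average degree of the graph.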

research
12/03/2018

Programming Strategies for Irregular Algorithms on the Emu Chick

The Emu Chick prototype implements migratory memory-side processing in a...
research
01/15/2020

Optimized implementation of the conjugate gradient algorithm for FPGA-based platforms using the Dirac-Wilson operator as an example

It is now a noticeable trend in High Performance Computing that the syst...
research
12/22/2021

HP-GNN: Generating High Throughput GNN Training Implementation on CPU-FPGA Heterogeneous Platform

Graph Neural Networks (GNNs) have shown great success in many applicatio...
research
03/23/2020

A distributed memory, local configuration technique for re-configurable logic designs

The use and location of memory in integrated circuits plays a key factor...
research
09/26/2020

An OpenCL 3D FFT for Molecular Dynamics Distributed Across Multiple FPGAs

3D FFTs are used to accelerate MD electrostatic forces computations but ...
research
10/28/2020

Substream-Centric Maximum Matchings on FPGA

Developing high-performance and energy-efficient algorithms for maximum ...
research
11/12/2018

Simple FPGA routing graph compression

Modern FPGAs continue to increase in capacity which requires more memory...