GraphACT: Accelerating GCN Training on CPU-FPGA Heterogeneous Platforms

12/31/2019
by   Hanqing Zeng, et al.
Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs. Accelerating GCN training is challenging because of (1) substantial and irregular data communication to propagate information within the graph, and (2) intensive computation to propagate information along the neural network layers. To address these challenges, we design a novel accelerator for training GCNs on CPU-FPGA heterogeneous systems, incorporating multiple algorithm-architecture co-optimizations. We first analyze the computation and communication characteristics of various GCN training algorithms, and select a subgraph-based algorithm that is well suited for hardware execution. To optimize feature propagation within subgraphs, we propose a lightweight pre-processing step based on a graph-theoretic approach. This pre-processing, performed on the CPU, significantly reduces the memory access requirements and the computation to be performed on the FPGA. To accelerate the weight update in GCN layers, we propose a systolic-array-based design for efficient parallelization. We integrate the above optimizations into a complete hardware pipeline, and analyze its load balance and resource utilization through accurate performance modeling. We evaluate our design on a Xilinx Alveo U200 board hosted by a 40-core Xeon server. On three large graphs, we achieve an order-of-magnitude training speedup with negligible accuracy loss, compared with a state-of-the-art implementation on a multi-core platform.
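The two costs named in the abstract can be illustrated with a minimal sketch of one GCN layer applied to a small subgraph. This is not the GraphACT implementation: the mean aggregator, the self-loop, the ReLU activation, and the names `aggregate`, `transform`, and `gcn_layer` are all assumptions chosen for illustration. The first step gathers neighbor features through irregular accesses (the communication cost), and the second is a dense matrix product (the computation cost that a systolic array would target).

```python
def aggregate(adj_lists, features):
    """Feature propagation: average each node's features over its neighbors and itself.

    The mean aggregator and the self-loop are illustrative assumptions.
    """
    aggregated = []
    for v, neighbors in enumerate(adj_lists):
        group = neighbors + [v]  # irregular, data-dependent accesses
        dim = len(features[v])
        aggregated.append(
            [sum(features[u][k] for u in group) / len(group) for k in range(dim)]
        )
    return aggregated


def transform(features, weight):
    """Weight application: dense matrix product followed by ReLU."""
    columns = list(zip(*weight))  # columns of the weight matrix
    return [
        [max(0.0, sum(x * w for x, w in zip(row, col))) for col in columns]
        for row in features
    ]


def gcn_layer(adj_lists, features, weight):
    """One layer: propagate within the subgraph, then apply the layer weights."""
    return transform(aggregate(adj_lists, features), weight)


# Hypothetical 3-node subgraph: node 0 is connected to nodes 1 and 2.
adj_lists = [[1, 2], [0], [0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights keep the example easy to check
out = gcn_layer(adj_lists, features, weight)
```

With identity weights, each output row is simply the averaged neighborhood feature, which makes the propagation step easy to inspect by hand.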

research
11/10/2021

SPA-GCN: Efficient and Flexible GCN Accelerator with an Application for Graph Similarity Computation

While there have been many studies on hardware acceleration for deep lea...
research
11/01/2021

GCNear: A Hybrid Architecture for Efficient GCN Training with Near-Memory Processing

Recently, Graph Convolutional Networks (GCNs) have become state-of-the-a...
research
10/28/2018

Accurate, Efficient and Scalable Graph Embedding

The Graph Convolutional Network (GCN) model and its variants are powerfu...
research
04/14/2023

LightRW: FPGA Accelerated Graph Dynamic Random Walks

Graph dynamic random walks (GDRWs) have recently emerged as a powerful p...
research
12/22/2022

Accelerating Barnes-Hut t-SNE Algorithm by Efficient Parallelization on Multi-Core CPUs

t-SNE remains one of the most popular embedding techniques for visualizi...
research
02/20/2019

DNNVM : End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators

The convolutional neural network (CNN) has become a state-of-the-art met...
research
05/22/2019

KPynq: A Work-Efficient Triangle-Inequality based K-means on FPGA

K-means is a popular but computation-intensive algorithm for unsupervise...
