Accelerating Large Scale Real-Time GNN Inference using Channel Pruning

05/10/2021
by Hongkuan Zhou, et al.

Graph Neural Networks (GNNs) have proven to be powerful models for generating node embeddings for downstream applications. However, due to the high computational complexity of GNN inference, it is hard to deploy GNNs for large-scale or real-time applications. In this paper, we propose to accelerate GNN inference by pruning the dimensions in each layer with negligible accuracy loss. Our pruning framework uses a novel LASSO regression formulation for GNNs to identify feature dimensions (channels) that have high influence on the output activation. We identify two inference scenarios, full inference and batched inference, and design a pruning scheme for each based on its computation and memory usage. To further reduce the inference complexity, we store and reuse the hidden features of visited nodes, which significantly reduces the number of supporting nodes needed to compute the target embedding. We evaluate the proposed method on the node classification problem with five popular datasets and on a real-time spam detection application. We demonstrate that the pruned GNN models greatly reduce computation and memory usage with little accuracy loss. For full inference, the proposed method achieves an average of 3.27x speedup with only a 0.002 drop in F1-Micro on GPU. For batched inference, the proposed method achieves an average of 6.67x speedup with only a 0.003 drop in F1-Micro on CPU. To the best of our knowledge, we are the first to accelerate large-scale real-time GNN inference through channel pruning.
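
The abstract does not give the LASSO formulation itself, so the following is only a minimal sketch of the general idea of LASSO-based channel selection, not the authors' method. It assumes a single dense layer with no neighbor aggregation, scales each input channel by a coefficient `beta`, and penalizes `beta` with an L1 term so that channels with little influence on the layer output are driven to exactly zero. The function name `lasso_channel_mask`, the parameters `lam` and `steps`, the ISTA (proximal gradient) solver, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def lasso_channel_mask(X, W, lam=0.05, steps=500):
    """Hypothetical helper: pick input channels of a dense layer via LASSO.

    Minimizes (1 / 2n) * ||X W - (X * beta) W||_F^2 + lam * ||beta||_1
    over the per-channel scaling vector beta, using ISTA.
    """
    n, c_in = X.shape
    # Column c of A is the flattened contribution of channel c to the output,
    # so the layer output with scaled channels is A @ beta.
    A = np.einsum("nc,co->noc", X, W).reshape(-1, c_in)
    y = (X @ W).reshape(-1)               # output of the unpruned layer
    L = np.linalg.norm(A, 2) ** 2 / n     # Lipschitz constant of the gradient
    beta = np.zeros(c_in)
    for _ in range(steps):
        grad = A.T @ (A @ beta - y) / n
        z = beta - grad / L
        # Soft-thresholding: the proximal operator of the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return beta

# Synthetic data (an assumption): 8 input channels, of which only the
# first 4 have weights large enough to matter for the output.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
W = rng.normal(size=(8, 4))
W /= np.linalg.norm(W, axis=1, keepdims=True)  # unit-norm weight rows
W[4:] *= 1e-4                                  # near-zero influence channels

beta = lasso_channel_mask(X, W)
kept = np.flatnonzero(beta).tolist()           # indices of surviving channels
X_pruned, W_pruned = X[:, kept], W[kept, :]    # smaller layer after pruning
```

In this sketch the pruned layer multiplies a 200 x 4 matrix by a 4 x 4 matrix instead of 200 x 8 by 8 x 4. A larger `lam` removes more channels at the cost of a larger reconstruction error, and in practice the remaining weights would typically be re-fit or fine-tuned after selection.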

research
07/29/2022

BiFeat: Supercharge GNN Training via Graph Feature Quantization

Graph Neural Networks (GNNs) is a promising approach for applications wi...
research
03/01/2023

HyScale-GNN: A Scalable Hybrid GNN Training System on Single-Node Heterogeneous Architecture

Graph Neural Networks (GNNs) have shown success in many real-world appli...
research
07/18/2022

Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks

Graph Neural Networks (GNNs) tend to suffer from high computation costs ...
research
11/01/2022

Efficient Graph Neural Network Inference at Large Scale

Graph neural networks (GNNs) have demonstrated excellent performance in ...
research
05/11/2023

Graph Neural Network for Accurate and Low-complexity SAR ATR

Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is the...
research
01/04/2023

Accurate, Low-latency, Efficient SAR Automatic Target Recognition on FPGA

Synthetic aperture radar (SAR) automatic target recognition (ATR) is the...
research
09/20/2023

InkStream: Real-time GNN Inference on Streaming Graphs via Incremental Update

Classic Graph Neural Network (GNN) inference approaches, designed for st...