A^2Q: Aggregation-Aware Quantization for Graph Neural Networks

by Zeyu Zhu, et al.

As graph data size increases, the high latency and memory consumption during inference pose a significant challenge to the real-world deployment of Graph Neural Networks (GNNs). While quantization is a powerful approach to reducing GNN complexity, most previous works on GNN quantization fail to exploit the unique characteristics of GNNs and therefore suffer from severe accuracy degradation. Through an in-depth analysis of the topology of GNNs, we observe that the topology of the graph leads to significant differences between nodes, and that most of the nodes in a graph have a small aggregation value. Motivated by this, we propose Aggregation-Aware mixed-precision Quantization (A^2Q) for GNNs, in which an appropriate bitwidth is automatically learned and assigned to each node in the graph. To mitigate the vanishing gradient problem caused by sparse connections between nodes, we propose a Local Gradient method that uses the quantization error of the node features as supervision during training. We also develop a Nearest Neighbor Strategy to handle generalization to unseen graphs. Extensive experiments on eight public node-level and graph-level datasets demonstrate the generality and robustness of our proposed method. Compared to the FP32 models, our method achieves up to an 18.6x compression ratio (i.e., 1.70 bits) with negligible accuracy degradation. Moreover, compared to the state-of-the-art quantization method, our method achieves up to 11.4% and 9.5% accuracy improvements on the node-level and graph-level tasks, respectively, and up to 2x speedup on a dedicated hardware accelerator.
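To make the idea of a per-node bitwidth concrete, the following is a minimal sketch and not the authors' implementation. It assumes a symmetric uniform quantizer and takes each node's bitwidth and step size as plain arguments, whereas the abstract describes them as learned during training. The names `quantize_node`, `quantize_graph`, and `average_bitwidth` are hypothetical. Because the sketch reserves one bit for the sign, it needs at least 2 bits per node and so cannot reproduce the sub-2-bit averages reported above.

```python
from typing import List


def quantize_node(features: List[float], bits: int, step: float) -> List[float]:
    """Symmetric uniform quantization of one node's feature vector.

    `bits` and `step` are per-node parameters. They are passed in by hand
    here (an assumption of this sketch); the abstract says they are learned.
    """
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits (one is the sign)")
    qmax = 2 ** (bits - 1) - 1  # largest representable integer level
    quantized = []
    for x in features:
        level = round(x / step)               # nearest integer level
        level = max(-qmax, min(qmax, level))  # clip to the representable range
        quantized.append(level * step)        # map back to a float value
    return quantized


def quantize_graph(
    node_features: List[List[float]],
    node_bits: List[int],
    node_steps: List[float],
) -> List[List[float]]:
    """Quantize every node with its own bitwidth and step size."""
    return [
        quantize_node(f, b, s)
        for f, b, s in zip(node_features, node_bits, node_steps)
    ]


def average_bitwidth(node_bits: List[int]) -> float:
    """Mean bitwidth over all nodes, the quantity a compression ratio is based on."""
    return sum(node_bits) / len(node_bits)


# A node with small aggregated values gets 2 bits; one with large values gets 4.
features = [[0.1, 0.4, -0.6], [9.7, -3.2, 0.2]]
bits = [2, 4]
steps = [0.5, 1.0]
print(quantize_graph(features, bits, steps))
print(average_bitwidth(bits))
```

In this toy setup the low-magnitude node loses little information at 2 bits, while the high-magnitude node keeps a wider range at 4 bits, which is the intuition behind assigning bitwidths according to aggregation values.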


