SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization

by Boyuan Feng, et al.
Nanjing University of Aeronautics and Astronautics
The Regents of the University of California

With the increasing popularity of graph-based learning, Graph Neural Networks (GNNs) have attracted considerable attention from both research and industry because of their high accuracy. However, existing GNNs suffer from high memory footprints (e.g., node embedding features). This high memory footprint hinders their application on memory-constrained devices, such as widely deployed IoT devices. To this end, we propose a specialized GNN quantization scheme, SGQuant, to systematically reduce GNN memory consumption. Specifically, we first propose a GNN-tailored quantization algorithm design and a GNN quantization fine-tuning scheme to reduce memory consumption while maintaining accuracy. Then, we investigate a multi-granularity quantization strategy that operates at different levels (components, graph topology, and layers) of GNN computation. Moreover, we offer an automatic bit-selecting (ABS) method to pinpoint the most appropriate quantization bits for the above multi-granularity quantizations. Extensive experiments show that SGQuant can effectively reduce the memory footprint by 4.25x to 31.9x compared with the original full-precision GNNs while limiting the accuracy drop to 0.4
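
To make the memory argument concrete, here is a minimal sketch of generic uniform min-max quantization applied to a flat list of feature values. It is not SGQuant's algorithm, which the abstract does not spell out; the function names (`quantize`, `dequantize`, `compression_ratio`), the random input data, and the 32-bit full-precision baseline are assumptions made for illustration only.

```python
import random

def quantize(values, num_bits):
    """Map floats to integer codes in [0, 2**num_bits - 1] (uniform min-max scheme)."""
    lo, hi = min(values), max(values)
    levels = (1 << num_bits) - 1
    # Guard against a constant input, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate floats from integer codes."""
    return [lo + c * scale for c in codes]

def compression_ratio(num_bits, full_bits=32):
    """Nominal ratio; ignores the small overhead of storing lo and scale."""
    return full_bits / num_bits

# Hypothetical stand-in for node embedding features.
random.seed(0)
features = [random.uniform(-1.0, 1.0) for _ in range(1000)]

for bits in (8, 4, 2):
    codes, lo, scale = quantize(features, bits)
    restored = dequantize(codes, lo, scale)
    max_err = max(abs(a - b) for a, b in zip(features, restored))
    print(f"{bits}-bit: {compression_ratio(bits):.1f}x smaller, max error {max_err:.4f}")
```

In this sketch, fewer bits give a larger nominal reduction but a coarser grid, so the worst-case rounding error grows (it is bounded by half the step size `scale`). That trade-off is the kind of tension a per-component or per-layer bit selection has to balance.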


BiFeat: Supercharge GNN Training via Graph Feature Quantization

Graph Neural Networks (GNNs) is a promising approach for applications wi...

GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs

As the emerging trend of the graph-based deep learning, Graph Neural Net...

Boosting Distributed Full-graph GNN Training with Asynchronous One-bit Communication

Training Graph Neural Networks (GNNs) on large graphs is challenging due...

A^2Q: Aggregation-Aware Quantization for Graph Neural Networks

As graph data size increases, the vast latency and memory consumption du...

Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training

Distributed full-graph training of Graph Neural Networks (GNNs) over lar...

BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization

Neural networks have demonstrably achieved state-of-the art accuracy usi...

ClusterGNN: Cluster-based Coarse-to-Fine Graph Neural Network for Efficient Feature Matching

Graph Neural Networks (GNNs) with attention have been successfully appli...