TinyKG: Memory-Efficient Training Framework for Knowledge Graph Neural Recommender Systems

12/08/2022
by Huiyuan Chen, et al.

There has been an explosion of interest in designing various Knowledge Graph Neural Networks (KGNNs), which achieve state-of-the-art performance and provide great explainability for recommendation. The promising performance mainly results from their capability of capturing high-order proximity messages over knowledge graphs. However, training KGNNs at scale is challenging due to their high memory usage. In the forward pass, automatic differentiation engines (e.g., TensorFlow/PyTorch) generally need to cache all intermediate activation maps in order to compute gradients in the backward pass, which leads to a large GPU memory footprint. Existing work addresses this problem with multi-GPU distributed frameworks. Nonetheless, this poses a practical challenge when seeking to deploy KGNNs in memory-constrained environments, especially for industry-scale graphs. Here we present TinyKG, a memory-efficient GPU-based training framework for KGNNs for recommendation tasks. Specifically, TinyKG uses exact activations in the forward pass while storing a quantized version of the activations in GPU buffers. During the backward pass, these low-precision activations are dequantized back to full-precision tensors in order to compute gradients. To reduce quantization error, TinyKG applies a simple yet effective quantization algorithm to compress the activations, which ensures unbiasedness with low variance. As a result, the training memory footprint of KGNNs is largely reduced with negligible accuracy loss. To evaluate the performance of TinyKG, we conduct comprehensive experiments on real-world datasets. We find that TinyKG with INT2 quantization aggressively reduces the memory footprint of activation maps by 7×, with only a 2% loss in accuracy, allowing us to deploy KGNNs on memory-constrained devices.
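
To make the general technique concrete, below is a minimal PyTorch-style sketch of the idea the abstract describes: run the forward pass on exact activations, keep only a low-bit quantized copy for the backward pass, and dequantize it when gradients are needed. Unbiasedness is obtained here via stochastic rounding, a standard choice consistent with the abstract's claim; the names (`quantize_unbiased`, `QuantizedLinear`) and the per-tensor quantization scheme are illustrative assumptions, not TinyKG's actual implementation.

```python
import torch

def quantize_unbiased(x, num_bits=2):
    """Per-tensor uniform quantization with stochastic rounding.

    Stochastic rounding makes the quantizer unbiased in expectation:
    E[dequantize(quantize(x))] = x. (Illustrative; not TinyKG's exact scheme.)
    """
    qmax = 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / qmax
    # Map x to [0, qmax], add uniform noise, and floor (stochastic rounding).
    normalized = (x - x_min) / scale
    q = torch.clamp(torch.floor(normalized + torch.rand_like(normalized)), 0, qmax)
    # NOTE: a real INT2 buffer would bit-pack four values per byte;
    # uint8 is used here only to keep the sketch simple.
    return q.to(torch.uint8), scale, x_min

def dequantize(q, scale, x_min):
    # Recover an approximate full-precision tensor from the low-bit copy.
    return q.float() * scale + x_min

class QuantizedLinear(torch.autograd.Function):
    """Linear layer that stores only quantized activations for the backward pass."""

    @staticmethod
    def forward(ctx, x, weight):
        q, scale, x_min = quantize_unbiased(x, num_bits=2)
        ctx.save_for_backward(q, scale, x_min, weight)
        return x @ weight.t()  # the forward pass itself uses exact activations

    @staticmethod
    def backward(ctx, grad_out):
        q, scale, x_min, weight = ctx.saved_tensors
        x_hat = dequantize(q, scale, x_min)  # dequantize before computing gradients
        grad_x = grad_out @ weight           # does not need the stored activations
        grad_w = grad_out.t() @ x_hat        # uses the approximate activations
        return grad_x, grad_w

# Usage: gradients flow through the compressed activations.
x = torch.randn(128, 64, requires_grad=True)
w = torch.randn(32, 64, requires_grad=True)
QuantizedLinear.apply(x, w).sum().backward()
```

With stochastic rounding, the dequantized activation is an unbiased estimate of the original, so the weight gradient is unbiased as well; lowering the bit width (e.g., INT2) trades extra gradient variance for a smaller activation buffer.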

Related research:

01/23/2019 | Backprop with Approximate Activations for Memory-efficient Network Training
Larger and deeper neural network architectures deliver improved accuracy...

09/21/2023 | Activation Compression of Graph Neural Networks using Block-wise Quantization with Improved Variance Minimization
Efficient training of large-scale graph neural networks (GNNs) has been ...

11/22/2021 | Mesa: A Memory-saving Training Framework for Transformers
There has been an explosion of interest in designing high-performance Tr...

04/29/2021 | ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
The increasing size of neural network models has been critical for impro...

03/02/2023 | Boosting Distributed Full-graph GNN Training with Asynchronous One-bit Communication
Training Graph Neural Networks (GNNs) on large graphs is challenging due...

09/04/2017 | WRPN: Wide Reduced-Precision Networks
For computer vision applications, prior works have shown the efficacy of...

06/05/2022 | Learning Binarized Graph Representations with Multi-faceted Quantization Reinforcement for Top-K Recommendation
Learning vectorized embeddings is at the core of various recommender sys...
