Staleness-Alleviated Distributed GNN Training via Online Dynamic-Embedding Prediction

08/25/2023
by Guangji Bai et al.

Despite the recent success of Graph Neural Networks (GNNs), it remains challenging to train GNNs on large-scale graphs due to the neighbor explosion problem. Distributed computing is a promising remedy, as it can leverage abundant computing resources (e.g., GPUs). However, the node dependencies in graph data make it difficult to achieve high concurrency in distributed GNN training, which consequently suffers from massive communication overhead. To address this, historical value approximation has emerged as a promising class of distributed training techniques: it uses an offline memory to cache historical information (e.g., node embeddings) as an affordable approximation of the exact values, thereby achieving high concurrency. However, these benefits come at the cost of training on dated information, leading to staleness, imprecision, and convergence issues. To overcome these challenges, this paper proposes SAT (Staleness-Alleviated Training), a novel and scalable distributed GNN training framework that adaptively reduces embedding staleness. The key idea of SAT is to model the evolution of the GNN's embeddings as a temporal graph and to build a model on top of it that predicts future embeddings, which effectively alleviates the staleness of the cached historical embeddings. We propose an online algorithm that trains the embedding predictor and the distributed GNN alternately, and we further provide a convergence analysis. Empirically, we demonstrate that SAT effectively reduces embedding staleness and thus achieves better performance and faster convergence on multiple large-scale graph datasets.
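To make the mechanism concrete, below is a minimal PyTorch sketch of the two ingredients the abstract describes: an offline memory that caches historical node embeddings, and a small predictor that maps a node's stale embedding trajectory to an estimate of its current embedding, trained alternately with the (simulated) GNN updates. All names here (HistoryCache, EmbeddingPredictor, and the toy training loop) are illustrative assumptions, not the authors' actual implementation.

```python
# Illustrative sketch only: hypothetical names, not the authors' SAT code.
import torch
import torch.nn as nn

class HistoryCache:
    """Offline memory holding the last k embedding snapshots per node."""
    def __init__(self, num_nodes: int, dim: int, k: int = 3):
        self.buffer = torch.zeros(k, num_nodes, dim)  # [k, N, d] sliding window

    def push(self, emb: torch.Tensor) -> None:
        # Drop the oldest snapshot and append the newest one.
        self.buffer = torch.cat([self.buffer[1:], emb.detach().unsqueeze(0)], dim=0)

    def history(self, node_ids: torch.Tensor) -> torch.Tensor:
        return self.buffer[:, node_ids]  # [k, B, d]

class EmbeddingPredictor(nn.Module):
    """Predicts a node's current embedding from its stale embedding trajectory."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.rnn = nn.GRU(dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, dim)

    def forward(self, hist: torch.Tensor) -> torch.Tensor:
        # hist: [k, B, d] -> per-node sequence [B, k, d]; use the last hidden state.
        h, _ = self.rnn(hist.permute(1, 0, 2))
        return self.out(h[:, -1])  # estimated fresh embedding, [B, d]

num_nodes, dim = 1000, 32
cache = HistoryCache(num_nodes, dim)
predictor = EmbeddingPredictor(dim)
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)

for step in range(5):
    # Stand-in for embeddings the local GNN partition just computed; in a real
    # system, remote neighbors would instead be served predictor(cache.history(...))
    # rather than waiting on cross-machine communication.
    fresh = torch.randn(num_nodes, dim)
    pred = predictor(cache.history(torch.arange(num_nodes)))
    # Alternating step: regress predictions onto the freshly computed embeddings.
    loss = nn.functional.mse_loss(pred, fresh)
    opt.zero_grad()
    loss.backward()
    opt.step()
    cache.push(fresh)  # this snapshot becomes "historical" for the next round
```

In the full framework the abstract describes, the predictor operates on a temporal graph of embedding snapshots and is trained online, interleaved with the distributed GNN's own updates; the MSE regression above is only the simplest stand-in for that objective.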

Related research:

11/01/2022  Distributed Graph Neural Network Training: A Survey
05/31/2022  Distributed Graph Neural Network Training with Periodic Historical Embedding Synchronization
11/10/2022  A Comprehensive Survey on Distributed Training of Graph Neural Networks
12/16/2021  BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing
06/21/2022  Nimble GNN Embedding with Tensor-Train Decomposition
01/18/2023  ReFresh: Reducing Memory Access from Exploiting Stable Historical Embeddings for Graph Neural Network Training
10/31/2022  GNN at the Edge: Cost-Efficient Graph Neural Network Processing over Distributed Edge Servers
