GraphTheta: A Distributed Graph Neural Network Learning System With Flexible Training Strategy

by   Houyi Li, et al.

Graph neural networks (GNNs) have been demonstrated as a powerful tool for analysing non-Euclidean graph data. However, the lack of efficient distributed graph learning systems severely hinders applications of GNNs, especially when graphs are big, of high density or with highly skewed node degree distributions. In this paper, we present a new distributed graph learning system GraphTheta, which supports multiple training strategies and enables efficient and scalable learning on big graphs. GraphTheta implements both localized and globalized graph convolutions on graphs, where a new graph learning abstraction NN-TGAR is designed to bridge the gap between graph processing and graph learning frameworks. A distributed graph engine is proposed to conduct the stochastic gradient descent optimization with hybrid-parallel execution. Moreover, we add support for a new cluster-batched training strategy in addition to the conventional global-batched and mini-batched ones. We evaluate GraphTheta using a number of network data with network size ranging from small-, modest- to large-scale. Experimental results show that GraphTheta scales almost linearly to 1,024 workers and trains an in-house developed GNN model within 26 hours on Alipay dataset of 1.4 billion nodes and 4.1 billion attributed edges. Moreover, GraphTheta also obtains better prediction results than the state-of-the-art GNN methods. To the best of our knowledge, this work represents the largest edge-attributed GNN learning task conducted on a billion-scale network in the literature.


page 1

page 2

page 3

page 4


Increase and Conquer: Training Graph Neural Networks on Growing Graphs

Graph neural networks (GNNs) use graph convolutions to exploit network i...

Characterizing and Understanding Distributed GNN Training on GPUs

Graph neural network (GNN) has been demonstrated to be a powerful model ...

EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks

Inspired by the great success of convolutional neural networks on struct...

Sequential Aggregation and Rematerialization: Distributed Full-batch Training of Graph Neural Networks on Large Graphs

We present the Sequential Aggregation and Rematerialization (SAR) scheme...

Stochastic Graph Neural Networks

Graph neural networks (GNNs) model nonlinear representations in graph da...

AliGraph: A Comprehensive Graph Neural Network Platform

An increasing number of machine learning tasks require dealing with larg...

SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage Processing Architectures

Graph neural networks (GNNs) can extract features by learning both the r...