An Experimental Comparison of Partitioning Strategies for Distributed Graph Neural Network Training

08/29/2023
by Nikolai Merkel, et al.

Graph neural networks (GNNs) have recently gained much attention as a growing area of deep learning capable of learning on graph-structured data. However, the computational and memory requirements of training GNNs on large-scale graphs can exceed the capabilities of a single machine or GPU, making distributed training a promising direction. A prerequisite for distributed GNN training is partitioning the input graph into smaller parts that are distributed among the machines of a compute cluster. Although graph partitioning has been extensively studied in the context of graph analytics and graph databases, its effect on GNN training performance is largely unexplored. In this paper, we study the effectiveness of graph partitioning for distributed GNN training. Our study aims to understand how factors such as GNN parameters, mini-batch size, graph type, feature size, and scale-out factor influence the effectiveness of graph partitioning. We conduct experiments with two different GNN systems using vertex and edge partitioning. We find that graph partitioning is a crucial pre-processing step that can substantially reduce both training time and memory footprint. Furthermore, our results show that the invested partitioning time can be amortized by the resulting reduction in GNN training time, making partitioning a relevant optimization.
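To make the two strategy families concrete: vertex partitioning (edge-cut) assigns each vertex to exactly one partition and "cuts" edges whose endpoints land on different machines, while edge partitioning (vertex-cut) assigns each edge to exactly one partition and replicates vertices that are incident to edges on several machines. The sketch below illustrates both families with a simple hash-based assignment; the function names and the hash-based assignment are illustrative assumptions for this page, not the partitioners evaluated in the paper.

```python
# Minimal sketch (assumption, not the paper's code) contrasting
# vertex partitioning (edge-cut) with edge partitioning (vertex-cut).
from collections import defaultdict

def vertex_partition(edges, num_parts):
    """Edge-cut: each vertex owned by one partition; edges whose
    endpoints fall on different partitions are cut."""
    part_of = lambda v: hash(v) % num_parts  # illustrative assignment
    parts = defaultdict(list)
    cut_edges = 0
    for u, v in edges:
        pu, pv = part_of(u), part_of(v)
        parts[pu].append((u, v))  # convention: source vertex owns the edge
        if pu != pv:
            cut_edges += 1  # cut edges imply cross-machine communication
    return parts, cut_edges

def edge_partition(edges, num_parts):
    """Vertex-cut: each edge owned by one partition; vertices touching
    edges on several partitions are replicated."""
    parts = defaultdict(list)
    replicas = defaultdict(set)  # vertex -> partitions holding a copy
    for u, v in edges:
        p = hash((u, v)) % num_parts  # illustrative assignment
        parts[p].append((u, v))
        replicas[u].add(p)
        replicas[v].add(p)
    # Average number of copies per vertex (replication factor).
    rep = sum(len(ps) for ps in replicas.values()) / max(len(replicas), 1)
    return parts, rep

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    _, cut = vertex_partition(edges, num_parts=2)
    _, rep = edge_partition(edges, num_parts=2)
    print(f"edge-cut: {cut} cut edges; vertex-cut: replication factor {rep:.2f}")
```

In practice, distributed GNN systems such as DistDGL (listed below) run a more sophisticated partitioner such as METIS as a preprocessing step; the paper's experiments compare such partitioners rather than a toy hash assignment.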


Related research

Distributed Graph Neural Network Training: A Survey (11/01/2022)
Graph neural networks (GNNs) are a type of deep learning models that lea...

GraphTheta: A Distributed Graph Neural Network Learning System With Flexible Training Strategy (04/21/2021)
Graph neural networks (GNNs) have been demonstrated as a powerful tool f...

Communication-Free Distributed GNN Training with Vertex Cut (08/06/2023)
Training Graph Neural Networks (GNNs) on real-world graphs consisting of...

Sequential Aggregation and Rematerialization: Distributed Full-batch Training of Graph Neural Networks on Large Graphs (11/11/2021)
We present the Sequential Aggregation and Rematerialization (SAR) scheme...

SUGAR: Efficient Subgraph-level Training via Resource-aware Graph Partitioning (01/31/2022)
Graph Neural Networks (GNNs) have demonstrated a great potential in a va...

DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs (10/11/2020)
Graph neural networks (GNN) have shown great success in learning from gr...
