Scaling Graph Neural Networks with Approximate PageRank

by Aleksandar Bojchevski, et al.

Graph neural networks (GNNs) have emerged as a powerful approach for solving many network mining tasks. However, learning on large graphs remains a challenge: many recently proposed scalable GNN approaches rely on an expensive message-passing procedure to propagate information through the graph. We present the PPRGo model, which uses an efficient approximation of information diffusion in GNNs, resulting in significant speed gains while maintaining state-of-the-art prediction performance. In addition to being faster, PPRGo is inherently scalable and can be trivially parallelized for large datasets like those found in industry settings. We demonstrate that PPRGo outperforms baselines in both distributed and single-machine training environments on a number of commonly used academic graphs. To better analyze the scalability of large-scale graph learning methods, we introduce a novel benchmark graph with 12.4 million nodes, 173 million edges, and 2.8 million node features. We show that training PPRGo from scratch and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph. We discuss the practical application of PPRGo to solve large-scale node classification problems at Google.
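
The abstract does not spell out how the diffusion is approximated, but the title points to approximate personalized PageRank. The sketch below illustrates that general idea with a simplified push-style approximation: probability mass starts at one source node and is pushed to neighbors until every residual falls below a tolerance. It is an illustration under assumptions, not the authors' implementation. The function names `approximate_ppr` and `top_k`, the adjacency-dict input, and the defaults `alpha=0.15` and `eps=1e-4` are all hypothetical choices made for this example.

```python
from collections import defaultdict, deque


def approximate_ppr(adj, source, alpha=0.15, eps=1e-4):
    """Simplified push-style approximate personalized PageRank.

    adj:    dict mapping each node to a list of its neighbours (assumed format)
    source: node the random walk restarts from
    alpha:  restart (teleport) probability
    eps:    per-degree residual tolerance; smaller means more accurate
    """
    p = defaultdict(float)  # approximate PPR scores
    r = defaultdict(float)  # residual mass that has not been pushed yet
    r[source] = 1.0
    queue = deque([source])
    queued = {source}

    while queue:
        u = queue.popleft()
        queued.discard(u)
        deg = len(adj[u])
        if deg == 0:
            # Isolated node: nowhere to push, so keep all the mass here.
            p[u] += r[u]
            r[u] = 0.0
            continue
        if r[u] < eps * deg:
            continue
        mass = r[u]
        p[u] += alpha * mass          # keep the restart share at u
        r[u] = 0.0
        share = (1.0 - alpha) * mass / deg
        for v in adj[u]:              # spread the rest evenly over neighbours
            r[v] += share
            if r[v] >= eps * len(adj[v]) and v not in queued:
                queue.append(v)
                queued.add(v)
    return dict(p)


def top_k(scores, k):
    """Keep only the k largest scores, as (node, score) pairs."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    # Tiny path graph 0 - 1 - 2, used only to exercise the functions.
    graph = {0: [1], 1: [0, 2], 2: [1]}
    scores = approximate_ppr(graph, source=0)
    print(top_k(scores, k=2))
```

Each push touches only the neighbors of a single node, so the work depends on the tolerance rather than on the size of the whole graph. That locality is what makes this kind of approximation attractive on large graphs.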




