Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

06/24/2020
by Weilin Cong, et al.

Sampling methods (e.g., node-wise, layer-wise, or subgraph sampling) have become an indispensable strategy for speeding up the training of large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on graph structural information and ignore the dynamics of optimization, which leads to high variance in estimating the stochastic gradients. The high-variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of the empirical risk, the variance of any sampling method decomposes into embedding approximation variance in the forward stage and stochastic gradient variance in the backward stage, and that both types of variance must be mitigated to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance, and that explicitly reduces the variance introduced by embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and better generalization than existing methods.
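The idea of sampling nodes with probabilities chosen to minimize gradient-estimation variance can be illustrated with the classical importance-sampling result: among unbiased estimators, sampling node i with probability proportional to its gradient norm minimizes the estimator's variance. The sketch below is a minimal illustration under that assumption, not the paper's exact algorithm; the function name minimal_variance_sample and the input grad_norms (per-node gradient-norm estimates, which in practice would come from approximate gradient information during training) are hypothetical.

    import numpy as np

    def minimal_variance_sample(grad_norms, batch_size, rng=None):
        # Sample node indices with probability p_i proportional to the
        # estimated per-node gradient norm ||g_i||, the distribution that
        # minimizes the variance of an unbiased importance-sampled
        # gradient estimator. Returns indices and unbiasing weights.
        rng = rng if rng is not None else np.random.default_rng()
        g = np.asarray(grad_norms, dtype=float) + 1e-12  # avoid a zero sum
        probs = g / g.sum()
        idx = rng.choice(len(g), size=batch_size, replace=True, p=probs)
        # Reweight each sampled term by 1 / (N * p_i) so that averaging
        # the reweighted per-node gradients over the mini-batch gives an
        # unbiased estimate of the full-batch gradient.
        weights = 1.0 / (len(g) * probs[idx])
        return idx, weights

    # Usage sketch: average weights[j] * per_node_gradient(idx[j]) over the
    # mini-batch to form the unbiased gradient estimate.
    idx, weights = minimal_variance_sample(grad_norms, batch_size=256)

With uniform sampling the weights all reduce to 1 and this degenerates to the standard mini-batch estimator; the gradient-proportional distribution is what drives the variance reduction.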

Related research

03/03/2021  On the Importance of Sampling in Learning Graph Convolutional Networks
Graph Convolutional Networks (GCNs) have achieved impressive empirical a...

06/10/2020  Bandit Samplers for Training Graph Neural Networks
Several sampling algorithms with variance reduction have been proposed f...

12/18/2022  Influence-Based Mini-Batching for Graph Neural Networks
Using graph neural networks for large graphs is challenging since there ...

02/02/2023  LMC: Fast Training of GNNs via Subgraph Sampling with Provable Convergence
The message passing-based graph neural networks (GNNs) have achieved gre...

12/07/2020  Learning Graph Neural Networks with Approximate Gradient Descent
The first provably efficient algorithm for learning graph neural network...

04/22/2022  Distributed stochastic projection-free solver for constrained optimization
This paper proposes a distributed stochastic projection-free algorithm f...

08/07/2018  Fast Variance Reduction Method with Stochastic Batch Size
In this paper we study a family of variance reduction methods with rando...
