Adversarial Weight Perturbation Improves Generalization in Graph Neural Networks

12/09/2022
by Yihan Wu, et al.

Extensive theoretical and empirical evidence shows that flatter local minima tend to improve generalization. Adversarial Weight Perturbation (AWP) is an emerging technique for finding such minima efficiently and effectively. In AWP we minimize the loss w.r.t. a bounded worst-case perturbation of the model parameters, thereby favoring local minima with a small loss in a neighborhood around them. The benefits of AWP, and more generally the connections between flatness and generalization, have been extensively studied for i.i.d. data such as images. In this paper, we study this phenomenon for graph data. Along the way, we first derive a generalization bound for non-i.i.d. node classification tasks. Then we identify a vanishing-gradient issue with all existing formulations of AWP and propose a new Weighted Truncated AWP (WT-AWP) to alleviate it. We show that regularizing graph neural networks with WT-AWP consistently improves both natural and robust generalization across many different graph learning tasks and models.
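The abstract does not spell out the update rule, but the generic AWP recipe it describes (minimize the loss under a bounded worst-case weight perturbation) can be sketched in a few lines. Below is a minimal single-step sketch in PyTorch; the radius rho, the single gradient-ascent step, the scaling by the weight norm, and the function signature are illustrative assumptions, not the paper's exact WT-AWP method.

import torch

def awp_step(model, loss_fn, data, target, optimizer, rho=5e-3):
    # 1) Clean forward/backward pass to get the ascent direction.
    optimizer.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()

    # 2) Perturb each parameter toward the worst case, with the
    #    perturbation norm bounded relative to the weight norm
    #    (an assumed, SAM-style normalization).
    perturbations = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                perturbations.append(None)
                continue
            eps = rho * p.grad / (p.grad.norm() + 1e-12) * p.norm()
            p.add_(eps)
            perturbations.append(eps)

    # 3) Loss at the perturbed weights; its gradient drives the update.
    optimizer.zero_grad()
    perturbed_loss = loss_fn(model(data), target)
    perturbed_loss.backward()

    # 4) Undo the perturbation before stepping, so the optimizer
    #    applies the worst-case gradient to the original weights.
    with torch.no_grad():
        for p, eps in zip(model.parameters(), perturbations):
            if eps is not None:
                p.sub_(eps)
    optimizer.step()
    return loss.item()

Judging from the name alone, WT-AWP presumably departs from this baseline by weighting the perturbed loss against the natural one and truncating where the perturbation is applied, which is how it would sidestep the vanishing-gradient issue the abstract identifies; the precise formulation is given in the full text.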

Related research

10/28/2021 · CAP: Co-Adversarial Perturbation on Weights and Features for Improving Generalization of Graph Neural Networks
Despite the recent advances of graph neural networks (GNNs) in modeling ...

10/10/2020 · Regularizing Neural Networks via Adversarial Model Perturbation
Recent research has suggested that when training neural networks, flat l...

03/03/2023 · Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Recently, flat minima are proven to be effective for improving generaliz...

12/16/2021 · δ-SAM: Sharpness-Aware Minimization with Dynamic Reweighting
Deep neural networks are often overparameterized and may not easily achi...

02/09/2023 · Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion
Graph neural networks are widely used tools for graph prediction tasks. ...

07/18/2023 · Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Adaptive gradient-based optimizers, particularly Adam, have left their m...

02/20/2020 · MaxUp: A Simple Way to Improve Generalization of Neural Network Training
We propose MaxUp, an embarrassingly simple, highly effective technique f...
