δ-SAM: Sharpness-Aware Minimization with Dynamic Reweighting

12/16/2021
by Wenxuan Zhou, et al.

Deep neural networks are often overparameterized and may not generalize easily. Adversarial training has been shown to improve generalization by regularizing the change in loss under adversarially chosen perturbations. The recently proposed sharpness-aware minimization (SAM) algorithm applies adversarial perturbation to the weights, encouraging the model to converge to a flat minimum. Unfortunately, because of the added computational cost, adversarial weight perturbation can only be approximated efficiently per batch rather than per instance, which degrades performance. In this paper, we propose that dynamically reweighted perturbation within each batch, where unguarded instances are up-weighted, can serve as a better approximation to per-instance perturbation. We propose sharpness-aware minimization with dynamic reweighting (δ-SAM), which realizes this idea with efficient guardedness estimation. Experiments on the GLUE benchmark demonstrate the effectiveness of δ-SAM.
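
To make the per-batch versus reweighted distinction concrete, here is a minimal NumPy sketch on a toy logistic-regression model. It is not the authors' implementation. The abstract does not specify how guardedness is estimated, so `placeholder_weights` is a hypothetical stand-in that up-weights instances with larger per-instance gradient norms. The names `sam_perturbation`, `sam_step`, `rho`, `lr`, and `temperature` are likewise assumptions made for illustration.

```python
import numpy as np

def per_instance_loss_and_grad(w, X, y):
    """Logistic loss and its gradient for each instance (one row of X each)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    tiny = 1e-12
    loss = -(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny))
    grad = (p - y)[:, None] * X  # shape: (batch, dim)
    return loss, grad

def placeholder_weights(w, X, y, temperature=1.0):
    """ASSUMPTION: a stand-in for guardedness estimation, not the paper's method.
    Softmax over per-instance gradient norms, so instances whose loss is more
    sensitive to weight changes receive larger weight."""
    _, grads = per_instance_loss_and_grad(w, X, y)
    scores = np.linalg.norm(grads, axis=1) / temperature
    scores = scores - scores.max()  # numerical stability
    e = np.exp(scores)
    return e / e.sum()

def sam_perturbation(w, X, y, rho=0.05, weights=None):
    """Weight perturbation rho * g / ||g||, with g the (weighted) batch gradient.
    weights=None gives the uniform, per-batch perturbation used by plain SAM."""
    _, grads = per_instance_loss_and_grad(w, X, y)
    if weights is None:
        weights = np.full(len(y), 1.0 / len(y))
    g = weights @ grads
    return rho * g / (np.linalg.norm(g) + 1e-12)

def sam_step(w, X, y, lr=0.1, rho=0.05, reweight=False):
    """One update: perturb the weights, then descend using the gradient
    evaluated at the perturbed weights."""
    weights = placeholder_weights(w, X, y) if reweight else None
    eps = sam_perturbation(w, X, y, rho=rho, weights=weights)
    _, grads = per_instance_loss_and_grad(w + eps, X, y)
    return w - lr * grads.mean(axis=0)
```

Calling `sam_step(w, X, y, reweight=False)` uses a single perturbation computed from the uniform batch gradient, while `reweight=True` tilts that perturbation toward the instances the placeholder score marks as more sensitive. Both variants need the same number of gradient evaluations per step in this sketch.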

research
05/30/2022

Robust Weight Perturbation for Adversarial Training

Overfitting widely exists in adversarial robust training of deep network...
research
11/21/2022

Efficient Generalization Improvement Guided by Random Weight Perturbation

To fully uncover the great potential of deep neural networks (DNNs), var...
research
10/13/2022

GA-SAM: Gradient-Strength based Adaptive Sharpness-Aware Minimization for Improved Generalization

Recently, Sharpness-Aware Minimization (SAM) algorithm has shown state-o...
research
04/28/2023

An Adaptive Policy to Employ Sharpness-Aware Minimization

Sharpness-aware minimization (SAM), which searches for flat minima by mi...
research
10/11/2022

Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach

Deep neural networks often suffer from poor generalization caused by com...
research
12/09/2022

Adversarial Weight Perturbation Improves Generalization in Graph Neural Network

A lot of theoretical and empirical evidence shows that the flatter local...
research
10/07/2021

Efficient Sharpness-aware Minimization for Improved Training of Neural Networks

Overparametrized Deep Neural Networks (DNNs) often achieve astounding pe...
