Improving Generalization via Uncertainty Driven Perturbations

02/11/2022
by Matteo Pagliardini, et al.

Recently, Shah et al. (2020) pointed out the pitfalls of the simplicity bias (the tendency of gradient-based algorithms to learn simple models), which include the model's high sensitivity to small input perturbations as well as sub-optimal margins. In particular, while Stochastic Gradient Descent yields a max-margin boundary on linear models, no such guarantee extends to non-linear models. To mitigate the simplicity bias, we consider uncertainty-driven perturbations (UDP) of the training data points, obtained iteratively by following the direction that maximizes the model's estimated uncertainty. Unlike loss-driven perturbations, uncertainty-driven perturbations do not cross the decision boundary, which allows a larger range of values for the hyperparameter controlling the magnitude of the perturbation. Moreover, since real-world datasets have non-isotropic distances between data points of different classes, this property is particularly appealing for increasing the margin of the decision boundary, which in turn improves the model's generalization. We show that UDP is guaranteed to achieve the maximum-margin decision boundary on linear models and that it notably increases the margin on challenging simulated datasets. Interestingly, it also achieves a competitive trade-off between loss-based robustness and generalization on several datasets.
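The abstract leaves the exact procedure to the paper; the following is a minimal PyTorch-style sketch of the idea described above, assuming predictive entropy as the uncertainty estimate and an iterative signed-gradient ascent step constrained to an epsilon-ball. The function name udp_perturb, the hyperparameters, and the choice of entropy are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def udp_perturb(model, x, step_size=0.01, num_steps=5, epsilon=0.1):
    # Sketch of uncertainty-driven perturbations (UDP): iteratively move each
    # training point in the direction that increases the model's estimated
    # uncertainty, here measured by the entropy of the softmax output.
    x_adv = x.clone().detach()
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        probs = F.softmax(logits, dim=-1)
        # Predictive entropy as the uncertainty estimate (an assumption).
        entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1).mean()
        grad, = torch.autograd.grad(entropy, x_adv)
        with torch.no_grad():
            # Ascend the uncertainty; unlike loss-driven perturbations, this
            # tends to stop near the decision boundary rather than cross it.
            x_adv = x_adv + step_size * grad.sign()
            # Keep the total perturbation within an epsilon-ball around x.
            x_adv = x + torch.clamp(x_adv - x, -epsilon, epsilon)
        x_adv = x_adv.detach()
    return x_adv

In such a setup, the perturbed points would then be fed to the usual training loss in place of (or alongside) the clean points, analogously to loss-driven (adversarial) training.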


Related research

10/26/2021 · Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
The generalization mystery of overparametrized deep nets has motivated e...

07/15/2021 · Adversarial Attack for Uncertainty Estimation: Identifying Critical Regions in Neural Networks
We propose a novel method to capture data points near decision boundary ...

06/13/2020 · The Pitfalls of Simplicity Bias in Neural Networks
Several works have proposed Simplicity Bias (SB)—the tendency of standar...

10/14/2021 · Towards Understanding the Data Dependency of Mixup-style Training
In the Mixup training paradigm, a model is trained using convex combinat...

02/19/2023 · Stationary Point Losses for Robust Model
The inability to guarantee robustness is one of the major obstacles to t...

05/11/2019 · Linear Range in Gradient Descent
This paper defines linear range as the range of parameter perturbations ...
