Predicting Out-of-Domain Generalization with Local Manifold Smoothness

07/05/2022
by   Nathan Ng, et al.
0

Understanding how machine learning models generalize to new environments is a critical part of their safe deployment. Recent work has proposed a variety of complexity measures that directly predict or theoretically bound the generalization capacity of a model. However, these methods rely on a strong set of assumptions that in practice are not always satisfied. Motivated by the limited settings in which existing measures can be applied, we propose a novel complexity measure based on the local manifold smoothness of a classifier. We define local manifold smoothness as a classifier's output sensitivity to perturbations in the manifold neighborhood around a given test point. Intuitively, a classifier that is less sensitive to these perturbations should generalize better. To estimate smoothness we sample points using data augmentation and measure the fraction of these points classified into the majority class. Our method only requires selecting a data augmentation method and makes no other assumptions about the model or data distributions, meaning it can be applied even in out-of-domain (OOD) settings where existing methods cannot. In experiments on robustness benchmarks in image classification, sentiment analysis, and natural language inference, we demonstrate a strong and robust correlation between our manifold smoothness measure and actual OOD generalization on over 3,000 models evaluated on over 100 train/test domain pairs.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/21/2020

SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness

Models that perform well on a training domain often fail to generalize t...
research
02/23/2018

Sensitivity and Generalization in Neural Networks: an Empirical Study

In practice it is often found that large over-parameterized neural netwo...
research
08/18/2021

Semantic Perturbations with Normalizing Flows for Improved Generalization

Data augmentation is a widely adopted technique for avoiding overfitting...
research
03/03/2019

Classification via local manifold approximation

Classifiers label data as belonging to one of a set of groups based on i...
research
12/04/2020

Kernel-convoluted Deep Neural Networks with Data Augmentation

The Mixup method (Zhang et al. 2018), which uses linearly interpolated d...
research
06/12/2023

Underwater Acoustic Target Recognition based on Smoothness-inducing Regularization and Spectrogram-based Data Augmentation

Underwater acoustic target recognition is a challenging task owing to th...
research
06/19/2023

Understanding Generalization in the Interpolation Regime using the Rate Function

In this paper, we present a novel characterization of the smoothness of ...

Please sign up or login with your details

Forgot password? Click here to reset