Max-Margin Works while Large Margin Fails: Generalization without Uniform Convergence

06/16/2022
by   Margalit Glasgow, et al.

A major challenge in modern machine learning is theoretically understanding the generalization properties of overparameterized models. Many existing tools rely on uniform convergence (UC), a property that, when it holds, guarantees that the test loss is close to the training loss uniformly over a class of candidate models. Nagarajan and Kolter (2019) show that in certain simple linear and neural-network settings, any uniform convergence bound will be vacuous, leaving open the question of how to prove generalization in settings where UC fails.

Our main contribution is proving novel generalization bounds in two such settings, one linear and one non-linear: the linear classification setting of Nagarajan and Kolter, and a quadratic ground-truth function learned via a two-layer neural network. We prove a new type of margin bound showing that, above a certain signal-to-noise threshold, any near-max-margin classifier achieves almost no test loss in these two settings.

Our results show that being near-max-margin is essential: while any model that achieves at least a (1 − ϵ)-fraction of the max margin generalizes well, a classifier achieving only half of the max margin may fail terribly. We additionally strengthen the UC impossibility results of Nagarajan and Kolter, proving that one-sided UC bounds and classical margin bounds also fail on near-max-margin classifiers.

Our analysis provides insight into why memorization can coexist with generalization: in this challenging regime where generalization occurs but UC fails, near-max-margin classifiers simultaneously contain some generalizable components and some overfitting components that memorize the data. The presence of the overfitting components is enough to preclude UC, but the near-extremal margin guarantees that sufficiently many generalizable components are present.
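For concreteness, the margin quantities in the abstract can be formalized as follows (this notation is our own gloss, not necessarily the paper's). For a linear classifier w on training examples (x_1, y_1), …, (x_n, y_n) with labels y_i ∈ {−1, +1}, define

    % margin of w: worst-case normalized functional margin over the sample
    \gamma(w) \;=\; \min_{i \in [n]} \frac{y_i \,\langle w, x_i \rangle}{\lVert w \rVert},
    \qquad
    % max margin: the best achievable margin on this sample
    \gamma^{\star} \;=\; \max_{w \neq 0} \gamma(w).

A classifier w is then "near-max-margin" when \gamma(w) \ge (1 - \epsilon)\,\gamma^{\star} for small \epsilon; the cautionary example in the abstract is a classifier with \gamma(w) = \gamma^{\star}/2, which may generalize poorly even though every near-max-margin classifier generalizes well.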


