Uniform convergence may be unable to explain generalization in deep learning

02/13/2019
by Vaishnavh Nagarajan, et al.

We cast doubt on the power of uniform convergence-based generalization bounds to provide a complete picture of why overparameterized deep networks generalize well. While it is well known that many existing bounds are numerically large, through a variety of experiments we first bring to light another crucial and more concerning aspect of these bounds: in practice, they can increase with the size of the training dataset. Guided by our observations, we then show how uniform convergence could provably break down even in a simple setup that preserves the key elements of deep learning: we present a noisy algorithm that learns a mildly overparameterized linear classifier such that uniform convergence cannot "explain generalization," even if we take into account implicit regularization to the fullest extent possible. More precisely, even if we consider only the set of classifiers output by the algorithm that have test errors less than some small ϵ, applying (two-sided) uniform convergence to this set of classifiers yields a generalization guarantee that is larger than 1−ϵ and is therefore nearly vacuous.
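For concreteness, the kind of statement at issue can be written in standard learning-theory notation; the rendering below is ours, assuming a bounded loss, a data distribution D, and a training set S of m i.i.d. samples. A two-sided uniform convergence bound for a hypothesis set H is any ϵ_unif(m, δ) satisfying:

```latex
% Two-sided uniform convergence: with probability at least 1 - \delta over
% the draw of the training set S, the population loss L_D(h) and the
% empirical loss \hat{L}_S(h) agree up to \epsilon_unif for every h in H.
\Pr_{S \sim \mathcal{D}^m}\!\left[\,
  \sup_{h \in \mathcal{H}}
  \big|\mathcal{L}_{\mathcal{D}}(h) - \hat{\mathcal{L}}_{S}(h)\big|
  \;\le\; \epsilon_{\mathrm{unif}}(m,\delta)
\right] \;\ge\; 1 - \delta
```

The claim above is that even after pruning H down to the classifiers the algorithm actually outputs on typical datasets, all of which have test error at most ϵ, the smallest valid ϵ_unif remains close to 1, so the implied guarantee on the test error exceeds 1−ϵ.

The mechanism can be seen in a toy version of the paper's linear setup. The numpy sketch below is our simplification, not the paper's exact construction; the dimensions, constants, and "memorizing" learning rule are illustrative assumptions. A learner fits a one-dimensional signal but also memorizes high-dimensional noise: it achieves zero training error and low test error, yet negating the noise coordinates of the training set produces an equally likely dataset S' on which the learned classifier is wrong everywhere.

```python
import numpy as np

rng = np.random.default_rng(0)
n, D = 20, 4000           # n training points, D noise dimensions, D >> n
sigma = np.sqrt(8.0 / D)  # per-example noise norm^2 ~ 8 > signal margin of 2

def sample(m):
    """Draw m labeled points: a 1-D signal coordinate plus D-dim Gaussian noise."""
    y = rng.choice([-1.0, 1.0], size=m)
    signal = 2.0 * y
    noise = sigma * rng.standard_normal((m, D))
    return signal, noise, y

sig, noi, y = sample(n)

# A learner that uses the signal but also memorizes the noise directions:
# its noise weights are the label-weighted sum of the training noise vectors.
w_noise = (y[:, None] * noi).sum(axis=0)

def predict(signal, noise):
    return np.sign(signal + noise @ w_noise)

sig_t, noi_t, y_t = sample(1000)
print("train error:         ", np.mean(predict(sig, noi) != y))        # ~0.0
print("test error:          ", np.mean(predict(sig_t, noi_t) != y_t))  # small
# Negating the noise gives a dataset S' that is exactly as likely a draw
# from the distribution as S, yet the learned classifier errs on all of it.
print("error on mirrored S':", np.mean(predict(sig, -noi) != y))       # ~1.0
```

By symmetry, the classifier learned from the mirrored dataset S' has test error at most ϵ but empirical error near 1 on S. Any two-sided uniform bound over the set of low-test-error outputs must account for that classifier, which forces the bound to be nearly vacuous.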

Related research

10/17/2021 · Explaining generalization in deep learning: progress and fundamental limits
    This dissertation studies a fundamental open challenge in deep learning ...

11/24/2022 · PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization
    While there has been progress in developing non-vacuous generalization b...

03/08/2021 · Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models
    Recent work showed that there could be a large gap between the classical...

05/07/2021 · Uniform Convergence, Adversarial Spheres and a Simple Remedy
    Previous work has cast doubt on the general framework of uniform converg...

06/10/2020 · On Uniform Convergence and Low-Norm Interpolation Learning
    We consider an underdetermined noisy linear regression model where the m...

09/26/2013 · Bennett-type Generalization Bounds: Large-deviation Case and Faster Rate of Convergence
    In this paper, we present the Bennett-type generalization bounds of the ...

02/01/2021 · Painless step size adaptation for SGD
    Convergence and generalization are two crucial aspects of performance in...
