PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization

11/24/2022
by Sanae Lotfi et al.

While there has been progress in developing non-vacuous generalization bounds for deep neural networks, these bounds tend to be uninformative about why deep learning works. In this paper, we develop a compression approach based on quantizing neural network parameters in a linear subspace, profoundly improving on previous results to provide state-of-the-art generalization bounds on a variety of tasks, including transfer learning. We use these tight bounds to better understand the role of model size, equivariance, and the implicit biases of optimization for generalization in deep learning. Notably, we find that large models can be compressed to a much greater extent than previously known, encapsulating Occam's razor. We also argue for data-independent bounds in explaining generalization.
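
To make the "quantizing neural network parameters in a linear subspace" idea concrete, the following is a minimal NumPy sketch under stated assumptions; it is not the authors' implementation, and the function name quantize_in_subspace, the subspace dimension d, and the number of quantization levels are illustrative choices. It projects a trained parameter vector onto a random low-dimensional subspace, quantizes the subspace coordinates with a small codebook, and reports a naive code length in bits.

```python
import numpy as np

def quantize_in_subspace(theta, d=128, levels=8, seed=0):
    """Compress a parameter vector by (1) finding its best approximation
    inside a random d-dimensional linear subspace and (2) quantizing the
    d subspace coordinates with a small uniform codebook."""
    rng = np.random.default_rng(seed)
    D = theta.size

    # Random projection defining the subspace: the columns of P span it.
    P = rng.standard_normal((D, d)) / np.sqrt(D)

    # Least-squares coordinates w such that P @ w approximates theta.
    w, *_ = np.linalg.lstsq(P, theta, rcond=None)

    # Uniform quantization of the coordinates onto `levels` grid points.
    grid = np.linspace(w.min(), w.max(), levels)
    idx = np.abs(w[:, None] - grid[None, :]).argmin(axis=1)
    w_q = grid[idx]

    # Naive code length: one codebook index per coordinate plus the
    # codebook itself at float32 precision (no entropy coding).
    bits = d * int(np.ceil(np.log2(levels))) + levels * 32

    return P @ w_q, bits

# Toy usage on a random "trained" parameter vector.
theta = np.random.default_rng(1).standard_normal(10_000)
theta_hat, bits = quantize_in_subspace(theta)
rel_err = np.linalg.norm(theta - theta_hat) / np.linalg.norm(theta)
print(f"~{bits} bits, relative reconstruction error {rel_err:.3f}")
```

The shorter the resulting code length, the tighter a prefix-code style PAC-Bayes (Occam) bound becomes, which is how compressibility translates into a non-vacuous generalization guarantee; for a well-trained network, unlike the random vector in this toy example, the subspace reconstruction can remain accurate even at small d.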

Related research

08/08/2022 · On Rademacher Complexity-based Generalization Bounds for Deep Learning
In this paper, we develop some novel bounds for the Rademacher complexit...

09/25/2019 · Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network
One of the biggest issues in deep learning theory is its generalization abil...

06/15/2021 · Compression Implies Generalization
Explaining the surprising generalization performance of deep neural netw...

04/15/2018 · Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds
The deployment of state-of-the-art neural networks containing millions o...

04/16/2018 · Compressibility and Generalization in Large-Scale Deep Learning
Modern neural networks are highly overparameterized, with capacity to su...

02/13/2019 · Uniform convergence may be unable to explain generalization in deep learning
We cast doubt on the power of uniform convergence-based generalization b...

02/20/2018 · Do Deep Learning Models Have Too Many Parameters? An Information Theory Viewpoint
Deep learning models often have more parameters than observations, and s...
