Compression Implies Generalization

06/15/2021
by Allan Grønlund, et al.

Explaining the surprising generalization performance of deep neural networks is an active and important line of research in theoretical machine learning. Influential work by Arora et al. (ICML'18) showed that noise stability properties of deep nets observed in practice can be used to provably compress model representations. They then argued that the small representations of compressed networks imply good generalization performance, albeit only for the compressed networks themselves. Extending their compression framework to yield generalization bounds for the original, uncompressed networks remains elusive. Our main contribution is the establishment of a compression-based framework for proving generalization bounds. The framework is simple and powerful enough to extend the generalization bounds of Arora et al. so that they also hold for the original network. To demonstrate the flexibility of the framework, we also show that it yields simple proofs of the strongest known generalization bounds for other popular machine learning models, namely Support Vector Machines and Boosting.
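To make the compression argument concrete, here is a minimal sketch of the generic bound behind the Arora et al. framework; the notation is ours and the constants are schematic, not a verbatim statement from either paper. Suppose a trained classifier f, with empirical margin loss \hat{L}_\gamma(f) on m training samples, can be compressed, while approximately preserving its margins, to a classifier g_A described by q parameters, each taking at most r discrete values. A standard covering/union-bound argument over the at most r^q compressed classifiers then gives, with high probability,

    L_0(g_A) \le \hat{L}_\gamma(f) + \tilde{O}\!\left(\sqrt{\frac{q \log r}{m}}\right),

where L_0 denotes the population classification error. Note that this bound applies only to the compressed classifier g_A; the framework described in the abstract is precisely what is needed to transfer a bound of this form back to the original classifier f.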


Related research

02/14/2018 · Stronger generalization bounds for deep nets via a compression approach
Deep nets generalize well despite having more parameters than the number...

11/24/2022 · PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization
While there has been progress in developing non-vacuous generalization b...

09/25/2019 · Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network
One of the biggest issues in deep learning theory is its generalization abil...

04/16/2018 · Compressibility and Generalization in Large-Scale Deep Learning
Modern neural networks are highly overparameterized, with capacity to su...

11/10/2020 · Margins are Insufficient for Explaining Gradient Boosting
Boosting is one of the most successful ideas in machine learning, achiev...

02/24/2021 · On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)
It is generally recognized that finite learning rate (LR), in contrast t...

04/12/2021 · Generalization bounds via distillation
This paper theoretically investigates the following empirical phenomenon...
