Rethinking Parameter Counting in Deep Models: Effective Dimensionality Revisited

03/04/2020
by Wesley J. Maddox, et al.

Neural networks appear to have mysterious generalization properties when parameter counting is used as a proxy for complexity. Indeed, neural networks often have many more parameters than there are data points, yet still generalize well. Moreover, when we measure generalization as a function of the number of parameters, we see double descent behaviour, in which the test error decreases, increases, and then decreases again. We show that many of these properties become understandable when viewed through the lens of effective dimensionality, which measures the dimensionality of the parameter space determined by the data. We relate effective dimensionality to posterior contraction in Bayesian deep learning, model selection, double descent, and functional diversity in loss surfaces, leading to a richer understanding of the interplay between parameters and functions in deep models.
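To make the contrast with parameter counting concrete, here is a minimal sketch of one common way to compute effective dimensionality. The abstract does not state a formula, so this assumes the standard eigenvalue-based definition, N_eff(H, z) = sum_i lambda_i / (lambda_i + z), where lambda_i are the eigenvalues of the Hessian of the loss and z > 0 is a regularization constant. The function name and the eigenvalues below are hypothetical and chosen only for illustration.

```python
def effective_dimensionality(eigenvalues, z=1.0):
    """Sum of lambda / (lambda + z) over Hessian eigenvalues.

    Assumed definition (not stated in the abstract): each direction in
    parameter space contributes close to 1 when its curvature lambda is
    much larger than z, and close to 0 when lambda is much smaller.
    """
    if z <= 0:
        raise ValueError("z must be positive")
    if any(lam < 0 for lam in eigenvalues):
        raise ValueError("eigenvalues are assumed to be non-negative")
    return sum(lam / (lam + z) for lam in eigenvalues)


# Hypothetical spectrum for a 5-parameter model: two directions are
# strongly determined by the data, three are nearly or exactly flat.
hypothetical_eigenvalues = [100.0, 10.0, 0.01, 0.001, 0.0]

n_params = len(hypothetical_eigenvalues)
n_eff = effective_dimensionality(hypothetical_eigenvalues, z=1.0)
print(f"parameters: {n_params}, effective dimensionality: {n_eff:.2f}")
```

Under these assumed values the model has 5 parameters but an effective dimensionality a little below 2, since only two directions have curvature well above z. Taking z very small instead counts every direction with nonzero curvature.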


