What training reveals about neural network complexity

06/08/2021
by Andreas Loukas, et al.

This work explores the hypothesis that the complexity of the function a deep neural network (NN) is learning can be deduced from how fast its weights change during training. Our analysis provides evidence for this supposition by relating the network's distribution of Lipschitz constants (i.e., the norm of the network's gradient with respect to its input at different regions of the input space) during different training intervals to the behavior of the stochastic training procedure. We first observe that the average Lipschitz constant close to the training data affects various aspects of the parameter trajectory: more complex networks have a longer trajectory, larger variance, and often veer further from their initialization. We then show that NNs whose biases are trained more steadily have bounded complexity even in regions of the input space that are far from any training point. Finally, we find that steady training with Dropout implies a training- and data-dependent generalization bound that grows poly-logarithmically with the number of parameters. Overall, our results support the hypothesis that good training behavior can be a useful bias towards good generalization.
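The abstract relates two sets of quantities: the network's local Lipschitz constant near the training data (the norm of its gradient with respect to the input) and statistics of the weight trajectory under stochastic training (its length, variance, and distance from initialization). The sketch below, which is not the authors' code, shows one way such quantities could be measured for a toy PyTorch model; the two-layer network, the random data, the gradient of the top logit as a pointwise Lipschitz proxy, and all hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data and a small fully connected classifier (assumed, for illustration).
X, y = torch.randn(256, 20), torch.randint(0, 2, (256,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def flat_params(m):
    # Concatenate all parameters into one vector (torch.cat copies the data).
    return torch.cat([p.detach().flatten() for p in m.parameters()])

def local_lipschitz(m, x):
    # Norm of the input gradient of the top logit at each point x:
    # a pointwise proxy for the Lipschitz constant near the data.
    x = x.clone().requires_grad_(True)
    top_logit = m(x).max(dim=1).values.sum()
    (grad,) = torch.autograd.grad(top_logit, x)
    return grad.norm(dim=1)

init = flat_params(model)
prev = init.clone()
traj_length, snapshots = 0.0, []

for step in range(200):
    # One SGD step on a random mini-batch.
    idx = torch.randint(0, X.shape[0], (32,))
    opt.zero_grad()
    loss_fn(model(X[idx]), y[idx]).backward()
    opt.step()

    # Track the weight trajectory: path length and per-step snapshots.
    cur = flat_params(model)
    traj_length += (cur - prev).norm().item()
    snapshots.append(cur)
    prev = cur

dist_from_init = (flat_params(model) - init).norm().item()
traj_variance = torch.stack(snapshots).var(dim=0).sum().item()
avg_lip = local_lipschitz(model, X).mean().item()

print(f"avg local Lipschitz constant near the data: {avg_lip:.3f}")
print(f"trajectory length: {traj_length:.3f}, variance: {traj_variance:.3f}, "
      f"distance from initialization: {dist_from_init:.3f}")

Under the paper's hypothesis, runs in which the trajectory is longer, more variable, and ends further from initialization would tend to coincide with larger Lipschitz constants near the training data.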


Related research

06/09/2022  Trajectory-dependent Generalization Bounds for Deep Neural Networks via Fractional Brownian Motion
01/28/2023  On the Lipschitz Constant of Deep Networks and Double Descent
05/21/2018  Measuring and regularizing networks in function space
01/31/2019  Effect of Various Regularizers on Model Complexities of Neural Networks in Presence of Input Noise
12/07/2020  Why Unsupervised Deep Networks Generalize
04/21/2021  MLDS: A Dataset for Weight-Space Analysis of Neural Networks
03/02/2019  High-Dimensional Learning under Approximate Sparsity: A Unifying Framework for Nonsmooth Learning and Regularized Neural Networks
