The Geometric Occam's Razor Implicit in Deep Learning

11/30/2021
by Benoit Dherin, et al.

In over-parameterized deep neural networks there can be many possible parameter configurations that fit the training data exactly. However, the properties of these interpolating solutions are poorly understood. We argue that over-parameterized neural networks trained with stochastic gradient descent are subject to a Geometric Occam's Razor; that is, these networks are implicitly regularized by the geometric model complexity. For one-dimensional regression, the geometric model complexity is simply given by the arc length of the function. For higher-dimensional settings, the geometric model complexity depends on the Dirichlet energy of the function. We explore the relationship between this Geometric Occam's Razor, the Dirichlet energy and other known forms of implicit regularization. Finally, for ResNets trained on CIFAR-10, we observe that Dirichlet energy measurements are consistent with the action of this implicit Geometric Occam's Razor.
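The two complexity measures named above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the standard definitions (arc length of a curve y = f(x) as the integral of sqrt(1 + f'(x)^2), and Dirichlet energy as the mean squared Frobenius norm of the input Jacobian over a set of sample points). The function names `arc_length` and `dirichlet_energy`, the finite-difference step `eps`, and the piecewise-linear approximation are all assumptions made for illustration.

```python
import numpy as np


def arc_length(xs, ys):
    """Length of the piecewise-linear curve through the points (xs[i], ys[i]).

    Approximates the integral of sqrt(1 + f'(x)^2) for samples ys[i] = f(xs[i]).
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # Each segment contributes sqrt(dx^2 + dy^2).
    return float(np.sum(np.hypot(np.diff(xs), np.diff(ys))))


def dirichlet_energy(f, points, eps=1e-5):
    """Mean squared Frobenius norm of the input Jacobian of f over `points`.

    The Jacobian is estimated column by column with central finite differences
    (an illustrative choice; automatic differentiation would also work).
    """
    points = np.asarray(points, dtype=float)
    total = 0.0
    for x in points:
        for j in range(x.shape[0]):
            step = np.zeros_like(x)
            step[j] = eps
            # Column j of the Jacobian: partial derivative w.r.t. input j.
            col = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2 * eps)
            total += float(np.sum(col ** 2))
    return total / len(points)


# Hypothetical usage: a straight segment and a linear map.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
segment_length = arc_length([0.0, 3.0], [0.0, 4.0])
linear_energy = dirichlet_energy(lambda x: A @ x, np.zeros((3, 2)))
```

For a linear map the Jacobian is the matrix itself, so the energy equals its squared Frobenius norm regardless of where it is sampled; a function that oscillates more between data points yields a larger value under either measure.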


