How implicit regularization of Neural Networks affects the learned function – Part I

11/07/2019
by Jakob Heiss, et al.

Today, various forms of neural networks are trained to perform approximation tasks in many fields. However, the solutions obtained are not fully understood. Empirical results suggest that the training favors regularized solutions. These observations motivate us to analyze properties of the solutions found by the gradient descent algorithm frequently employed to perform the training task. As a starting point, we consider one-dimensional (shallow) neural networks in which the weights are chosen randomly and only the terminal layer is trained. We show that the resulting solution converges to the smooth spline interpolation of the training data as the number of hidden nodes tends to infinity. This might give valuable insight into the properties of the solutions obtained using gradient descent methods in general settings.
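
To make the setting above concrete, here is a minimal numerical sketch (not the authors' code): the hidden weights and biases of a shallow ReLU network are drawn once at random and frozen, only the output layer is trained by full-batch gradient descent from a zero initialization, and the resulting function is compared with a natural cubic spline through the same points. The toy data, weight distributions, step size, and iteration count below are illustrative assumptions, chosen only to make the example run.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Sketch of the setting: f(x) = sum_k c_k * relu(w_k * x + b_k), where the
# hidden weights (w_k, b_k) are sampled once and frozen, and only the output
# weights c_k are trained by full-batch gradient descent from a zero start.

rng = np.random.default_rng(0)

x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # toy 1-D training inputs
y_train = np.array([0.5, -0.3, 0.2, 0.8, -0.1])   # toy training targets

n_hidden = 2_000                        # "wide" hidden layer
w = rng.choice([-1.0, 1.0], n_hidden)   # frozen first-layer weights (assumed distribution)
b = rng.uniform(-3.0, 3.0, n_hidden)    # frozen first-layer biases (assumed distribution)

def features(x):
    """Hidden-layer activations relu(w * x + b), one row per input point."""
    return np.maximum(np.outer(x, w) + b, 0.0)

Phi = features(x_train)      # random-feature design matrix (n_points x n_hidden)
c = np.zeros(n_hidden)       # trainable output weights, zero-initialized

lr = 0.05 / n_hidden         # small step size, scaled with the width
for _ in range(40_000):      # gradient descent on 0.5 * ||Phi c - y||^2
    c -= lr * Phi.T @ (Phi @ c - y_train)

# Evaluate the trained network on a grid and compare it with a natural cubic
# spline through the same points (a common notion of smooth spline
# interpolation; the paper's exact limiting spline depends on its assumptions).
x_grid = np.linspace(-2.5, 2.5, 201)
f_network = features(x_grid) @ c
f_spline = CubicSpline(x_train, y_train, bc_type="natural")(x_grid)
print(np.max(np.abs(f_network - f_spline)))   # rough numerical comparison
```

For large hidden widths the trained random-feature network traces a smooth interpolant of the training points; how closely it matches a particular spline depends on the distribution of the random weights, which the paper makes precise.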


research · 05/13/2021
The Dynamics of Gradient Descent for Overparametrized Neural Networks
We consider the dynamics of gradient descent (GD) in overparameterized s...

research · 04/14/2022
Convergence and Implicit Regularization Properties of Gradient Descent for Deep Residual Networks
We prove linear convergence of gradient descent to a global minimum for ...

research · 11/30/2021
The Geometric Occam's Razor Implicit in Deep Learning
In over-parameterized deep neural networks there can be many possible pa...

research · 12/09/2019
On the rate of convergence of a neural network regression estimate learned by gradient descent
Nonparametric regression with random design is considered. Estimates are...

research · 07/14/2019
Learning Neural Networks with Adaptive Regularization
Feed-forward neural networks can be understood as a combination of an in...

research · 02/22/2023
Regularised neural networks mimic human insight
Humans sometimes show sudden improvements in task performance that have ...

research · 06/12/2020
Implicit bias of gradient descent for mean squared error regression with wide neural networks
We investigate gradient descent training of wide neural networks and the...
