Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation

03/03/2020
by Arnulf Jentzen, et al.

Despite the accomplishments of deep learning based algorithms in numerous applications and the very broad corresponding research interest, there is at the moment still no rigorous understanding of why such algorithms produce useful results in certain situations. A thorough mathematical analysis of deep learning based algorithms appears to be crucial for improving our understanding and for making their implementation more effective and efficient. In this article we provide a mathematically rigorous full error analysis, in the probabilistically strong sense, of deep learning based empirical risk minimisation with quadratic loss function, where the underlying deep neural networks are trained using stochastic gradient descent with random initialisation. The convergence speed we obtain is presumably far from optimal and suffers from the curse of dimensionality. To the best of our knowledge, we nevertheless establish the first full error analysis in the scientific literature for a deep learning based algorithm in the probabilistically strong sense and, moreover, the first full error analysis in the scientific literature for a deep learning based algorithm in which stochastic gradient descent with random initialisation is the employed optimisation method.
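
To make the setting concrete, the following is a minimal sketch, not the article's algorithm, of deep learning based empirical risk minimisation with quadratic loss: a fully connected ReLU network is trained by plain stochastic gradient descent with random (here uniform) initialisation. All architecture and hyper-parameter choices (layer sizes, learning rate, step count, target function) are illustrative assumptions.

# A minimal sketch (not the article's exact algorithm): empirical risk
# minimisation with quadratic loss for a fully connected ReLU network,
# trained by plain SGD with random (uniform) initialisation.
# All hyper-parameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def init_params(layer_dims):
    # Random initialisation: uniform weights and biases per layer.
    return [(rng.uniform(-1, 1, (m, n)), rng.uniform(-1, 1, n))
            for m, n in zip(layer_dims[:-1], layer_dims[1:])]

def forward(params, x):
    # ReLU network; returns all activations so they can be reused in backprop.
    acts = [x]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(np.maximum(z, 0.0) if i < len(params) - 1 else z)
    return acts

def sgd_step(params, x, y, lr):
    # One SGD step on the quadratic (mean squared) loss.
    acts = forward(params, x)
    delta = 2.0 * (acts[-1] - y) / x.shape[0]      # d(loss)/d(output)
    grads = []
    for i in reversed(range(len(params))):
        W, b = params[i]
        grads.append((acts[i].T @ delta, delta.sum(axis=0)))
        if i > 0:
            delta = (delta @ W.T) * (acts[i] > 0)  # ReLU derivative
    grads.reverse()
    return [(W - lr * gW, b - lr * gb)
            for (W, b), (gW, gb) in zip(params, grads)]

# Toy empirical risk minimisation: learn f(x) = x^2 on [0, 1].
X = rng.uniform(0, 1, (256, 1))
Y = X ** 2
params = init_params([1, 32, 32, 1])
for step in range(2000):
    idx = rng.integers(0, len(X), 16)              # random minibatch
    params = sgd_step(params, X[idx], Y[idx], lr=0.05)
risk = np.mean((forward(params, X)[-1] - Y) ** 2)  # empirical quadratic risk
print(f"empirical risk after training: {risk:.4f}")

The strong error analyses in this line of work typically consider many independent runs of this kind, each with its own random initialisation, and select the realisation with the smallest empirical risk.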


Related research

06/12/2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Deep neural networks have successfully been trained in various applicati...

09/30/2019
Full error analysis for the training of deep neural networks
Deep learning algorithms have been applied very successfully in recent y...

12/15/2020
Strong overall error analysis for the training of artificial neural networks via random initializations
Although deep learning based approximation algorithms have been applied ...

02/28/2019
A block-random algorithm for learning on distributed, heterogeneous data
Most deep learning models are based on deep neural networks with multipl...

01/21/2021
Invariance, encodings, and generalization: learning identity effects with neural networks
Often in language and other areas of cognition, whether two components o...

10/24/2021
A deep learning based surrogate model for stochastic simulators
We propose a deep learning-based surrogate model for stochastic simulato...

07/03/2020
Weak error analysis for stochastic gradient descent optimization algorithms
Stochastic gradient descent (SGD) type optimization schemes are fundamen...
