On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent

08/30/2022
by Selina Drews, et al.

Estimation of a multivariate regression function from independent and identically distributed data is considered. An estimate is defined which fits, via gradient descent, a deep neural network consisting of a large number of fully connected neural networks computed in parallel. The estimate is over-parametrized in the sense that its number of parameters is much larger than the sample size. It is shown that, given a suitable random initialization of the network, a suitably small stepsize of the gradient descent, and a number of gradient descent steps slightly larger than the reciprocal of that stepsize, the estimate is universally consistent: its expected L2 error converges to zero for all distributions of the data for which the response variable is square integrable.
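As a rough illustration of the kind of estimate described above, the following NumPy sketch (an assumption-laden toy, not the paper's exact construction; the architecture, the choice of K parallel networks, the width, and the stepsize are all illustrative) averages K small fully connected networks and fits them by plain gradient descent with a small stepsize and about 1/stepsize steps from a random initialization:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def init_params(K, width, d, scale=1.0):
    # Random initialization of K parallel one-hidden-layer networks
    # (the paper's construction uses deeper subnetworks; this is a sketch).
    W1 = rng.normal(0.0, scale, size=(K, width, d))
    b1 = rng.normal(0.0, scale, size=(K, width))
    w2 = rng.normal(0.0, 1.0 / np.sqrt(width), size=(K, width))
    return W1, b1, w2

def predict(params, X):
    W1, b1, w2 = params
    H = relu(np.einsum('kwd,nd->knw', W1, X) + b1[:, None, :])
    # The estimate averages the outputs of the K parallel networks.
    return np.einsum('knw,kw->n', H, w2) / W1.shape[0]

def fit(X, y, params, stepsize=1e-3):
    # Plain gradient descent on the empirical L2 risk; the number of
    # steps is tied to the reciprocal of the stepsize, mirroring the
    # regime described in the abstract.
    n = X.shape[0]
    W1, b1, w2 = params
    K = W1.shape[0]
    n_steps = int(1.0 / stepsize)
    for _ in range(n_steps):
        pre = np.einsum('kwd,nd->knw', W1, X) + b1[:, None, :]
        H = relu(pre)
        resid = np.einsum('knw,kw->n', H, w2) / K - y
        # Backpropagate through the average of the K networks.
        dpre = np.einsum('kw,n->knw', w2, resid) * (pre > 0) / (K * n)
        g_w2 = np.einsum('knw,n->kw', H, resid) / (K * n)
        g_W1 = np.einsum('knw,nd->kwd', dpre, X)
        g_b1 = dpre.sum(axis=1)
        W1 = W1 - stepsize * g_W1
        b1 = b1 - stepsize * g_b1
        w2 = w2 - stepsize * g_w2
    return W1, b1, w2

# Toy one-dimensional regression problem.
X = rng.uniform(-1.0, 1.0, size=(100, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.normal(size=100)

params0 = init_params(K=50, width=8, d=1)
params1 = fit(X, y, params0)
mse0 = np.mean((predict(params0, X) - y) ** 2)
mse1 = np.mean((predict(params1, X) - y) ** 2)
```

Note that because the gradient of the averaged estimate carries a 1/K factor, each parallel network moves only slightly away from its random initialization during training, which is in the spirit of the over-parametrized regime the abstract describes.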


Related research

10/04/2022: Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent
Estimation of a regression function from independent and identically dis...

12/09/2019: On the rate of convergence of a neural network regression estimate learned by gradient descent
Nonparametric regression with random design is considered. Estimates are...

04/10/2019: Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
The behavior of the gradient descent (GD) algorithm is analyzed for a de...

10/16/2020: A case where a spindly two-layer linear network whips any neural network with a fully connected input layer
It was conjectured that any neural network of any structure and arbitrar...

03/01/2021: Deep Learning with a Classifier System: Initial Results
This article presents the first results from using a learning classifier...

05/24/2019: A Polynomial-Based Approach for Architectural Design and Learning with Deep Neural Networks
In this effort we propose a novel approach for reconstructing multivaria...

05/16/2019: Formal derivation of Mesh Neural Networks with their Forward-Only gradient Propagation
This paper proposes the Mesh Neural Network (MNN), a novel architecture ...
