Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases

02/23/2021
by Arnulf Jentzen, et al.

In recent years, artificial neural networks have developed into a powerful tool for dealing with a multitude of problems for which classical solution approaches reach their limits. However, it is still unclear why randomly initialized gradient descent optimization algorithms, such as the well-known batch gradient descent, are able to achieve zero training loss in many situations even though the objective function is non-convex and non-smooth. One of the most promising approaches to this problem in the field of supervised learning is the analysis of gradient descent optimization in the so-called overparameterized regime. In this article we provide a further contribution to this area of research by considering overparameterized fully-connected rectified (ReLU) artificial neural networks with biases. Specifically, we show that for a fixed number of training data points, the mean squared error of batch gradient descent optimization applied to such a randomly initialized artificial neural network converges to zero at a linear rate, provided that the width of the network is large enough, the learning rate is small enough, and the training input data are pairwise linearly independent.
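To make the setting concrete, here is a minimal NumPy sketch (not the authors' code) of the kind of experiment the abstract describes: full-batch gradient descent on the mean squared error for a randomly initialized one-hidden-layer ReLU network with biases. The 1/sqrt(m) output scaling, the fixed random output signs, and all problem sizes and constants are illustrative assumptions common in this literature, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem size: n training points in R^d, hidden width m >> n (overparameterized).
n, d, m = 20, 5, 2000
lr = 0.1        # learning rate, assumed small enough for this toy problem
steps = 2000

# Generic random inputs are pairwise linearly independent almost surely.
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# Random initialization: hidden weights W, biases b, fixed random output signs a
# (training only the hidden layer is a common simplification in this literature).
W = rng.normal(size=(m, d))
b = rng.normal(size=m)
a = rng.choice([-1.0, 1.0], size=m)

def forward(W, b):
    Z = X @ W.T + b                      # (n, m) pre-activations
    H = np.maximum(Z, 0.0)               # ReLU
    return Z, (H @ a) / np.sqrt(m)       # 1/sqrt(m) scaling, NTK-style

losses = []
for t in range(steps):
    Z, pred = forward(W, b)
    r = pred - y                         # residuals
    losses.append(np.mean(r ** 2))       # the MSE training loss
    # Full-batch gradients of the MSE w.r.t. hidden weights and biases:
    # G[i, k] = (2 / (n * sqrt(m))) * r_i * a_k * 1[z_ik > 0]
    G = (Z > 0) * (r[:, None] * a[None, :]) * (2.0 / (n * np.sqrt(m)))
    W -= lr * (G.T @ X)
    b -= lr * G.sum(axis=0)

print(f"initial MSE: {losses[0]:.3e}, final MSE: {losses[-1]:.3e}")
# With m large and lr small, the MSE decays roughly geometrically,
# i.e. at the linear convergence rate the paper establishes.
```

Plotting `losses` on a log scale shows an approximately straight line, which is the empirical signature of the linear convergence rate; shrinking the width m or enlarging the learning rate breaks this behavior, matching the hypotheses of the result.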


Related research

02/19/2021 · A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
Gradient descent optimization algorithms are the standard ingredients th...

01/07/2021 · A Comprehensive Study on Optimization Strategies for Gradient Descent In Deep Learning
One of the most important parts of Artificial Neural Networks is minimiz...

06/21/2019 · Adaptive Learning Rate Clipping Stabilizes Learning
Artificial neural network training with stochastic gradient descent can ...

11/26/2019 · Emergent Structures and Lifetime Structure Evolution in Artificial Neural Networks
Motivated by the flexibility of biological neural networks whose connect...

07/17/2016 · Piecewise convexity of artificial neural networks
Although artificial neural networks have shown great promise in applicat...

01/27/2023 · Interpreting learning in biological neural networks as zero-order optimization method
Recently, significant progress has been made regarding the statistical u...

02/10/2020 · Pairwise Neural Networks (PairNets) with Low Memory for Fast On-Device Applications
A traditional artificial neural network (ANN) is normally trained slowly...
