Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes

08/05/2019
by Kenji Kawaguchi, et al.

In this paper, we theoretically prove that gradient descent can find a global minimum for nonlinear deep neural networks of sizes commonly encountered in practice. The theory developed in this paper requires only that the number of trainable parameters increase linearly with the number of training samples. This allows the deep neural networks to be several orders of magnitude smaller than those required by previous theories. Moreover, we prove that this linear growth in network size is the optimal rate and cannot be improved, except by a logarithmic factor. Furthermore, deep neural networks with this trainability guarantee are shown to generalize well to unseen test samples on a natural dataset, but not on a random dataset.
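The central claim can be illustrated empirically: size a network so its parameter count grows only linearly with the number of training samples n, and plain gradient descent still drives the training loss down. The sketch below is not the paper's construction; the one-hidden-layer ReLU architecture, the width factor, the learning rate, and the step count are all illustrative assumptions.

```python
import numpy as np

# Hedged sketch: train a one-hidden-layer ReLU network whose parameter
# count scales linearly with n (roughly 4n parameters), and check that
# full-batch gradient descent reduces the training loss. All hyperparameters
# here are arbitrary illustrative choices, not values from the paper.
rng = np.random.default_rng(0)

n, d = 64, 8                       # training samples, input dimension
width = 4 * n // (d + 1)           # params = width*(d+1) ~ 4n: linear in n
X = rng.normal(size=(n, d))
y = rng.normal(size=(n, 1))

W1 = rng.normal(size=(d, width)) / np.sqrt(d)
w2 = rng.normal(size=(width, 1)) / np.sqrt(width)

def forward(X):
    h = np.maximum(X @ W1, 0.0)    # ReLU hidden layer
    return h, h @ w2

_, pred0 = forward(X)
loss0 = np.mean((pred0 - y) ** 2)  # initial mean-squared error

lr = 0.05
for _ in range(500):
    h, pred = forward(X)
    g = 2.0 * (pred - y) / n                         # dLoss/dpred
    grad_w2 = h.T @ g                                # backprop to w2
    grad_W1 = X.T @ ((g @ w2.T) * (h > 0))           # backprop through ReLU
    w2 -= lr * grad_w2
    W1 -= lr * grad_W1

_, pred1 = forward(X)
loss1 = np.mean((pred1 - y) ** 2)
print(loss1 < loss0)
```

With roughly 4n parameters the model is only mildly overparameterized, far from the polynomial-in-n widths demanded by earlier convergence theories, which is the regime the paper argues is sufficient.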


Related research

- Convergence of gradient descent for deep neural networks (03/30/2022)
  Optimization by gradient descent has been one of main drivers of the "de...
- Gradient Descent Finds Global Minima of Deep Neural Networks (11/09/2018)
  Gradient descent finds a global minimum in training deep neural networks...
- Nonparametric Neural Networks (12/14/2017)
  Automatically determining the optimal size of a neural network for a giv...
- Pre-interpolation loss behaviour in neural networks (03/14/2021)
  When training neural networks as classifiers, it is common to observe an...
- Scalable Out-of-Sample Extension of Graph Embeddings Using Deep Neural Networks (08/18/2015)
  Several popular graph embedding techniques for representation learning a...
- Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model (05/22/2023)
  Neural collapse (NC) refers to the surprising structure of the last laye...
- A Solver + Gradient Descent Training Algorithm for Deep Neural Networks (07/07/2022)
  We present a novel hybrid algorithm for training Deep Neural Networks th...
