Almost Sure Convergence of Dropout Algorithms for Neural Networks

02/06/2020
by Albert Senen-Cerda, et al.

We investigate the convergence and convergence rate of the stochastic training algorithms for Neural Networks (NNs) that, over the years, have spawned from Dropout (Hinton et al., 2012). Modeling the idea that neurons in the brain may fail to fire, dropout algorithms in practice multiply the weight matrices of an NN component-wise by independently drawn random matrices with {0,1}-valued entries during each iteration of the Feedforward-Backpropagation algorithm. This paper presents a probability-theoretical proof that, for any NN topology and for differentiable, polynomially bounded activation functions, if we project the NN's weights onto a compact set and use a dropout algorithm, then the weights converge to a unique stationary set of a projected system of Ordinary Differential Equations (ODEs). We also establish an upper bound on the rate of convergence of Gradient Descent (GD) on the limiting ODEs of dropout algorithms for arborescences (a class of trees) of arbitrary depth and with linear activation functions.
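The dropout update described in the abstract can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: the names dropout_gd_step, grad_fn, keep_prob, and radius are hypothetical, and projecting onto a Euclidean ball is only one possible choice of compact set. It shows one stochastic gradient step in which each weight matrix is multiplied component-wise by an independently drawn {0,1}-valued mask before backpropagation, after which the updated weights are projected back onto the compact set.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_gd_step(weights, grad_fn, lr=1e-2, keep_prob=0.8, radius=10.0):
    """One projected gradient step with dropout masks (illustrative sketch).

    weights  : list of weight matrices of the NN
    grad_fn  : callable returning gradients of the loss w.r.t. the masked weights
    keep_prob: probability that an entry of a weight matrix is kept (mask = 1)
    radius   : radius of the Euclidean ball (a compact set) onto which the
               weights are projected after the update
    """
    # Draw independent {0,1}-valued masks and apply them component-wise.
    masks = [rng.binomial(1, keep_prob, size=W.shape) for W in weights]
    masked = [W * M for W, M in zip(weights, masks)]

    # Backpropagation is run through the masked (dropped-out) network.
    grads = grad_fn(masked)

    # Gradient step followed by projection onto the compact set.
    new_weights = []
    for W, M, G in zip(weights, masks, grads):
        W = W - lr * (M * G)          # only the sampled entries receive a gradient
        norm = np.linalg.norm(W)
        if norm > radius:             # Euclidean projection onto the ball
            W = W * (radius / norm)
        new_weights.append(W)
    return new_weights
```

Iterating such projected dropout steps with a suitable step-size schedule is the kind of scheme whose almost-sure convergence to a stationary set of the projected ODE system is analyzed in the paper.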


