Convergence of stochastic gradient descent schemes for Lojasiewicz-landscapes

02/16/2021
by Steffen Dereich, et al.

In this article, we consider convergence of stochastic gradient descent (SGD) schemes under weak assumptions on the underlying landscape. More explicitly, we show that, on the event that the SGD stays local, the SGD converges if there are only countably many critical points or if the target function/landscape satisfies Lojasiewicz inequalities around all critical levels, as all analytic functions do. In particular, we show that for neural networks with an analytic activation function such as softplus, sigmoid, or the hyperbolic tangent, SGD converges on the event of staying local if the random variables modeling the signal and response in the training are compactly supported.
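
As a rough illustration of the kind of scheme the abstract describes, the following is a minimal sketch of SGD on a one-dimensional analytic landscape. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the landscape f(x) = (x^2 - 1)^2 / 4 (a polynomial, hence analytic, with critical points -1, 0 and 1), the step sizes 0.1 / n^0.75, the bounded uniform gradient noise standing in for compactly supported data, and the hypothetical helper names `sgd` and `noisy_grad`.

```python
import random


def noisy_grad(x, rng):
    """Noisy gradient of the assumed landscape f(x) = (x^2 - 1)^2 / 4.

    The exact gradient is f'(x) = x * (x^2 - 1). The additive noise is
    uniform on [-0.5, 0.5], i.e. bounded and centered (an assumption).
    """
    return x * (x * x - 1.0) + rng.uniform(-0.5, 0.5)


def sgd(grad_sample, x0, n_steps, seed=0):
    """Run plain SGD with decaying step sizes gamma_n = 0.1 / n^0.75.

    These step sizes are an illustrative choice: their sum diverges
    while the sum of their squares stays finite.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = x0
    for n in range(1, n_steps + 1):
        gamma = 0.1 / n ** 0.75
        x -= gamma * grad_sample(x, rng)
    return x


if __name__ == "__main__":
    x_final = sgd(noisy_grad, x0=0.5, n_steps=20000)
    print("final iterate:", x_final)
```

With these small step sizes the iterates stay in a bounded region, which loosely mirrors the "staying local" event, and they settle near one of the critical points of the assumed landscape. With larger initial steps the same recursion can leave every bounded region, which is why convergence statements of this kind are conditioned on staying local.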

research
03/26/2021

The convergence of the Stochastic Gradient Descent (SGD) : a self-contained proof

We give here a proof of the convergence of the Stochastic Gradient Desce...
research
11/10/2021

SGD Through the Lens of Kolmogorov Complexity

We prove that stochastic gradient descent (SGD) finds a solution that ac...
research
03/21/2022

ImageNet Challenging Classification with the Raspberry Pi: An Incremental Local Stochastic Gradient Descent Algorithm

With rising powerful, low-cost embedded devices, the edge computing has ...
research
09/06/2021

Stochastic Subgradient Descent on a Generic Definable Function Converges to a Minimizer

It was previously shown by Davis and Drusvyatskiy that every Clarke crit...
research
07/21/2023

Convergence of SGD for Training Neural Networks with Sliced Wasserstein Losses

Optimal Transport has sparked vivid interest in recent years, in particu...
research
03/23/2020

A classification for the performance of online SGD for high-dimensional inference

Stochastic gradient descent (SGD) is a popular algorithm for optimizatio...
research
07/03/2020

Weak error analysis for stochastic gradient descent optimization algorithms

Stochastic gradient descent (SGD) type optimization schemes are fundamen...