Stochastic Runge-Kutta methods and adaptive SGD-G2 stochastic gradient descent

02/20/2020
by Imen Ayadi, et al.

The minimization of the loss function is of paramount importance in deep neural networks. At the same time, many popular optimization algorithms have been shown to correspond to an evolution equation of gradient-flow type. Inspired by the numerical schemes used for general evolution equations, we introduce a second-order stochastic Runge-Kutta method and show that it yields a consistent procedure for minimizing the loss function. In addition, it can be coupled, in an adaptive framework, with Stochastic Gradient Descent (SGD) to automatically adjust the learning rate of the SGD, without the need for any additional information on the Hessian of the loss function. The resulting adaptive SGD, called SGD-G2, is successfully tested on standard datasets.
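To illustrate the idea of applying a second-order Runge-Kutta scheme to the gradient flow dθ/dt = -∇L(θ), here is a minimal Heun-type (explicit two-stage, second-order) step sketch. This is an assumption-laden illustration, not the paper's exact SGD-G2 algorithm: it reuses the same mini-batch gradient for both stages and omits the adaptive learning-rate coupling the abstract describes.

```python
import numpy as np

def heun_sgd_step(params, grad_fn, h):
    """One Heun-type (second-order Runge-Kutta) step on the gradient flow
    d(theta)/dt = -grad L(theta). Illustrative simplification: the same
    (mini-batch) gradient oracle grad_fn is used for both stages."""
    g1 = grad_fn(params)                  # stage 1: gradient at current point
    g2 = grad_fn(params - h * g1)         # stage 2: gradient after an Euler trial step
    return params - 0.5 * h * (g1 + g2)   # average the two slopes

# Toy example: quadratic loss L(theta) = 0.5 * ||theta||^2, so grad L = theta.
grad = lambda t: t
theta = np.array([1.0, -2.0])
for _ in range(50):
    theta = heun_sgd_step(theta, grad, 0.1)
```

On this toy quadratic the iterates contract toward the minimizer at the origin; the second gradient evaluation is what makes the scheme second-order accurate in the step size h.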


