Non Asymptotic Bounds for Optimization via Online Multiplicative Stochastic Gradient Descent

12/14/2021
by Riddhiman Bhattacharya et al.

The gradient noise of Stochastic Gradient Descent (SGD) is considered to play a key role in its properties (e.g., escaping low-potential points and regularization). Past research has indicated that the covariance of the minibatch SGD error plays a critical role in determining both its regularization effect and its escape from low-potential points. How much the distribution of the error, beyond its covariance, influences the behavior of the algorithm remains largely unexplored. Motivated by recent research in this area, we prove universality results showing that noise classes with the same mean and covariance structure as minibatch SGD yield similar properties. We mainly consider the Multiplicative Stochastic Gradient Descent (M-SGD) algorithm introduced by Wu et al., which admits a much more general noise class than minibatch SGD. We establish non-asymptotic bounds for the M-SGD algorithm, mainly with respect to the Stochastic Differential Equation corresponding to minibatch SGD. We also show that, at any fixed point of the M-SGD algorithm, the M-SGD error is approximately a scaled Gaussian distribution with mean 0. Finally, we establish bounds for the convergence of the M-SGD algorithm in the strongly convex regime.
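To make the universality claim concrete, here is a minimal NumPy sketch (not from the paper; the objective, all names, and the variance-matching choice are illustrative assumptions). It writes the generic M-SGD update as a randomly weighted average of per-sample gradients, with weights of mean 1, and compares two noise classes with matched mean and variance: minibatch sampling weights and i.i.d. Gaussian multiplicative weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares objective: f(w) = (1/2n) ||X w - y||^2,
# with per-sample gradients grad_i(w) = (x_i^T w - y_i) x_i.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def per_sample_grads(w):
    # (n, d) array of per-sample gradients.
    return (X @ w - y)[:, None] * X

def msgd_step(w, lr, weights):
    # Generic M-SGD update: weighted average of per-sample gradients,
    # where `weights` has mean 1 across samples.
    g = per_sample_grads(w)
    return w - lr * np.mean(weights[:, None] * g, axis=0)

def minibatch_weights(m):
    # Minibatch SGD as a special case: weight n/m on a random subset
    # of size m and 0 elsewhere (mean 1, variance (n - m)/m).
    u = np.zeros(n)
    u[rng.choice(n, size=m, replace=False)] = n / m
    return u

def gaussian_weights(m):
    # A different noise class with the same mean (1) and a matching
    # variance scale (an assumption for illustration).
    sigma = np.sqrt((n - m) / m)
    return 1.0 + sigma * rng.normal(size=n)

w1 = w2 = np.zeros(d)
for _ in range(500):
    w1 = msgd_step(w1, 0.1, minibatch_weights(20))
    w2 = msgd_step(w2, 0.1, gaussian_weights(20))

# Both noise classes drive the iterates into a neighborhood of w_true.
print(np.linalg.norm(w1 - w_true), np.linalg.norm(w2 - w_true))
```

In this strongly convex toy problem the two runs behave alike, which is the flavor of the universality result: only the first two moments of the multiplicative noise matter for the bounds.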


Related research

- The Multiplicative Noise in Stochastic Gradient Descent: Data-Dependent Regularization, Continuous and Discrete Approximation (06/18/2019)
- Quasi-potential as an implicit regularizer for the loss function in the stochastic gradient descent (01/18/2019)
- Continuous and Discrete-Time Analysis of Stochastic Gradient Descent for Convex and Non-Convex Functions (04/08/2020)
- On the Convergence of Stochastic Gradient Descent in Low-precision Number Formats (01/04/2023)
- A Stochastic Interpretation of Stochastic Mirror Descent: Risk-Sensitive Optimality (04/03/2019)
- Efficiency Ordering of Stochastic Gradient Descent (09/15/2022)
- Stochastic Mirror Descent in Average Ensemble Models (10/27/2022)
