Computing the Variance of Shuffling Stochastic Gradient Algorithms via Power Spectral Density Analysis

06/01/2022
by Carles Domingo-Enrich, et al.

When solving finite-sum minimization problems, two common alternatives to stochastic gradient descent (SGD) with theoretical benefits are random reshuffling (SGD-RR) and shuffle-once (SGD-SO), in which the component functions are sampled in cycles without replacement. Under a convenient stochastic noise approximation that holds experimentally, we study the stationary variances of the iterates of SGD, SGD-RR and SGD-SO, whose leading terms decrease in this order, and we obtain simple approximations for them. To derive these results, we analyze the power spectral density of the stochastic gradient noise sequences. Our analysis extends beyond SGD to SGD with momentum and to a stochastic version of Nesterov's accelerated gradient method. We perform experiments on quadratic objective functions to test the validity of our approximation and the correctness of our findings.
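To make the three sampling schemes concrete, here is a minimal NumPy sketch (not the paper's code; the step size, curvatures, run length, and burn-in fraction are arbitrary illustration choices). It runs SGD, SGD-RR, and SGD-SO on a toy one-dimensional quadratic finite sum and estimates the stationary variance of the iterates twice: directly, and as the average of the empirical power spectral density, which coincide by Parseval's identity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite sum: f(x) = (1/n) * sum_i 0.5 * a_i * (x - b_i)^2, scalar x.
# All constants below are arbitrary illustration choices.
n, lr, epochs = 32, 0.05, 400
a = rng.uniform(0.5, 1.5, size=n)  # per-component curvatures
b = rng.normal(size=n)             # per-component minimizers

def grad(x, i):
    """Gradient of the i-th component function at x."""
    return a[i] * (x - b[i])

def run(scheme):
    """Run one long trajectory; return the post-burn-in iterates."""
    x = 0.0
    iterates = []
    fixed_perm = rng.permutation(n)          # used only by shuffle-once
    for epoch in range(epochs):
        if scheme == "sgd":                  # with-replacement sampling
            order = rng.integers(0, n, size=n)
        elif scheme == "rr":                 # random reshuffling: fresh permutation each epoch
            order = rng.permutation(n)
        else:                                # "so": shuffle-once, same permutation every epoch
            order = fixed_perm
        for i in order:
            x -= lr * grad(x, i)
            if epoch >= epochs // 2:         # discard the first half as burn-in
                iterates.append(x)
    return np.array(iterates)

for scheme in ("sgd", "rr", "so"):
    z = run(scheme)
    z = z - z.mean()
    # Empirical PSD (periodogram); by Parseval, its mean equals the variance.
    psd = np.abs(np.fft.fft(z)) ** 2 / len(z)
    print(f"{scheme}: variance = {z.var():.3e}, mean of PSD = {psd.mean():.3e}")
```

In this toy setting the estimated variances typically follow the ordering stated in the abstract (SGD largest, SGD-SO smallest), though the exact numbers depend on the step size and the random seed; note also that a single shuffle-once run only captures within-cycle fluctuation, not the randomness over permutations.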


Related research

11/05/2018
Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations
We develop the mathematical foundations of the stochastic modified equat...

11/20/2019
Bayesian interpretation of SGD as Ito process
The current interpretation of stochastic gradient descent (SGD) as a sto...

05/20/2022
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
Approximating Stochastic Gradient Descent (SGD) as a Stochastic Differen...

03/01/2023
D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory
Kohn-Sham Density Functional Theory (KS-DFT) has been traditionally solv...

10/25/2019
Bias-Variance Tradeoff in a Sliding Window Implementation of the Stochastic Gradient Algorithm
This paper provides a framework to analyze stochastic gradient algorithm...

09/29/2022
NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizers
Classical machine learning models such as deep neural networks are usual...

12/18/2017
On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent
Because stochastic gradient descent (SGD) has shown promise optimizing n...
