Mixing of Stochastic Accelerated Gradient Descent

10/31/2019
by Peiyuan Zhang, et al.

We study the mixing properties of stochastic accelerated gradient descent (SAGD) on least-squares regression. First, we show that stochastic gradient descent (SGD) and SAGD simulate the same invariant distribution. Motivated by this, we then establish the mixing rate of the SAGD iterates and compare it with that of the SGD iterates. Theoretically, we prove that the chain of SAGD iterates is geometrically ergodic, under a proper choice of parameters and regularity assumptions on the input distribution. More specifically, we derive an explicit mixing rate that depends on the first four moments of the data distribution. By means of illustrative examples, we prove that the SAGD-iterate chain mixes faster than the chain of iterates obtained by SGD. Furthermore, we highlight applications of the established mixing rate to the convergence analysis of SAGD on realizable objectives. The proposed analysis is based on a novel non-asymptotic analysis of products of random matrices. These theoretical results are validated by experiments.
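The abstract compares the chains of iterates produced by SGD and by an accelerated (momentum-based) stochastic method on least-squares regression. The following is a minimal Python sketch of the two update rules, with SAGD written as a Nesterov-style momentum step; the step size eta, momentum beta, Gaussian data model, and noise level are illustrative assumptions and not the parameter choices or the exact scheme analyzed in the paper.

import numpy as np

# Minimal sketch (not the paper's exact scheme): SGD and a Nesterov-style
# stochastic accelerated gradient method on least-squares regression.
# eta (step size) and beta (momentum) are illustrative choices.

rng = np.random.default_rng(0)
d, n_steps = 5, 10_000
w_star = rng.normal(size=d)          # ground-truth weights (realizable setting)
eta, beta = 0.01, 0.9

def sample():
    """Draw one (x, y) pair from a Gaussian input distribution."""
    x = rng.normal(size=d)
    y = x @ w_star + 0.1 * rng.normal()   # small additive noise
    return x, y

def grad(w, x, y):
    """Stochastic gradient of the squared loss 0.5 * (x @ w - y) ** 2."""
    return (x @ w - y) * x

# SGD chain
w_sgd = np.zeros(d)
# Accelerated chain: gradient evaluated at the extrapolated (momentum) point
w_acc, w_prev = np.zeros(d), np.zeros(d)

for _ in range(n_steps):
    x, y = sample()
    w_sgd = w_sgd - eta * grad(w_sgd, x, y)

    x2, y2 = sample()
    v = w_acc + beta * (w_acc - w_prev)   # extrapolation step
    w_prev = w_acc
    w_acc = v - eta * grad(v, x2, y2)

print("SGD distance to w*:", np.linalg.norm(w_sgd - w_star))
print("Accelerated distance to w*:", np.linalg.norm(w_acc - w_star))

Each chain of iterates is a Markov chain; the paper's question is how quickly such chains approach their (shared) invariant distribution, which this toy sketch only illustrates qualitatively.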

research
03/24/2020
Finite-Time Analysis of Stochastic Gradient Descent under Markov Randomness
Motivated by broad applications in reinforcement learning and machine le...

research
06/13/2022
Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients
Minimizing the inclusive Kullback-Leibler (KL) divergence with stochasti...

research
01/24/2019
Overcomplete Independent Component Analysis via SDP
We present a novel algorithm for overcomplete independent components ana...

research
09/15/2022
Efficiency Ordering of Stochastic Gradient Descent
We consider the stochastic gradient descent (SGD) algorithm driven by a ...

research
09/07/2018
An Anderson-Chebyshev Mixing Method for Nonlinear Optimization
Anderson mixing (or Anderson acceleration) is an efficient acceleration ...

research
06/07/2018
Scalable Natural Gradient Langevin Dynamics in Practice
Stochastic Gradient Langevin Dynamics (SGLD) is a sampling scheme for Ba...

research
07/27/2020
Stochastic Gradient Descent applied to Least Squares regularizes in Sobolev spaces
We study the behavior of stochastic gradient descent applied to ‖Ax - b‖_2...
