A convergence analysis of the perturbed compositional gradient flow: averaging principle and normal deviations

09/02/2017
by   Wenqing Hu, et al.

We consider in this work a system of two stochastic differential equations called the perturbed compositional gradient flow. By introducing a separation of fast and slow time scales between the two equations, we show that the limit of the slow motion is given by an averaged ordinary differential equation. We then demonstrate that the deviation of the slow motion from the averaged equation, after proper rescaling, converges to a stochastic process with Gaussian inputs. This indicates that the slow motion can be approximated in the weak sense by a standard perturbed gradient flow, i.e., the continuous-time stochastic gradient descent algorithm that solves the optimization problem for a composition of two functions. As an application, the perturbed compositional gradient flow corresponds to the diffusion limit of the Stochastic Compositional Gradient Descent (SCGD) algorithm for minimizing a composition of two expected-value functions in the optimization literature. In the strongly convex case, this analysis implies that the SCGD algorithm has the same asymptotic convergence time as the classical stochastic gradient descent algorithm, which validates the effectiveness of using SCGD in that setting.
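For context, the SCGD iteration whose diffusion limit is analyzed here is the two-time-scale scheme of the "Stochastic Compositional Gradient Descent" paper listed under related research: an auxiliary variable tracks the inner expectation on a fast time scale while the decision variable takes stochastic chain-rule steps on a slow time scale, mirroring the fast/slow separation in the averaging analysis. The following Python sketch illustrates that structure on a toy strongly convex composition; the problem data, step-size exponents, and all variable names are illustrative assumptions of this sketch, not taken from the paper.

```python
import numpy as np

# Hedged sketch of a basic SCGD iteration on a toy strongly convex
# composition F(x) = f(E[g_w(x)]) with
#   g_w(x) = A x + w,          w ~ N(0, I)  (so E[g_w(x)] = A x)
#   f_v(y) = 0.5*||y - b||^2 + v^T y,  v ~ N(0, I) zero-mean noise
# The data (A, b) and step sizes are illustrative choices only.

rng = np.random.default_rng(0)
d = 5
A = np.eye(d) + 0.1 * rng.standard_normal((d, d))
b = rng.standard_normal(d)

x = np.zeros(d)  # slow variable: decision vector
y = np.zeros(d)  # fast variable: running estimate of E[g_w(x)]

n_iters = 20000
for k in range(1, n_iters + 1):
    alpha = 1.0 / k       # slow step size (x update)
    beta = 1.0 / k**0.75  # fast step size (y update); beta decays slower
    w = rng.standard_normal(d)
    v = rng.standard_normal(d)

    # Fast time scale: track the inner expectation E[g_w(x)] = A x.
    g_sample = A @ x + w
    y = (1.0 - beta) * y + beta * g_sample

    # Slow time scale: stochastic chain-rule step,
    # grad = (sample of grad g)^T (sample of grad f at the tracked y).
    grad_g = A            # grad of g_w(x) = A for this linear g
    grad_f = (y - b) + v  # grad of f_v(y) = y - b + v
    x = x - alpha * (grad_g.T @ grad_f)

# The averaged flow drives x toward the minimizer of
# F(x) = 0.5*||A x - b||^2, i.e. the least-squares solution.
x_star = np.linalg.solve(A.T @ A, A.T @ b)
print("||x - x*|| =", np.linalg.norm(x - x_star))
```

The key design point the sketch makes visible is the time-scale separation: the y-update uses a step size that decays more slowly than the x-update's, so the auxiliary estimate equilibrates quickly relative to the decision variable, which is exactly the fast/slow structure that the averaging principle in the paper formalizes.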


Related research

10/11/2017 · Stochastic Gradient Descent in Continuous Time: A Central Limit Theorem
Stochastic gradient descent in continuous time (SGDCT) provides a comput...

11/14/2014 · Stochastic Compositional Gradient Descent: Algorithms for Minimizing Compositions of Expected-Value Functions
Classical stochastic gradient methods are well suited for minimizing exp...

04/15/2020 · Analysis of Stochastic Gradient Descent in Continuous Time
Stochastic gradient descent is an optimisation method that combines clas...

05/09/2020 · Perturbed gradient descent with occupation time
This paper develops further the idea of perturbed gradient descent, by a...

02/07/2018 · Improved Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization
We consider the nonsmooth convex composition optimization problem where ...

02/07/2018 · Improved Incremental First-Order Oracle Complexity of Variance Reduced Methods for Nonsmooth Convex Stochastic Composition Optimization
We consider the nonsmooth convex composition optimization problem where ...

02/02/2019 · Uniform-in-Time Weak Error Analysis for Stochastic Gradient Descent Algorithms via Diffusion Approximation
Diffusion approximation provides weak approximation for stochastic gradi...
