Distributed Learning in Non-Convex Environments – Part I: Agreement at a Linear Rate

07/03/2019
by Stefan Vlaski, et al.

Driven by the need to solve increasingly complex optimization problems in signal processing and machine learning, there has been growing interest in understanding the behavior of gradient-descent algorithms in non-convex environments. Most available works on distributed non-convex optimization focus on the deterministic setting, where exact gradients are available at each agent. In this work and its Part II, we consider stochastic cost functions, where exact gradients are replaced by stochastic approximations and the resulting gradient noise persistently seeps into the dynamics of the algorithm. We establish that the diffusion learning strategy continues to yield meaningful estimates in non-convex scenarios, in the sense that the iterates of the individual agents cluster in a small region around the network centroid. We use this insight to motivate a short-term model for the network evolution over a finite horizon. In Part II [2] of this work, we leverage this model to establish descent of the diffusion strategy through saddle points in O(1/μ) steps and the return of approximately second-order stationary points in a polynomial number of iterations.
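For readers unfamiliar with the diffusion strategy referenced in the abstract, the following is a minimal sketch of its adapt-then-combine (ATC) form: each agent takes a stochastic-gradient step on its own cost and then averages the intermediate iterates of its neighbors. The quartic cost, ring topology, combination weights, noise level, and step-size below are illustrative assumptions, not the authors' experimental setup.

```python
# Minimal ATC diffusion sketch (illustrative; not the paper's code or data).
import numpy as np

rng = np.random.default_rng(0)

K, d = 10, 5          # number of agents, parameter dimension (assumed)
mu = 0.01             # small constant step-size; Part II's horizon scales as O(1/mu)

# Doubly-stochastic combination matrix for an assumed ring network.
A = np.zeros((K, K))
for k in range(K):
    A[k, k] = 0.5
    A[k, (k - 1) % K] = 0.25
    A[k, (k + 1) % K] = 0.25

def stochastic_grad(w, k):
    """Noisy gradient of a simple non-convex (quartic) cost at agent k."""
    true_grad = 4 * w**3 - 2 * w + 0.1 * k          # gradient of sum(w**4 - w**2) + 0.1*k*sum(w)
    return true_grad + 0.1 * rng.standard_normal(d)  # persistent gradient noise

W = rng.standard_normal((K, d))   # one iterate per agent

for _ in range(1000):
    # Adapt: each agent descends along its own stochastic gradient.
    Psi = np.array([W[k] - mu * stochastic_grad(W[k], k) for k in range(K)])
    # Combine: each agent averages its neighbors' intermediate iterates.
    W = A @ Psi

centroid = W.mean(axis=0)
print("max deviation from network centroid:",
      np.max(np.linalg.norm(W - centroid, axis=1)))
```

Running the sketch illustrates the clustering claim of Part I: after a transient, the per-agent iterates deviate from the network centroid only by a small amount driven by the step-size and the gradient noise.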


