A continuous-time analysis of distributed stochastic gradient

12/28/2018
by Nicholas M. Boffi, et al.

Synchronization in distributed networks of nonlinear dynamical systems plays a critical role in improving robustness of the individual systems to independent stochastic perturbations. Through analogy with dynamical models of biological quorum sensing, where synchronization between systems is induced through interaction with a common signal, we analyze the effect of synchronization on distributed stochastic gradient algorithms. We demonstrate that synchronization can significantly reduce the magnitude of the noise felt by the individual distributed agents and by their spatial mean. This noise reduction property is connected with a reduction in smoothing of the loss function imposed by the stochastic gradient approximation. Using similar techniques, we provide a convergence analysis, and derive a bound on the expected deviation of the spatial mean of the agents from the global minimizer of a strictly convex function. By considering additional dynamics for the quorum variable, we derive an analogous bound, and obtain new convergence results for the elastic averaging SGD algorithm. We conclude with a local analysis around a minimum of a nonconvex loss function, and show that the distributed setting leads to lower expected loss values and wider minima.
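The noise-reduction claim can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code. It assumes a one-dimensional quadratic loss f(x) = x²/2, all-to-all coupling of each agent to the spatial mean (standing in for the quorum variable), independent additive Gaussian noise per agent, and an Euler-Maruyama discretization. The function name `simulate` and all parameter values are hypothetical choices made for illustration.

```python
import random


def simulate(num_agents, coupling, noise=1.0, dt=0.01, steps=10000, seed=0):
    """Euler-Maruyama simulation of noisy gradient flow on f(x) = x**2 / 2.

    Assumed dynamics (illustrative only):
        dx_i = -(x_i + coupling * (x_i - mean(x))) dt + noise * dW_i
    Returns the time-averaged squared distance of an agent from the
    minimizer x = 0, averaged over all agents after a burn-in period.
    """
    rng = random.Random(seed)
    x = [0.0] * num_agents
    sqrt_dt = dt ** 0.5
    burn_in = steps // 5
    total = 0.0
    for step in range(steps):
        mean = sum(x) / num_agents
        # Gradient of the quadratic loss is x_i; the coupling term pulls
        # each agent toward the current spatial mean.
        x = [
            xi - (xi + coupling * (xi - mean)) * dt
            + noise * sqrt_dt * rng.gauss(0.0, 1.0)
            for xi in x
        ]
        if step >= burn_in:
            total += sum(xi * xi for xi in x) / num_agents
    return total / (steps - burn_in)


# Same number of agents and same noise level, with and without coupling.
uncoupled = simulate(num_agents=20, coupling=0.0)
coupled = simulate(num_agents=20, coupling=20.0)
print(f"mean squared deviation, uncoupled: {uncoupled:.4f}")
print(f"mean squared deviation, coupled:   {coupled:.4f}")
```

In this toy setting, the coupled agents should stay markedly closer to the minimizer than the uncoupled ones, because strong coupling keeps each agent near the spatial mean, whose noise is averaged over all agents. The quadratic loss is a deliberately simple choice; it does not capture the nonconvex results mentioned in the abstract.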


research
06/26/2019

Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD

Large-batch stochastic gradient descent (SGD) is widely used for trainin...
research
10/21/2010

Synchronization and Redundancy: Implications for Robustness of Neural Learning and Decision Making

Learning and decision making in the brain are key processes critical to ...
research
01/18/2019

Quasi-potential as an implicit regularizer for the loss function in the stochastic gradient descent

We interpret the variational inference of the Stochastic Gradient Descen...
research
05/03/2019

Performance Optimization on Model Synchronization in Parallel Stochastic Gradient Descent Based SVM

Understanding the bottlenecks in implementing stochastic gradient descen...
research
02/12/2021

Stochastic Gradient Langevin Dynamics with Variance Reduction

Stochastic gradient Langevin dynamics (SGLD) has gained the attention of...
research
05/20/2021

Logarithmic landscape and power-law escape rate of SGD

Stochastic gradient descent (SGD) undergoes complicated multiplicative n...
research
04/07/2019

Global Synchronization of Clocks in Directed Rooted Acyclic Graphs: A Hybrid Systems Approach

In this paper, we study the problem of robust global synchronization of ...
