Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization

09/06/2019
by Jiaqi Zhang, et al.

This paper studies a decentralized stochastic gradient tracking (DSGT) algorithm for non-convex empirical risk minimization over a peer-to-peer network, in sharp contrast to existing DSGT works, which cover only convex problems. To handle the variance among decentralized datasets, the mini-batch size at each node of the network is designed to be proportional to the size of its local dataset. We explicitly characterize the convergence rate of DSGT in terms of the algebraic connectivity of the network, the mini-batch size, and the learning rate. Importantly, our theoretical rate has an optimal dependence on the algebraic connectivity and exactly recovers the rate of the centralized stochastic gradient method. Moreover, we show that DSGT can achieve a linear speedup, while a sublinear speedup is also possible, depending on the problem at hand. Finally, numerical experiments on neural network and logistic regression problems with CIFAR-10 illustrate the advantages of DSGT for decentralized training.
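The gradient-tracking scheme the abstract refers to follows the standard DSGT recursion: each node mixes its iterate with its neighbors through a doubly stochastic matrix W, steps along an auxiliary variable y that tracks the network-average gradient, and refreshes y with the difference of successive local stochastic gradients. Below is a minimal NumPy sketch of that recursion, not the paper's reference implementation; `stoch_grad`, the step size, and all other names are illustrative assumptions, and the paper's node-dependent mini-batch sizing (proportional to each local dataset) would live inside `stoch_grad`.

```python
import numpy as np

def dsgt(W, stoch_grad, dim, lr=0.05, n_iters=1000, seed=0):
    """Sketch of decentralized stochastic gradient tracking (DSGT).

    W          : (n, n) doubly stochastic mixing matrix of the network
    stoch_grad : stoch_grad(i, x) -> mini-batch gradient of node i's local loss at x
                 (hypothetical callback; batch sizes proportional to local data live here)
    """
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    x = np.tile(rng.standard_normal(dim), (n, 1))    # each row: one node's iterate
    g = np.stack([stoch_grad(i, x[i]) for i in range(n)])
    y = g.copy()                                     # tracker initialized to local gradients
    for _ in range(n_iters):
        x = W @ (x - lr * y)                         # local step, then neighbor averaging
        g_new = np.stack([stoch_grad(i, x[i]) for i in range(n)])
        y = W @ y + g_new - g                        # y keeps tracking the average gradient
        g = g_new
    return x.mean(axis=0)                            # average the node iterates as the model
```

For a connected network and a small enough learning rate, the rows of x reach consensus, which is why the sketch simply returns their average as the final model.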

