Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization

09/06/2019
by Jiaqi Zhang, et al.

This paper studies a decentralized stochastic gradient tracking (DSGT) algorithm for non-convex empirical risk minimization over a peer-to-peer network, in sharp contrast to existing DSGT works, which cover only convex problems. To handle the variance among decentralized datasets, the mini-batch size at each node of the network is designed to be proportional to the size of its local dataset. We explicitly characterize the convergence rate of DSGT in terms of the algebraic connectivity of the network, the mini-batch size, and the learning rate. Importantly, our theoretical rate has an optimal dependence on the algebraic connectivity and exactly recovers the rate of the centralized stochastic gradient method. Moreover, we show that DSGT can achieve a linear speedup, while a sublinear speedup is also possible, depending on the problem at hand. Finally, numerical experiments on neural network and logistic regression problems with CIFAR-10 illustrate the advantages of DSGT for decentralized training.
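The gradient-tracking scheme the abstract refers to follows the standard DSGT recursion: each node mixes its iterate with its neighbors through a doubly stochastic matrix W, steps along an auxiliary variable y that tracks the network-average gradient, and refreshes y with the difference of successive local stochastic gradients. Below is a minimal NumPy sketch of that recursion, not the paper's reference implementation; `stoch_grad`, the step size, and all other names are illustrative assumptions, and the paper's node-dependent mini-batch sizing (proportional to each local dataset) would live inside `stoch_grad`.

```python
import numpy as np

def dsgt(W, stoch_grad, dim, lr=0.05, n_iters=1000, seed=0):
    """Sketch of decentralized stochastic gradient tracking (DSGT).

    W          : (n, n) doubly stochastic mixing matrix of the network
    stoch_grad : stoch_grad(i, x) -> mini-batch gradient of node i's local loss at x
                 (hypothetical callback; batch sizes proportional to local data live here)
    """
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    x = np.tile(rng.standard_normal(dim), (n, 1))    # each row: one node's iterate
    g = np.stack([stoch_grad(i, x[i]) for i in range(n)])
    y = g.copy()                                     # tracker initialized to local gradients
    for _ in range(n_iters):
        x = W @ (x - lr * y)                         # local step, then neighbor averaging
        g_new = np.stack([stoch_grad(i, x[i]) for i in range(n)])
        y = W @ y + g_new - g                        # y keeps tracking the average gradient
        g = g_new
    return x.mean(axis=0)                            # average the node iterates as the model
```

For a connected network and a small enough learning rate, the rows of x reach consensus, which is why the sketch simply returns their average as the final model.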

