Convergence of Distributed Stochastic Variance Reduced Methods without Sampling Extra Data

05/29/2019
by Shicong Cen, et al.

Stochastic variance reduced methods have recently gained a lot of interest for empirical risk minimization due to their appealing run time complexity. When the data size is large and the data are disjointly stored on different machines, it becomes imperative to distribute the implementation of such variance reduced methods. In this paper, we consider a general framework that directly distributes popular stochastic variance reduced methods by assigning outer loops to the parameter server and inner loops to worker machines. This framework is natural, as it does not require sampling extra data and is easy to implement, but its theoretical convergence is not well understood. We obtain a unified understanding of the convergence of algorithms under this framework by measuring the smoothness of the discrepancy between the local and global loss functions. We establish the linear convergence of distributed versions of a family of stochastic variance reduced algorithms, including those using accelerated and recursive gradient updates, for minimizing strongly convex losses. Our theory captures how the convergence of distributed algorithms behaves as the number of machines and the size of local data vary. Furthermore, we show that when the smoothness discrepancy between local and global loss functions is large, regularization can be used to ensure convergence. Our analysis can be further extended to handle nonsmooth and nonconvex loss functions.
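To make the framework described above concrete, the following is a minimal, simulated sketch (not the paper's actual implementation) of a distributed SVRG-style method on a synthetic ridge-regression problem: the parameter server runs the outer loop, aggregating the workers' full local gradients into a global reference gradient, while a worker runs inner-loop variance-reduced updates using only its own local data, so no extra data is sampled. All names, step sizes, and problem parameters below are illustrative assumptions.

```python
# A hedged sketch of the distributed SVRG-style framework from the abstract:
# outer loop at the parameter server, inner loop at a worker on local data only.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ridge-regression data, split disjointly across machines (assumed setup).
n_machines, n_local, dim, lam = 4, 200, 10, 1e-2
w_star = rng.normal(size=dim)
data = []
for _ in range(n_machines):
    A = rng.normal(size=(n_local, dim))
    b = A @ w_star + 0.1 * rng.normal(size=n_local)
    data.append((A, b))

def local_grad(w, A, b):
    """Full gradient of one machine's local empirical risk."""
    return A.T @ (A @ w - b) / len(b) + lam * w

def sample_grad(w, A, b, i):
    """Stochastic gradient at a single local sample i."""
    return (A[i] @ w - b[i]) * A[i] + lam * w

w = np.zeros(dim)
eta, n_outer, n_inner = 0.05, 30, n_local

for t in range(n_outer):
    # Outer loop (parameter server): average full local gradients from all
    # workers to form the global gradient at the anchor point w_ref.
    w_ref = w.copy()
    g_ref = np.mean([local_grad(w_ref, A, b) for A, b in data], axis=0)

    # Inner loop (a worker, using only its local data): SVRG-style update
    #   w <- w - eta * (g_i(w) - g_i(w_ref) + g_ref)
    A, b = data[t % n_machines]
    for _ in range(n_inner):
        i = rng.integers(len(b))
        v = sample_grad(w, A, b, i) - sample_grad(w_ref, A, b, i) + g_ref
        w -= eta * v

print("distance to target:", np.linalg.norm(w - w_star))
```

In this sketch, the variance-reduced correction uses the worker's local stochastic gradients together with the globally averaged reference gradient, which is exactly where the smoothness discrepancy between local and global losses, emphasized in the paper, enters the convergence behavior.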


