Distributed Stochastic Non-Convex Optimization: Momentum-Based Variance Reduction

05/01/2020
by Prashant Khanduri, et al.

In this work, we propose a distributed algorithm for stochastic non-convex optimization. We consider a worker-server architecture in which a set of K worker nodes (WNs), in collaboration with a server node (SN), jointly aims to minimize a global, potentially non-convex objective function. The objective is the sum of local objective functions, one per WN, with each node having access only to stochastic samples of its local objective. In contrast to existing approaches, we employ a momentum-based "single loop" distributed algorithm that eliminates the need to compute large-batch gradients to achieve variance reduction. We propose two algorithms, one with an "adaptive" and the other with a "non-adaptive" learning rate. We show that the proposed algorithms achieve the optimal computational complexity while attaining linear speedup with the number of WNs. Specifically, the algorithms reach an ε-stationary point x_a with E‖∇f(x_a)‖ ≤ Õ(K^{-1/3}T^{-1/2} + K^{-1/3}T^{-1/3}) in T iterations, thereby requiring Õ(K^{-1}ε^{-3}) gradient computations at each WN. Moreover, our approach does not assume identical data distributions across the WNs, making it general enough for federated learning applications.
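The abstract describes a single-loop, momentum-based variance-reduction scheme run at each worker and aggregated at the server. The following is a minimal Python sketch of that idea (a STORM-style recursive estimator per worker, averaged by the server each iteration) on toy heterogeneous least-squares objectives. The objectives, the step-size and momentum schedules, and the every-iteration averaging are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, T = 8, 10, 2000            # workers, problem dimension, iterations

# Toy heterogeneous local objectives: f_k(x) = (1/2n) * ||A_k x - b_k||^2
A = [rng.normal(size=(20, d)) for _ in range(K)]
b = [rng.normal(size=20) for _ in range(K)]

def sample_grad(k, x, i):
    """Stochastic gradient of worker k's local objective from a single sample i."""
    a_i = A[k][i]
    return a_i * (a_i @ x - b[k][i])

x = np.zeros(d)
x_prev = x.copy()
d_est = np.zeros((K, d))         # per-worker variance-reduced gradient estimators

for t in range(T):
    eta = 0.5 / (t + 10) ** (1 / 3)     # "non-adaptive" step size, decaying like t^(-1/3)
    a_t = min(1.0, 10.0 * eta ** 2)     # momentum weight tied to the step size (assumed schedule)

    for k in range(K):
        i = rng.integers(A[k].shape[0])      # one fresh sample per iteration, no large batches
        g_new = sample_grad(k, x, i)
        g_old = sample_grad(k, x_prev, i)    # same sample, evaluated at the previous iterate
        if t == 0:
            d_est[k] = g_new
        else:
            # momentum-based variance reduction (STORM-style recursion)
            d_est[k] = g_new + (1.0 - a_t) * (d_est[k] - g_old)

    v = d_est.mean(axis=0)                   # server averages the worker estimators
    x_prev, x = x, x - eta * v               # server updates and broadcasts the iterate

# Gradient norm of the global objective at the final iterate
grad = np.mean([A[k].T @ (A[k] @ x - b[k]) / A[k].shape[0] for k in range(K)], axis=0)
print("||grad f(x)||:", np.linalg.norm(grad))
```

The key point of the recursion is that each worker touches only one stochastic sample per iteration, yet the estimator's variance shrinks as the iterates stabilize; an "adaptive" variant would instead set the step size from accumulated gradient statistics.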


Related research

12/12/2019
Parallel Restarted SPIDER – Communication Efficient Distributed Nonconvex Optimization with Optimal Computation Complexity
In this paper, we propose a distributed algorithm for stochastic smooth,...

12/10/2019
Byzantine Resilient Non-Convex SVRG with Distributed Batch Gradient Computations
In this work, we consider the distributed stochastic optimization proble...

05/24/2019
Momentum-Based Variance Reduction in Non-Convex SGD
Variance reduction has emerged in recent years as a strong competitor to...

06/22/2020
Non-convex Optimization via Adaptive Stochastic Search for End-to-End Learning and Control
In this work we propose the use of adaptive stochastic search as a build...

11/23/2019
A Stochastic Tensor Method for Non-convex Optimization
We present a stochastic optimization method that uses a fourth-order reg...

01/23/2020
Replica Exchange for Non-Convex Optimization
Gradient descent (GD) is known to converge quickly for convex objective ...

02/20/2023
A One-Sample Decentralized Proximal Algorithm for Non-Convex Stochastic Composite Optimization
We focus on decentralized stochastic non-convex optimization, where n ag...
