Intermittent Pulling with Local Compensation for Communication-Efficient Federated Learning

01/22/2020
by Haozhao Wang, et al.

Federated Learning is a powerful machine learning paradigm for cooperatively training a global model over highly distributed data. A major bottleneck for distributed Stochastic Gradient Descent (SGD) in large-scale Federated Learning is the communication overhead of pushing local gradients and pulling the global model. In this paper, to reduce the communication complexity of Federated Learning, we propose a novel approach named Pulling Reduction with Local Compensation (PRLC). Specifically, each training node pulls the global model from the server only intermittently during SGD iterations, so it is sometimes unsynchronized with the server; in that case, it uses its local update to compensate for the gap between its local model and the global model. Our rigorous theoretical analysis of PRLC yields two important findings. First, we prove that the convergence rate of PRLC preserves the same order as classical synchronous SGD in both the strongly convex and non-convex cases, and that it scales well thanks to a linear speedup in the number of training nodes. Second, we show that PRLC admits a lower pulling frequency than the existing pulling-reduction method without local compensation. We also conduct extensive experiments on various machine learning models to validate our theoretical results. The experiments show that our approach achieves a significant pulling reduction over state-of-the-art methods, e.g., PRLC requires only half as many pulling operations as LAG.
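The abstract describes the mechanism only at a high level; the exact compensation and aggregation rules are given in the full paper. The following is a minimal Python sketch of the idea under stated assumptions: a single parameter server, workers that pull the global model with some probability `pull_prob` per iteration (a hypothetical Bernoulli schedule), a toy least-squares loss, and hypothetical helper names (`prlc_step`, `local_gradient`). It is meant to illustrate "skip some pulls, keep training from the locally updated copy", not to reproduce the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_gradient(model, batch):
    """Placeholder stochastic gradient on one worker's mini-batch
    (gradient of a least-squares loss on synthetic data)."""
    X, y = batch
    return X.T @ (X @ model - y) / len(y)

def prlc_step(global_model, local_model, batch, lr, pull_prob, rng):
    """One PRLC-style iteration for a single worker (sketch).

    With probability `pull_prob` the worker pulls the current global
    model; otherwise it continues from its local copy, so its own
    local update stands in for (compensates) the skipped pull.
    """
    base = global_model if rng.random() < pull_prob else local_model
    grad = local_gradient(base, batch)   # computed on the worker's data
    new_local = base - lr * grad         # local SGD update
    return new_local, grad               # the gradient is still pushed

# Toy usage: one server, several workers, synthetic linear-regression data.
d, n_workers, T = 5, 4, 200
w_global = np.zeros(d)
w_local = [np.zeros(d) for _ in range(n_workers)]
data = [(rng.normal(size=(32, d)), rng.normal(size=32))
        for _ in range(n_workers)]

for t in range(T):
    grads = []
    for k in range(n_workers):
        w_local[k], g = prlc_step(w_global, w_local[k], data[k],
                                  lr=0.05, pull_prob=0.5, rng=rng)
        grads.append(g)
    # Server aggregates the pushed gradients every iteration;
    # only the pulls are intermittent.
    w_global -= 0.05 * np.mean(grads, axis=0)
```

Setting `pull_prob=1.0` recovers fully synchronous SGD, while smaller values trade pull traffic for the staleness that local compensation is meant to absorb.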
