On the Convergence of Local Descent Methods in Federated Learning

10/31/2019
by Farzin Haddadpour, et al.

In federated distributed learning, the goal is to optimize a global training objective defined over distributed devices, where the data shard at each device is sampled from a possibly different distribution (a.k.a., heterogeneous or non-i.i.d. data samples). In this paper, we generalize local stochastic and full gradient descent with periodic averaging, originally designed for homogeneous distributed optimization, to solve nonconvex optimization problems in federated learning. Although local SGD has been shown to be effective in reducing the number of communication rounds in the homogeneous setting, its convergence and communication complexity in the heterogeneous setting have mostly been demonstrated empirically and lack a thorough theoretical understanding. To bridge this gap, we demonstrate that, by properly analyzing the effect of unbiased gradients and the sampling schema in the federated setting, under mild assumptions the implicit variance reduction of local distributed methods generalizes to heterogeneous data shards and attains the best known convergence rates of the homogeneous setting, both for general nonconvex objectives and under the Polyak-Łojasiewicz (PŁ) condition (a generalization of strong convexity). Our theoretical results complement recent empirical studies that demonstrate the applicability of local GD/SGD to federated learning. We also specialize the proposed local method for networked distributed optimization. To the best of our knowledge, the obtained convergence rates are the sharpest known to date for local descent methods with periodic averaging applied to nonconvex federated optimization, in both the centralized and the networked distributed settings.
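To make the algorithm under analysis concrete, the sketch below runs local SGD with periodic averaging on synthetic heterogeneous shards: each device takes several local stochastic gradient steps between communication rounds, after which the server averages the local models. The least-squares objective, the shard construction, the hyperparameters, and the function names (e.g., local_sgd_periodic_averaging) are illustrative assumptions for this sketch, not the paper's experimental setup.

# Minimal sketch of local SGD with periodic averaging on heterogeneous
# (non-i.i.d.) data shards. Illustrative setup only.
import numpy as np

rng = np.random.default_rng(0)

# Each device holds a shard drawn from a device-specific distribution.
num_devices, dim, shard_size = 4, 10, 200
shards = []
for p in range(num_devices):
    A = rng.normal(loc=p, scale=1.0, size=(shard_size, dim))  # device-specific mean -> heterogeneity
    x_true = rng.normal(size=dim)
    b = A @ x_true + 0.1 * rng.normal(size=shard_size)
    shards.append((A, b))

def local_stochastic_grad(w, A, b, batch=16):
    # Unbiased stochastic gradient of the local least-squares loss.
    idx = rng.choice(A.shape[0], size=batch, replace=False)
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ w - bi) / batch

def local_sgd_periodic_averaging(rounds=50, local_steps=10, lr=1e-3):
    w_global = np.zeros(dim)
    for _ in range(rounds):                      # communication rounds
        local_models = []
        for A, b in shards:
            w = w_global.copy()
            for _ in range(local_steps):         # tau local SGD steps between rounds
                w -= lr * local_stochastic_grad(w, A, b)
            local_models.append(w)
        w_global = np.mean(local_models, axis=0) # periodic (simple) averaging
    return w_global

w_hat = local_sgd_periodic_averaging()

For reference, the PŁ condition mentioned in the abstract requires that ||∇f(w)||^2 ≥ 2μ (f(w) − f*) for some μ > 0; it holds for some nonconvex objectives and yields faster rates without requiring strong convexity.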


