On the Convergence of FedAvg on Non-IID Data

07/04/2019
by Xiang Li, et al.

Federated learning enables a large number of edge computing devices to learn a centralized model while keeping all local data on the devices. As a leading algorithm in this setting, Federated Averaging (FedAvg) runs Stochastic Gradient Descent (SGD) in parallel on a small subset of the devices and averages the resulting local models only once in a while. Despite its simplicity, FedAvg lacks theoretical guarantees in the federated setting. In this paper, we analyze the convergence of FedAvg on non-iid data and investigate the effect of different sampling and averaging schemes, which are crucial especially when data are unbalanced. We prove a convergence rate of O(1/T) for FedAvg with proper sampling and averaging schemes on strongly convex and smooth problems, where T is the total number of SGD steps. Our results show that heterogeneity of data slows down convergence, which is intrinsic to the federated setting; that a low device participation rate can be tolerated without severely harming the optimization process; and that there is a trade-off between communication efficiency and convergence rate. We also analyze the necessity of learning rate decay, taking linear regression as an example. Our work serves as a guideline for algorithm design in applications of federated learning, where heterogeneity and imbalance of data are the common case.
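The FedAvg loop described above (a few local SGD steps on each sampled device, then a periodic average) is short enough to sketch directly. The snippet below is a minimal toy illustration, not the paper's experimental code: the quadratic per-device objectives, the uniform device weights, and all constants (NUM_DEVICES, local_steps, eta0) are assumptions made for this example, and the decaying step size eta0/(1 + t) mirrors the learning rate decay the paper argues is necessary.

```python
import numpy as np

# Toy non-iid setup: each device k has its own quadratic objective
# f_k(w) = 0.5 * w^T A_k w - b_k^T w, so the local optima disagree.
rng = np.random.default_rng(0)
NUM_DEVICES, DIM = 100, 10                      # assumed sizes, for illustration only
A = [np.diag(rng.uniform(0.5, 2.0, DIM)) for _ in range(NUM_DEVICES)]
b = [rng.normal(size=DIM) for _ in range(NUM_DEVICES)]
p = np.full(NUM_DEVICES, 1.0 / NUM_DEVICES)     # device weights (data fractions)

def local_grad(k, w):
    """Gradient of device k's local objective.

    A full gradient is used here for simplicity; the paper analyzes
    stochastic gradients computed on local minibatches.
    """
    return A[k] @ w - b[k]

def fedavg(rounds=200, local_steps=5, sample_size=10, eta0=0.1):
    w = np.zeros(DIM)
    for t in range(rounds):
        eta = eta0 / (1 + t)                    # decaying step size
        # Sample a small subset of devices with replacement, with
        # probabilities p_k, then average the returned iterates uniformly.
        chosen = rng.choice(NUM_DEVICES, size=sample_size, replace=True, p=p)
        local_models = []
        for k in chosen:
            w_k = w.copy()
            for _ in range(local_steps):        # E local steps between communications
                w_k -= eta * local_grad(k, w_k)
            local_models.append(w_k)
        w = np.mean(local_models, axis=0)       # periodic averaging step of FedAvg
    return w

print(fedavg())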


research · 11/03/2022
A Convergence Theory for Federated Average: Beyond Smoothness
Federated learning enables a large amount of edge computing devices to l...

research · 02/12/2020
Towards Federated Learning: Robustness Analytics to Data Heterogeneity
Federated Learning allows remote centralized server training models with...

research · 12/09/2021
On Convergence of Federated Averaging Langevin Dynamics
We propose a federated averaging Langevin algorithm (FA-LD) for uncertai...

research · 09/14/2020
Effective Federated Adaptive Gradient Methods with Non-IID Decentralized Data
Federated learning allows loads of edge computing devices to collaborati...

research · 06/19/2020
DEED: A General Quantization Scheme for Communication Efficiency in Bits
In distributed optimization, a popular technique to reduce communication...

research · 12/31/2020
Federated Nonconvex Sparse Learning
Nonconvex sparse learning plays an essential role in many areas, such as...

research · 12/10/2020
Analysis and Optimal Edge Assignment For Hierarchical Federated Learning on Non-IID Data
Distributed learning algorithms aim to leverage distributed and diverse ...
