Communication-Efficient Agnostic Federated Averaging

04/06/2021
by   Jae Ro, et al.

In distributed learning settings such as federated learning, the training algorithm can be biased towards different clients. Mohri et al. (2019) proposed a domain-agnostic learning algorithm that overcomes this bias by optimizing the model for any target distribution formed by a mixture of the client distributions. They also proposed an algorithm for the cross-silo federated learning setting, where the number of clients is small. We consider this problem in the cross-device setting, where the number of clients is much larger. We propose a communication-efficient distributed algorithm, Agnostic Federated Averaging (AgnosticFedAvg), that minimizes the domain-agnostic objective of Mohri et al. (2019) and is compatible with privacy mechanisms such as secure aggregation. We highlight two types of naturally occurring domains in federated learning and argue that AgnosticFedAvg performs well on both. To demonstrate its practical effectiveness, we report positive results on large-scale language modeling tasks in both simulation and live experiments, the latter training language models for a Spanish virtual keyboard across millions of user devices.
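The domain-agnostic objective of Mohri et al. (2019) is a minimax problem: minimize over model weights w the worst-case mixture loss max over λ in the simplex of Σ_k λ_k L_k(w), where L_k is the loss on domain k. A minimal NumPy sketch of one optimization round is below; the specific update rule (gradient descent on w, exponentiated-gradient ascent on λ) and the step sizes are illustrative assumptions, not the exact AgnosticFedAvg protocol from the paper.

```python
import numpy as np

def agnostic_round(w, lam, domain_grads, domain_losses, lr_w=0.1, lr_lam=0.1):
    """One hypothetical round of agnostic (minimax) optimization.

    Approximately solves min_w max_{lam in simplex} sum_k lam_k * L_k(w),
    the domain-agnostic objective of Mohri et al. (2019). Step sizes and
    the exact update rules here are illustrative assumptions.
    """
    # Descent step on the model: gradient of the lambda-weighted loss.
    g = sum(l * gk for l, gk in zip(lam, domain_grads))
    w = w - lr_w * g
    # Exponentiated-gradient ascent on the mixture weights lambda,
    # which keeps lambda on the probability simplex: domains with
    # higher loss receive more weight in the next round.
    lam = lam * np.exp(lr_lam * np.asarray(domain_losses))
    lam = lam / lam.sum()
    return w, lam
```

In a federated deployment, the per-domain gradients and losses would be aggregated from client updates (e.g. via secure aggregation) rather than computed centrally.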


Related research

02/01/2019 · Agnostic Federated Learning
A key learning scenario in large-scale applications is that of federated...

04/21/2021 · Gradient Masked Federated Optimization
Federated Averaging (FedAVG) has become the most popular federated learn...

08/10/2021 · FedPAGE: A Fast Local Stochastic Gradient Method for Communication-Efficient Federated Learning
Federated Averaging (FedAvg, also known as Local-SGD) (McMahan et al., 2...

02/05/2021 · Federated Reconstruction: Partially Local Federated Learning
Personalization methods in federated learning aim to balance the benefit...

09/18/2019 · Detailed comparison of communication efficiency of split learning and federated learning
We compare communication efficiencies of two compelling distributed mach...

02/18/2020 · Distributed Optimization over Block-Cyclic Data
We consider practical data characteristics underlying federated learning...

07/18/2023 · Towards Federated Foundation Models: Scalable Dataset Pipelines for Group-Structured Learning
We introduce a library, Dataset Grouper, to create large-scale group-str...
