Optimal Client Sampling for Federated Learning

by Wenlin Chen, et al.

It is well understood that client-master communication can be a primary bottleneck in Federated Learning. In this work, we address this issue with a novel client subsampling scheme that restricts the number of clients allowed to communicate their updates back to the master node. In each communication round, all participating clients compute their updates, but only those with "important" updates communicate them back to the master. We show that importance can be measured using only the norm of the update, and we give a formula for optimal client participation. This formula minimizes the distance between the full update, where all clients participate, and our limited update, where the number of participating clients is restricted. In addition, we provide a simple algorithm that approximates the optimal formula for client participation; it requires only secure aggregation and thus does not compromise client privacy. We show both theoretically and empirically that our approach leads to superior performance for Distributed SGD (DSGD) and Federated Averaging (FedAvg) compared to the baseline where participating clients are sampled uniformly. Finally, our approach is orthogonal to and compatible with existing methods for reducing communication overhead, such as local methods and communication compression methods.
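To make the idea of norm-based participation concrete, the following is a minimal sketch, not the paper's exact formula or algorithm. It assumes each client independently reports with probability proportional to the norm of its update, capped at 1, so that the expected number of reporting clients equals a budget `m`; reported updates are rescaled by the inverse probability to keep the aggregate unbiased. The function names, the `weights` argument, and the centralized computation of probabilities (rather than via secure aggregation) are all illustrative assumptions.

```python
import random


def sampling_probabilities(norms, m):
    """Return p_i = min(1, c * norms[i]) with sum(p) == m (hypothetical helper).

    norms: non-negative update norms, one per client.
    m: expected number of clients allowed to communicate (integer, 0 < m <= n).
    """
    n = len(norms)
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= number of clients")
    probs = [0.0] * n
    # Visit clients from largest to smallest norm; the largest may need capping at 1.
    order = sorted(range(n), key=lambda i: norms[i], reverse=True)
    capped = 0
    while True:
        rest = order[capped:]
        total = sum(norms[i] for i in rest)
        budget = m - capped
        if total == 0:
            # All remaining updates are zero: spread the leftover budget uniformly.
            for i in rest:
                probs[i] = budget / len(rest)
            return probs
        if budget * norms[rest[0]] > total:
            # Proportional probability would exceed 1, so this client always reports.
            probs[rest[0]] = 1.0
            capped += 1
            continue
        for i in rest:
            probs[i] = budget * norms[i] / total
        return probs


def sampled_aggregate(updates, weights, probs, rng):
    """Unbiased estimate of sum_i weights[i] * updates[i] under independent sampling."""
    dim = len(updates[0])
    agg = [0.0] * dim
    for u, w, p in zip(updates, weights, probs):
        # A client communicates with probability p; its update is rescaled by 1/p.
        if p > 0 and rng.random() < p:
            for k in range(dim):
                agg[k] += w * u[k] / p
    return agg


if __name__ == "__main__":
    rng = random.Random(0)
    updates = [[10.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, -1.0], [-1.0, 0.0]]
    weights = [0.2] * 5
    norms = [sum(x * x for x in u) ** 0.5 for u in updates]
    probs = sampling_probabilities(norms, m=2)
    print(probs)
    print(sampled_aggregate(updates, weights, probs, rng))
```

In this sketch, a client whose update dominates the others is always selected, while the remaining budget is shared among the rest in proportion to their norms; uniform sampling corresponds to the special case where all norms are equal.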




Communication-Efficient Federated Learning via Optimal Client Sampling

Federated learning is a private and efficient framework for learning mod...

Faster Rates for Compressed Federated Learning with Client-Variance Reduction

Due to the communication bottleneck in distributed and federated learnin...

Timely Communication in Federated Learning

We consider a federated learning framework in which a parameter server (...

Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Federated learning usually employs a client-server architecture where an...

Data Leakage in Federated Averaging

Recent attacks have shown that user data can be recovered from FedSGD up...

Accelerating Federated Learning via Sampling Anchor Clients with Large Batches

Using large batches in recent federated learning studies has improved co...

Communication-Efficient Federated Learning via Robust Distributed Mean Estimation

Federated learning commonly relies on algorithms such as distributed (mi...
