
On Pre-Training for Federated Learning

by Hong-You Chen, et al.
The Ohio State University

In most of the literature on federated learning (FL), neural networks are initialized with random weights. In this paper, we present an empirical study of the effect of pre-training on FL. Specifically, we investigate whether pre-training can alleviate the drastic accuracy drop that occurs when clients' decentralized data are non-IID. We focus on FedAvg, the fundamental and most widely used FL algorithm. We find that pre-training largely closes the gap between FedAvg and centralized learning under non-IID data, but that this improvement does not come from alleviating the well-known model drifting problem in FedAvg's local training. Instead, pre-training helps FedAvg by making its global aggregation more stable. When pre-training on real data is not feasible for FL, we propose a novel approach to pre-train with synthetic data. On various image datasets (including one for segmentation), our approach with synthetic pre-training leads to a notable gain, a critical step toward scaling up federated learning for real-world applications.
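
For readers unfamiliar with the global aggregation step mentioned above, the following is a minimal sketch of the standard FedAvg server update: a data-size-weighted average of the clients' locally trained parameters. It is not taken from the paper's code. The function name `fedavg_aggregate`, the flat-list representation of model parameters, and the toy numbers are assumptions made for illustration only.

```python
from typing import List, Sequence


def fedavg_aggregate(client_weights: Sequence[Sequence[float]],
                     client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighting each by its local data size.

    Hypothetical helper: parameters are flattened to plain lists of floats.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one data size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of client samples must be positive")

    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        share = size / total  # this client's fraction of all training samples
        for i, w in enumerate(weights):
            aggregated[i] += share * w
    return aggregated


# Toy example (made-up numbers): two clients holding 1 and 3 samples.
new_global = fedavg_aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this view, the choice between random and pre-trained initialization only changes the global parameters that clients start from in the first round; the aggregation rule itself is unchanged.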




CyclicFL: A Cyclic Model Pre-Training Approach to Efficient Federated Learning

Since random initial models in Federated Learning (FL) can easily result...

FDAPT: Federated Domain-adaptive Pre-training for Language Models

Combining Domain-adaptive Pre-training (DAPT) with Federated Learning (F...

Can Fair Federated Learning reduce the need for Personalisation?

Federated Learning (FL) enables training ML models on edge clients witho...

FedAUX: Leveraging Unlabeled Auxiliary Data in Federated Learning

Federated Distillation (FD) is a popular novel algorithmic paradigm for ...

FedFwd: Federated Learning without Backpropagation

In federated learning (FL), clients with limited resources can disrupt t...

Federated Learning of Medical Concepts Embedding using BEHRT

Electronic Health Records (EHR) data contains medical records such as di...

A Pre-training Oracle for Predicting Distances in Social Networks

In this paper, we propose a novel method to make distance predictions in...