FedDistill: Making Bayesian Model Ensemble Applicable to Federated Learning
Federated learning aims to leverage users' own data and computational resources in learning a strong global model, without directly accessing their data but only local models. It usually requires multiple rounds of communication, in which aggregating local models into a global model plays an important role. In this paper, we propose a novel aggregation scenario and algorithm named FedDistill, which enjoys the robustness of Bayesian model ensemble in aggregating users' predictions and employs knowledge distillation to summarize the ensemble predictions into a global model, with the help of unlabeled data collected at the server. Our empirical studies validate FedDistill's superior performance, especially when users' data are not i.i.d. and the neural networks go deeper. Moreover, FedDistill is compatible with recent efforts in regularizing users' model training, making it an easily applicable module: you only need to replace the aggregation method but leave other parts of your federated learning algorithms intact.
READ FULL TEXT