FedKD: Communication Efficient Federated Learning via Knowledge Distillation

by Chuhan Wu, et al.

Federated learning is widely used to learn intelligent models from decentralized data. In federated learning, clients need to communicate their local model updates in each iteration of model learning. However, model updates are large if the model contains numerous parameters, and many rounds of communication are usually needed before the model converges. Thus, the communication cost of federated learning can be quite heavy. In this paper, we propose a communication-efficient federated learning method based on knowledge distillation. Instead of directly communicating the large models between the clients and the server, we propose an adaptive mutual distillation framework to reciprocally learn a student and a teacher model on each client, where only the student model is shared by different clients and updated collaboratively to reduce the communication cost. Both the teacher and the student on each client are learned from its local data and from the knowledge distilled from each other, where their distillation intensities are controlled by their prediction quality. To further reduce the communication cost, we propose a dynamic gradient approximation method based on singular value decomposition to approximate the exchanged gradients with dynamic precision. Extensive experiments on benchmark datasets in different tasks show that our approach can effectively reduce the communication cost and achieve competitive results.
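To make the adaptive mutual distillation idea concrete, here is a minimal sketch for a single labeled sample. The abstract only states that distillation intensity is controlled by prediction quality; the specific weighting used here (dividing each KL term by the sum of the two task losses) and all function names are assumptions for illustration, not a definitive statement of the method.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Task loss for one sample: negative log-probability of the true class.
    return -math.log(max(probs[label], 1e-12))

def kl_divergence(p, q):
    # KL(p || q); q is clamped to avoid division by zero.
    return sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)

def adaptive_mutual_distillation(teacher_logits, student_logits, label):
    """Hypothetical per-sample losses for the teacher and the student.

    Assumption: each distillation term is scaled by 1 / (sum of task losses),
    so distillation is stronger when both models predict well and weaker
    when their predictions are unreliable.
    """
    p_teacher = softmax(teacher_logits)
    p_student = softmax(student_logits)
    task_teacher = cross_entropy(p_teacher, label)
    task_student = cross_entropy(p_student, label)
    scale = task_teacher + task_student + 1e-12
    # The teacher is pulled toward the student's distribution, and vice versa.
    distill_teacher = kl_divergence(p_student, p_teacher) / scale
    distill_student = kl_divergence(p_teacher, p_student) / scale
    return {
        "distill_teacher": distill_teacher,
        "distill_student": distill_student,
        "teacher_loss": task_teacher + distill_teacher,
        "student_loss": task_student + distill_student,
    }

if __name__ == "__main__":
    print(adaptive_mutual_distillation([3.0, 0.0], [1.0, 0.0], label=0))
```

In this sketch only the student model would be exchanged with the server, while the teacher stays on the client. With the same pair of logits, the distillation terms are larger when the label matches the confident prediction than when it does not, which is the quality-dependent behavior the abstract describes.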





Asynchronous Edge Learning using Cloned Knowledge Distillation

With the increasing demand for more and more data, the federated learnin...

Communication-Efficient Adaptive Federated Learning

Federated learning is a machine learning training paradigm that enables ...

Communication-Efficient Federated Distillation

Communication constraints are one of the major challenges preventing the...

Adaptive Distillation for Decentralized Learning from Heterogeneous Clients

This paper addresses the problem of decentralized learning to achieve a ...

Distilled One-Shot Federated Learning

Current federated learning algorithms take tens of communication rounds ...

FedSPLIT: One-Shot Federated Recommendation System Based on Non-negative Joint Matrix Factorization and Knowledge Distillation

Non-negative matrix factorization (NMF) with missing-value completion is...

Towards Model Agnostic Federated Learning Using Knowledge Distillation

An often unquestioned assumption underlying most current federated learn...