Information-Theoretic Bounds on the Generalization Error and Privacy Leakage in Federated Learning

by Semih Yagli, et al.

Machine learning algorithms operating on mobile networks fall into three categories. The first is the classical setting, in which end-user devices send their data to a central server, where the data is used to train a model. The second is the distributed setting, in which each device trains its own model and sends its model parameters to a central server, where the parameters are aggregated into one final model. The third is the federated learning setting, in which, at any given time t, a certain number of active end users train on their own local data along with feedback provided by the central server and then send their newly estimated model parameters to the central server. The server then aggregates these new parameters, updates its own model, and feeds the updated parameters back to all end users; the process continues until it converges. The main objective of this work is to provide an information-theoretic framework for all of the aforementioned learning paradigms. Moreover, using this framework, we develop upper and lower bounds on the generalization error, together with bounds on the privacy leakage, in the classical, distributed, and federated learning settings.

Keywords: Federated Learning, Distributed Learning, Machine Learning, Model Aggregation.
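
The federated setting described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar model y = w * x, the squared-error loss, the sample-count weighting in the aggregation step, the fixed number of rounds standing in for "until it converges", and all function names (`local_update`, `federated_round`, `make_clients`, `run`). The server's "feedback" is modeled simply as its current parameter value.

```python
import random


def local_update(w, data, lr=0.1, steps=5):
    """Run a few gradient-descent steps on one client's local data.

    Assumed model: y = w * x with mean squared-error loss.
    """
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_server, clients, num_active, rng):
    """One round: pick active clients, train locally, aggregate on the server."""
    active = rng.sample(clients, num_active)
    # Each active client starts from the server's current parameter (the feedback).
    updates = [(local_update(w_server, data), len(data)) for data in active]
    total = sum(n for _, n in updates)
    # Assumed aggregation rule: average weighted by local sample count.
    return sum(w * n for w, n in updates) / total


def make_clients(num_clients, samples_per_client, true_w, rng):
    """Generate synthetic local datasets; raw data never leaves a client."""
    clients = []
    for _ in range(num_clients):
        data = []
        for _ in range(samples_per_client):
            x = rng.uniform(-1.0, 1.0)
            y = true_w * x + rng.gauss(0.0, 0.01)
            data.append((x, y))
        clients.append(data)
    return clients


def run(rounds=50, seed=0):
    """Repeat federated rounds for a fixed budget and return the server model."""
    rng = random.Random(seed)
    clients = make_clients(10, 20, 3.0, rng)
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, clients, 4, rng)
    return w


if __name__ == "__main__":
    print(run())
```

In this sketch only model parameters travel between the clients and the server, which is the feature that separates the federated setting from the classical one. Setting `num_active` to the number of clients and running a single round with many local steps would roughly mimic the distributed setting instead.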


FLHub: a Federated Learning model sharing service

As easy-to-use deep learning libraries such as TensorFlow and PyTorch ar...

CPFed: Communication-Efficient and Privacy-Preserving Federated Learning

Federated learning is a machine learning setting where a set of edge dev...

Federated Learning in ASR: Not as Easy as You Think

With the growing availability of smart devices and cloud services, perso...

Federated Two-stage Learning with Sign-based Voting

Federated learning is a distributed machine learning mechanism where loc...

Applied Federated Learning: Architectural Design for Robust and Efficient Learning in Privacy Aware Settings

The classical machine learning paradigm requires the aggregation of user...

Matrix Sketching for Secure Collaborative Machine Learning

Collaborative machine learning (ML), also known as federated ML, allows ...

Rate-Distortion Theoretic Bounds on Generalization Error for Distributed Learning

In this paper, we use tools from rate-distortion theory to establish new...