PrivFL: Practical Privacy-preserving Federated Regressions on High-dimensional Data over Mobile Networks
Federated Learning (FL) enables a large number of users to jointly learn a shared machine learning (ML) model, coordinated by a centralized server, where the data is distributed across multiple devices. This approach enables the server or users to train and learn an ML model using gradient descent, while keeping all the training data on users' devices. We consider training an ML model over a mobile network where user dropout is a common phenomenon. Although federated learning was aimed at reducing data privacy risks, the ML model privacy has not received much attention. In this work, we present PrivFL, a privacy-preserving system for training (predictive) linear and logistic regression models and oblivious predictions in the federated setting, while guaranteeing data and model privacy as well as ensuring robustness to users dropping out in the network. We design two privacy-preserving protocols for training linear and logistic regression models based on an additive homomorphic encryption (HE) scheme and an aggregation protocol. Exploiting the training algorithm of federated learning, at the core of our training protocols is a secure multiparty global gradient computation on alive users' data. We analyze the security of our training protocols against semi-honest adversaries. As long as the aggregation protocol is secure under the aggregation privacy game and the additive HE scheme is semantically secure, PrivFL guarantees the users' data privacy against the server, and the server's regression model privacy against the users. We demonstrate the performance of PrivFL on real-world datasets and show its applicability in the federated learning system.
READ FULL TEXT