SecureBoost: A Lossless Federated Learning Framework

01/25/2019
by   Kewei Cheng, et al.

The protection of user privacy is an important concern in machine learning, as evidenced by the rollout of the General Data Protection Regulation (GDPR) in the European Union (EU) in May 2018. The GDPR is designed to give users more control over their personal data, which motivates us to explore machine learning frameworks that share data without violating user privacy. To meet this goal, we propose SecureBoost, a novel lossless privacy-preserving tree-boosting system in the setting of federated learning. This federated-learning system allows a learning process to be conducted jointly over multiple parties that have partially common user samples but different feature sets, which corresponds to a vertically partitioned virtual data set. An advantage of SecureBoost is that it provides the same level of accuracy as the non-privacy-preserving approach while revealing no information about each private data provider. We theoretically prove that the SecureBoost framework is as accurate as other non-federated gradient tree-boosting algorithms that bring the data into one place. In addition, along with a proof of security, we discuss what would be required to make the protocols completely secure.
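
To make the phrase "vertically partitioned virtual data set" concrete, here is a minimal Python sketch using made-up toy data. The party names, user IDs, feature names, and helper functions are all hypothetical and do not come from the paper. The sketch only shows the data layout: two parties share some users but hold different feature columns. It intersects IDs and joins rows in the clear, which a privacy-preserving system such as SecureBoost would not do; in the federated setting the joined table is never brought into one place.

```python
# Hypothetical toy data: each party holds different features for its own users.
party_a = {  # assumed to hold one feature and the label
    "u1": {"age": 34, "label": 1},
    "u2": {"age": 51, "label": 0},
    "u3": {"age": 27, "label": 1},
}
party_b = {  # assumed to hold a different feature, no label
    "u2": {"income": 48000},
    "u3": {"income": 61000},
    "u4": {"income": 39000},
}


def common_users(*parties):
    """Return the sorted user IDs present at every party.

    Done in the clear here for illustration only; a real system would
    need a privacy-preserving alignment step instead.
    """
    ids = set(parties[0])
    for party in parties[1:]:
        ids &= set(party)
    return sorted(ids)


def virtual_rows(ids, *parties):
    """Join each party's feature columns per shared ID.

    This materializes the 'virtual' table purely to show its shape.
    """
    rows = {}
    for uid in ids:
        row = {}
        for party in parties:
            row.update(party[uid])
        rows[uid] = row
    return rows


shared = common_users(party_a, party_b)
table = virtual_rows(shared, party_a, party_b)
```

Only users `u2` and `u3` appear at both parties, so the virtual data set has two rows, each combining the columns from party A with those from party B.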


research
07/24/2019

Boosting Privately: Privacy-Preserving Federated Extreme Boosting for Mobile Crowdsensing

The state-of-the-art federated learning brings a new direction for the d...
research
03/24/2020

Learn to Forget: User-Level Memorization Elimination in Federated Learning

Federated learning is a decentralized machine learning technique that ev...
research
07/14/2020

Privacy Preserving Text Recognition with Gradient-Boosting for Federated Learning

Typical machine learning approaches require centralized data for model t...
research
06/30/2022

Privacy-preserving Graph Analytics: Secure Generation and Federated Learning

Directly motivated by security-related applications from the Homeland Se...
research
04/01/2020

Beyond privacy regulations: an ethical approach to data usage in transportation

With the exponential advancement of business technology in recent years,...
research
02/06/2020

Privacy-Preserving Boosting in the Local Setting

In machine learning, boosting is one of the most popular methods that de...
research
05/24/2019

Federated Forest

Most real-world data are scattered across different companies or governm...
