Secure Machine Learning over Relational Data

by Qiyao Luo, et al.

Closer integration of machine learning and relational databases has gained momentum in recent years, because the training data for many ML tasks is the result of a relational query (most often, a join-select query). In a federated setting, this poses an additional challenge: the tables are held by different parties as their private data, and the parties would like to train the model without relying on a trusted third party. Existing work has only considered the case where the training data is stored in a flat table that has been vertically partitioned, which corresponds to a simple PK-PK join. In this paper, we describe secure protocols that compute the join results of multiple tables conforming to a general foreign-key acyclic schema, and show how to feed the results in secret-shared form to a secure ML toolbox. Furthermore, existing secure ML systems reveal the PKs in the join results. We strengthen the privacy protection and achieve zero information leakage beyond the trained model. If the model itself is considered sensitive, we show how differential privacy can be incorporated into our framework so that the model does not breach individuals' privacy either.
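To make the data flow in the abstract concrete, the following is a minimal plaintext sketch, not the paper's protocol. It joins a child table to a parent table over a foreign key and then splits every joined value into two additive secret shares, the form in which a secure ML toolbox would typically consume its input. The table contents, the names `fk_join`, `share`, and `reconstruct`, and the ring size `MODULUS` are all assumptions made for illustration. In an actual secure protocol the join itself would be computed without any party seeing the joined rows in the clear.

```python
import secrets

# Ring size for additive sharing; 2**32 is an assumption for this sketch.
MODULUS = 2**32


def share(value, modulus=MODULUS):
    """Split an integer into two additive shares that sum to value mod modulus."""
    s0 = secrets.randbelow(modulus)
    s1 = (value - s0) % modulus
    return s0, s1


def reconstruct(s0, s1, modulus=MODULUS):
    """Recombine two additive shares into the original value."""
    return (s0 + s1) % modulus


def fk_join(child_rows, parent_rows):
    """Plaintext foreign-key join: each child row picks up its parent's features."""
    parents = {pk: feats for pk, feats in parent_rows}
    return [feats + parents[fk] for fk, feats in child_rows if fk in parents]


# Hypothetical tables held by two different parties.
customers = [(1, [30]), (2, [45])]              # (customer_id as PK, [age])
orders = [(1, [120]), (1, [80]), (2, [200])]    # (customer_id as FK, [amount])

# Unlike a PK-PK join, one parent row can match several child rows.
joined = fk_join(orders, customers)

# Each party ends up with one share of every value in the join result.
shared = [[share(v) for v in row] for row in joined]
party0 = [[s[0] for s in row] for row in shared]
party1 = [[s[1] for s in row] for row in shared]

# Only the two share sets together recover the training data.
recovered = [
    [reconstruct(a, b) for a, b in zip(r0, r1)]
    for r0, r1 in zip(party0, party1)
]
```

Each share on its own is uniformly random in the ring, so neither `party0` nor `party1` alone reveals anything about the joined rows. Note that this sketch still exposes the number of joined rows, which a protocol aiming for zero leakage beyond the trained model would also have to account for.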




