Data Station: Delegated, Trustworthy, and Auditable Computation to Enable Data-Sharing Consortia with a Data Escrow

05/05/2023
by Siyuan Xia, et al.

Pooling and sharing data increases and distributes its value. But since data cannot be revoked once shared, scenarios that require controlled release of data for regulatory, privacy, and legal reasons default to not sharing. Because selectively controlling what data to release is difficult, the few data-sharing consortia that exist are often built around data-sharing agreements resulting from long and tedious one-off negotiations. We introduce Data Station, a data escrow designed to enable the formation of data-sharing consortia. Data owners share data with the escrow knowing it will not be released without their consent. Data users delegate their computation to the escrow. The data escrow relies on delegated computation to execute queries without releasing the data first. Data Station leverages hardware enclaves to generate trust among participants, and exploits the centralization of data and computation to generate an audit log. We evaluate Data Station on machine learning and data-sharing applications while running on an untrusted intermediary. In addition to important qualitative advantages, we show that Data Station: i) outperforms federated learning baselines in accuracy and runtime for the machine learning application; ii) is orders of magnitude faster than alternative secure data-sharing frameworks; and iii) introduces small overhead on the critical path.
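To make the escrow idea concrete, the following is a minimal Python sketch of delegated computation with owner consent and an audit log. It is an illustration under stated assumptions, not Data Station's actual interface: the class `ToyEscrow`, its methods `register`, `grant`, and `run`, the policy format, and the hash-chained log are all hypothetical names and design choices introduced here. Hardware enclaves are not modeled at all.

```python
import hashlib
import json


class ToyEscrow:
    """Hypothetical in-memory escrow; NOT the Data Station API."""

    def __init__(self):
        self._data = {}          # dataset name -> (owner, rows)
        self._policies = set()   # (dataset name, user, function name)
        self.audit_log = []      # hash-chained entries (an assumed design)

    def _audit(self, event):
        # Chain each entry to the previous hash so tampering is detectable.
        prev = self.audit_log[-1]["hash"] if self.audit_log else ""
        body = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self.audit_log.append({"event": event, "hash": digest})

    def register(self, owner, name, rows):
        # The owner shares data with the escrow; nothing is released yet.
        self._data[name] = (owner, rows)
        self._audit({"op": "register", "owner": owner, "dataset": name})

    def grant(self, owner, name, user, func_name):
        # Only the owner can consent to a given user running a given function.
        if self._data[name][0] != owner:
            raise PermissionError("only the owner can grant access")
        self._policies.add((name, user, func_name))
        self._audit({"op": "grant", "owner": owner, "dataset": name,
                     "user": user, "function": func_name})

    def run(self, user, func, names):
        # Delegated computation: the function runs inside the escrow and
        # only its result leaves. Identifying a function by its name is a
        # simplification made for this sketch.
        allowed = all((n, user, func.__name__) in self._policies
                      for n in names)
        self._audit({"op": "run", "user": user, "function": func.__name__,
                     "datasets": list(names), "allowed": allowed})
        if not allowed:
            raise PermissionError("missing owner consent")
        return func([self._data[n][1] for n in names])


def pooled_mean(tables):
    """Example user computation: mean over all pooled rows."""
    values = [v for rows in tables for v in rows]
    return sum(values) / len(values)


escrow = ToyEscrow()
escrow.register("hospital_a", "a_readings", [1.0, 2.0, 3.0])
escrow.register("hospital_b", "b_readings", [5.0])
escrow.grant("hospital_a", "a_readings", "analyst", "pooled_mean")
escrow.grant("hospital_b", "b_readings", "analyst", "pooled_mean")
result = escrow.run("analyst", pooled_mean, ["a_readings", "b_readings"])
print(result)  # 2.75
```

In this sketch the analyst receives only the pooled mean, never the raw rows, and every registration, grant, and run attempt (including denied ones) is appended to the log.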

research
11/13/2020

Federated Learning System without Model Sharing through Integration of Dimensional Reduced Data Representations

Dimensionality Reduction is a commonly used element in a machine learnin...
research
12/07/2018

A Hybrid Approach to Privacy-Preserving Federated Learning

Training machine learning models often requires data from multiple parti...
research
03/31/2023

Benchmarking FedAvg and FedCurv for Image Classification Tasks

Classic Machine Learning techniques require training on data available i...
research
01/19/2022

Towards Energy Efficient Distributed Federated Learning for 6G Networks

The provision of communication services via portable and mobile devices,...
research
12/01/2020

MYSTIKO: Cloud-Mediated, Private, Federated Gradient Descent

Federated learning enables multiple, distributed participants (potential...
research
02/28/2018

Machine learning and genomics: precision medicine vs. patient privacy

Machine learning can have major societal impact in computational biology...
research
02/13/2018

DataBright: Towards a Global Exchange for Decentralized Data Ownership and Trusted Computation

It is safe to assume that, for the foreseeable future, machine learning,...
