Data Appraisal Without Data Sharing

12/11/2020
by Mimee Xu, et al.

One of the most effective approaches to improving the performance of a machine-learning model is to acquire additional training data. To do so, a model owner may seek to acquire relevant training data from a data owner. Before procuring the data, the model owner needs to appraise the data. However, the data owner generally does not want to share the data until after an agreement is reached. The resulting Catch-22 prevents efficient data markets from forming. To address this problem, we develop data appraisal methods that do not require data sharing by using secure multi-party computation. Specifically, we study methods that: (1) compute parameter gradient norms, (2) perform model fine-tuning, and (3) compute influence functions. Our experiments show that influence functions provide an appealing trade-off between high-quality appraisal and required computation.
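
To make the first of the three appraisal signals concrete, here is a minimal plaintext sketch of a gradient-norm score. It rests on assumptions not stated in the abstract: the model is a logistic-regression classifier, the score is the L2 norm of the mean loss gradient over the candidate data at the model owner's current weights, and the function names are hypothetical. It does not use secure multi-party computation; in the setting described above, the same arithmetic would run on secret-shared values so that neither party sees the other's data or parameters.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def mean_logistic_gradient(weights, features, labels):
    """Mean gradient of the logistic loss over a dataset, w.r.t. the weights.

    Assumption: a bias-free logistic-regression model, used only for illustration.
    """
    grad = [0.0] * len(weights)
    for x, y in zip(features, labels):
        # Predicted probability of the positive class for this example.
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj
    return [g / len(features) for g in grad]


def gradient_norm_appraisal(weights, features, labels):
    """Hypothetical appraisal score: L2 norm of the mean gradient on candidate data.

    A larger norm suggests the candidate data would move the current model more.
    """
    grad = mean_logistic_gradient(weights, features, labels)
    return math.sqrt(sum(g * g for g in grad))
```

Under these assumptions, a candidate dataset whose examples pull the weights in opposing directions receives a low score, while a dataset the current model consistently mispredicts receives a high one. Fine-tuning and influence functions, the other two methods named above, would replace this score with a measured or estimated change in test loss.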

research
07/22/2022

Spatial data sharing with secure multi-party computation for exploratory spatial data analysis

Spatial data sharing plays a significant role in opening data research a...
research
11/11/2022

Striving for data-model efficiency: Identifying data externalities on group performance

Building trustworthy, effective, and responsible machine learning system...
research
01/08/2019

Contamination Attacks and Mitigation in Multi-Party Machine Learning

Machine learning is data hungry; the more data a model has access to in ...
research
12/23/2020

The structure of behavioral data

For more than a century, scientists have been collecting behavioral data...
research
07/02/2019

Secure Computation in Decentralized Data Markets

Decentralized data markets gather data from many contributors to create ...
research
09/12/2017

Interpreting Shared Deep Learning Models via Explicable Boundary Trees

Despite outperforming the human in many tasks, deep neural network model...
research
05/07/2022

Quantifying and Extrapolating Data Needs in Radio Frequency Machine Learning

Understanding the relationship between training data and a model's perfo...
