Striking a Balance: An Optimal Mechanism Design for Heterogenous Differentially Private Data Acquisition for Logistic Regression

09/19/2023
by Ameya Anjarlekar, et al.

We investigate the problem of performing logistic regression on data collected from privacy-sensitive sellers. Since the data is private, sellers must be incentivized through payments to provide their data. Thus, the goal is to design a mechanism that optimizes a weighted combination of test loss, seller privacy, and payment, i.e., strikes a balance between multiple objectives of interest. We solve the problem by combining ideas from game theory, statistical learning theory, and differential privacy. The buyer's objective function can be highly non-convex. However, we show that, under certain conditions on the problem parameters, the problem can be convexified by using a change of variables. We also provide asymptotic results characterizing the buyer's test error and payments when the number of sellers becomes large. Finally, we demonstrate our ideas by applying them to a real healthcare data set.
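
For reference, the heterogeneous privacy guarantee the abstract alludes to is commonly formalized by giving each seller an individual privacy level. The statement below is a sketch of that standard definition, not the paper's own formulation: the symbols $\mathcal{M}$ (the buyer's randomized mechanism), $\varepsilon_i$ (seller $i$'s privacy level), $D, D'$ (datasets differing only in seller $i$'s record), and $n$ (number of sellers) are notation assumed here for illustration.

```latex
% Assumed notation (not taken from the paper):
%   \mathcal{M}      randomized mechanism run by the buyer
%   \varepsilon_i    privacy level of seller i (smaller = stronger privacy)
%   D, D'            datasets that differ only in seller i's record
\Pr\bigl[\mathcal{M}(D) \in S\bigr]
  \;\le\; e^{\varepsilon_i}\,\Pr\bigl[\mathcal{M}(D') \in S\bigr]
\quad \text{for all measurable } S \text{ and all } i \in \{1, \dots, n\}.
```

Under this reading, a seller with a small $\varepsilon_i$ receives a strong guarantee, and setting all $\varepsilon_i$ equal recovers ordinary $\varepsilon$-differential privacy.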

research
01/10/2022

Optimal and Differentially Private Data Acquisition: Central and Local Mechanisms

We consider a platform's problem of collecting data from privacy sensiti...
research
07/30/2014

Differentially-Private Logistic Regression for Detecting Multiple-SNP Association in GWAS Databases

Following the publication of an attack on genome-wide association studie...
research
08/02/2019

Differential Privacy for Sparse Classification Learning

In this paper, we present a differential privacy version of convex and n...
research
07/25/2023

Accuracy Amplification in Differentially Private Logistic Regression: A Pre-Training Approach

Machine learning (ML) models can memorize training datasets. As a result...
research
05/22/2020

Secure and Differentially Private Bayesian Learning on Distributed Data

Data integration and sharing maximally enhance the potential for novel a...
research
01/04/2017

Private Incremental Regression

Data is continuously generated by modern data sources, and a recent chal...
research
05/28/2021

How much telematics information do insurers need for claim classification?

It has been shown several times in the literature that telematics data c...
