Differentially Private Linear Regression with Linked Data

08/01/2023
by   Shurong Lin, et al.

There has been increasing demand for establishing privacy-preserving methodologies for modern statistics and machine learning. Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees. Recent work focuses primarily on developing differentially private versions of individual statistical and machine learning tasks, with nontrivial upstream pre-processing typically not incorporated. An important example is when record linkage is done prior to downstream modeling. Record linkage refers to the statistical task of linking two or more data sets of the same group of entities without a unique identifier. This probabilistic procedure brings additional uncertainty to the subsequent task. In this paper, we present two differentially private algorithms for linear regression with linked data. In particular, we propose a noisy gradient method and a sufficient statistics perturbation approach for the estimation of regression coefficients. We investigate the privacy-accuracy tradeoff by providing finite-sample error bounds for the estimators, which allows us to understand the relative contributions of linkage error, estimation error, and the cost of privacy. The variances of the estimators are also discussed. We demonstrate the performance of the proposed algorithms through simulations and an application to synthetic data.
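
To make the idea of sufficient statistics perturbation concrete, the following is a minimal generic sketch, not the paper's algorithm: it ignores linkage error entirely and only shows the basic mechanism of adding Gaussian noise to X^T X and X^T y before solving the normal equations. The function names (`gaussian_sigma`, `solve`, `ssp_regression`), the `ridge` term, the even split of the privacy budget, and the bounds (each row of X has Euclidean norm at most 1, each |y| at most 1, neighboring data sets differ by adding or removing one record) are all assumptions made for illustration. The noise scale uses the classical Gaussian mechanism calibration, which is stated for epsilon below 1.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Classical Gaussian-mechanism noise scale (assumes 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ssp_regression(X, y, epsilon, delta, ridge=1e-3, rng=None):
    """Hypothetical sketch: perturb X^T X and X^T y, then solve.

    Assumes ||x_i||_2 <= 1 and |y_i| <= 1, so one record changes each
    statistic by at most 1 in L2/Frobenius norm (an illustrative bound).
    """
    rng = rng or random.Random()
    d = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(d)]
    # Half of the (epsilon, delta) budget goes to each statistic.
    sigma = gaussian_sigma(1.0, epsilon / 2.0, delta / 2.0)
    for i in range(d):
        for j in range(i, d):
            xtx[i][j] += rng.gauss(0.0, sigma)
            xtx[j][i] = xtx[i][j]  # keep the noisy matrix symmetric
        xtx[i][i] += ridge  # small ridge term to stabilize the solve
        xty[i] += rng.gauss(0.0, sigma)
    return solve(xtx, xty)


if __name__ == "__main__":
    data_rng = random.Random(0)
    X = [[data_rng.uniform(-0.7, 0.7), data_rng.uniform(-0.7, 0.7)]
         for _ in range(500)]
    y = [0.5 * a - 0.3 * b for a, b in X]
    print(ssp_regression(X, y, epsilon=0.5, delta=1e-5, rng=random.Random(1)))
```

With a small epsilon the injected noise can dominate the statistics unless the sample is large, which is one simple way to see the privacy-accuracy tradeoff the abstract refers to. A real implementation would also need to enforce the assumed bounds (for example by clipping) and use a noise source suitable for privacy applications.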

research
09/19/2023

DPpack: An R Package for Differentially Private Statistical Analysis and Machine Learning

Differential privacy (DP) is the state-of-the-art framework for guarante...
research
11/22/2019

Privacy-preserving parametric inference: a case for robust statistics

Differential privacy is a cryptographically-motivated approach to privac...
research
03/07/2018

Revisiting differentially private linear regression: optimal and adaptive prediction & estimation in unbounded domain

We revisit the problem of linear regression under a differential privacy...
research
08/20/2018

An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices

Statistical agencies face a dual mandate to publish accurate statistics ...
research
07/10/2020

Differentially Private Simple Linear Regression

Economics and social science research often require analyzing datasets o...
research
08/15/2022

Easy Differentially Private Linear Regression

Linear regression is a fundamental tool for statistical analysis. This h...
research
03/07/2023

PRIMO: Private Regression in Multiple Outcomes

We introduce a new differentially private regression setting we call Pri...
