Privacy-preserving Inference of Group Mean Difference in Zero-inflated Right Skewed Data with Partitioning and Censoring

04/10/2023
by Fang Liu, et al.

We examine privacy-preserving inference of group mean differences in zero-inflated right-skewed (zirs) data. Zero inflation and right skewness are typical of ad click and purchase data collected from e-commerce and social media platforms, where user privacy must also be preserved so that individual data is protected. In this work, we develop likelihood-based and model-free approaches to analyzing zirs data with formal privacy guarantees. We first apply partitioning and censoring (PAC) to “regularize” the zirs data, yielding the PAC data. We expect inferences based on the PAC data to have better inferential properties and more robust privacy considerations than inferences based on the raw data directly. We conduct theoretical analysis to establish the MSE consistency of the privacy-preserving estimators obtained from the proposed approaches on the PAC data, and we examine their rate of convergence in the number of partitions and the privacy loss parameters. The theoretical results also suggest that the sampling error of the PAC data, rather than the sanitization error, is the limiting factor in the convergence rate. We conduct extensive simulation studies to compare the inferential utility of the proposed approaches across different types of zirs data, sample size and partition size combinations, censoring scenarios, mean differences, privacy budgets, and privacy loss composition schemes. We also apply the methods to obtain privacy-preserving inference for the group mean difference in a real digital ads click-through data set. Based on the theoretical and empirical results, we make recommendations on the use of these methods in practice.
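
To make the partition-then-censor idea concrete, the following is a minimal sketch and not the estimators developed in the paper. It assumes a model-free variant in which each group is randomly split into partitions, each partition mean is censored to [0, cap], and the average of the censored partition means is released with Laplace noise. The function names, the censoring bound `cap`, and the choice of the Laplace mechanism are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def pac_private_mean(x, n_partitions, cap, epsilon, rng):
    """Hypothetical sketch: partition, censor, then add Laplace noise."""
    x = np.asarray(x, dtype=float)
    if n_partitions < 1 or n_partitions > x.size:
        raise ValueError("n_partitions must be between 1 and len(x)")

    # Partitioning: assign records to partitions at random, independently
    # of their values.
    parts = np.array_split(rng.permutation(x), n_partitions)
    part_means = np.array([p.mean() for p in parts])

    # Censoring: bound every partition mean to [0, cap].
    censored = np.clip(part_means, 0.0, cap)

    # Replacing one record changes one censored partition mean by at most
    # cap, so the average over partitions changes by at most cap / P.
    sensitivity = cap / n_partitions
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return censored.mean() + noise


def private_mean_difference(x_a, x_b, n_partitions, cap, epsilon, rng):
    """Difference of two sanitized group means (assumed design).

    Each group mean is released with budget epsilon; how the two releases
    compose depends on the privacy loss composition scheme adopted.
    """
    mean_a = pac_private_mean(x_a, n_partitions, cap, epsilon, rng)
    mean_b = pac_private_mean(x_b, n_partitions, cap, epsilon, rng)
    return mean_a - mean_b


rng = np.random.default_rng(0)
# Toy zero-inflated, right-skewed data: mostly zeros, a few large values.
group_a = np.where(rng.random(2000) < 0.9, 0.0, rng.lognormal(1.0, 1.0, 2000))
group_b = np.where(rng.random(2000) < 0.9, 0.0, rng.lognormal(0.5, 1.0, 2000))
estimate = private_mean_difference(group_a, group_b, n_partitions=100,
                                   cap=5.0, epsilon=1.0, rng=rng)
```

In this sketch the noise scale shrinks as the number of partitions grows, while a smaller `cap` lowers the noise at the cost of more censoring bias; the paper's own analysis should be consulted for how these quantities are actually chosen.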
