Castell: Scalable Joint Probability Estimation of Multi-dimensional Data Randomized with Local Differential Privacy

12/03/2022
by Hiroaki Kikuchi, et al.

Performing randomized response (RR) over multi-dimensional data is subject to the curse of dimensionality. As the number of attributes increases, the number of attribute-value combinations grows exponentially, which greatly increases the computational cost and degrades the accuracy of the RR estimates. In this paper, we propose a new multi-dimensional RR scheme that randomizes all attributes independently and then aggregates the per-attribute randomization matrices into a single aggregated matrix, from which the multi-dimensional joint probability distributions are estimated. The inverse of the aggregated randomization matrix can be computed efficiently, at a computation cost that is linear with respect to dimensionality and with manageable storage requirements. To overcome the limited accuracy of this baseline protocol, we propose two extensions, called the hybrid and truncated schemes. Finally, we conduct experiments on synthetic and major open-source datasets for various numbers of attributes, domain sizes, and numbers of respondents. On the UCI Adult dataset, the average distance between the estimated and the real (2- through 6-way) joint probabilities is 0.0099 for the truncated scheme and 0.0155 for the hybrid scheme, compared with 0.03 and 0.04 for LoPub, the state-of-the-art multi-dimensional LDP scheme.
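The following is a minimal sketch of the baseline idea, not the authors' implementation. It assumes that each attribute uses standard k-ary randomized response and that the aggregated matrix is the Kronecker product of the per-attribute matrices; the abstract states neither. Under that assumption, the inverse of the aggregated matrix is the Kronecker product of the small per-attribute inverses, so it can be applied one attribute at a time without ever building the full matrix. The names rr_matrix, randomize, invert_joint, and estimate_joint are hypothetical, and eps is treated as a per-attribute privacy parameter for simplicity.

```python
import numpy as np

def rr_matrix(k, eps):
    """k-ary randomized response matrix: P[i, j] = Pr[report i | true value j]."""
    p = np.exp(eps) / (np.exp(eps) + k - 1)   # probability of keeping the true value
    q = (1.0 - p) / (k - 1)                   # probability of each other value
    P = np.full((k, k), q)
    np.fill_diagonal(P, p)
    return P

def randomize(data, domains, eps, rng):
    """Randomize every attribute (column) independently of the others."""
    n = len(data)
    out = np.empty_like(data)
    for a, k in enumerate(domains):
        keep = rng.random(n) < rr_matrix(k, eps)[0, 0]
        shift = rng.integers(1, k, size=n)    # uniform over the k - 1 other values
        out[:, a] = np.where(keep, data[:, a], (data[:, a] + shift) % k)
    return out

def invert_joint(observed, eps):
    """Apply each attribute's inverse matrix along its own axis.

    Equivalent to multiplying the flattened tensor by the inverse of the
    Kronecker product, but only small k x k matrices are ever inverted.
    """
    est = observed
    for a, k in enumerate(observed.shape):
        P_inv = np.linalg.inv(rr_matrix(k, eps))
        est = np.moveaxis(np.tensordot(P_inv, est, axes=(1, a)), 0, a)
    return est

def estimate_joint(reports, domains, eps):
    """Estimate the joint distribution from randomized integer reports."""
    counts = np.zeros(domains)
    np.add.at(counts, tuple(reports.T), 1)    # empirical joint histogram
    return invert_joint(counts / len(reports), eps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    domains = (2, 3, 4)
    data = np.column_stack([rng.integers(0, k, size=50_000) for k in domains])
    reports = randomize(data, domains, eps=2.0, rng=rng)
    est = estimate_joint(reports, domains, eps=2.0)
    print(est.shape, est.sum())
```

The estimate sums to one but individual cells can be negative when the number of respondents is small relative to the number of cells, which is one illustration of the accuracy limitation the abstract mentions.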

research
10/21/2020

Multi-Dimensional Randomized Response

In our data world, a host of not necessarily trusted controllers gather ...
research
09/14/2020

Answering Multi-Dimensional Range Queries under Local Differential Privacy

In this paper, we tackle the problem of answering multi-dimensional rang...
research
11/05/2022

Modeling Multi-Dimensional Datasets via a Fast Scale-Free Network Model

Compared with network datasets, multi-dimensional data are much more com...
research
12/18/2021

The Kolmogorov Superposition Theorem can Break the Curse of Dimensionality When Approximating High Dimensional Functions

We explain how to use Kolmogorov's Superposition Theorem (KST) to overco...
research
09/06/2019

User Evaluation of a Multi-dimensional Statistical Dialogue System

We present the first complete spoken dialogue system driven by a multi-d...
research
10/05/2021

Task-aware Privacy Preservation for Multi-dimensional Data

Local differential privacy (LDP), a state-of-the-art technique for priva...
research
05/02/2019

SUMMARIZED: Efficient Framework for Analyzing Multidimensional Process Traces under Edit-distance Constraint

Domains such as scientific workflows and business processes exhibit data...
