DeepAI AI Chat
Log In Sign Up

Castell: Scalable Joint Probability Estimation of Multi-dimensional Data Randomized with Local Differential Privacy

by   Hiroaki Kikuchi, et al.
Generalitat de Catalunya
Meiji University
Universitat Rovira i Virgili

Performing randomized response (RR) over multi-dimensional data is subject to the curse of dimensionality. As the number of attributes increases, the exponential growth in the number of attribute-value combinations greatly impacts the computational cost and the accuracy of the RR estimates. In this paper, we propose a new multi-dimensional RR scheme that randomizes all attributes independently, and then aggregates these randomization matrices into a single aggregated matrix. The multi-dimensional joint probability distributions are then estimated. The inverse matrix of the aggregated randomization matrix can be computed efficiently at a lightweight computation cost (i.e., linear with respect to dimensionality) and with manageable storage requirements. To overcome the limitation of accuracy, we propose two extensions to the baseline protocol, called hybrid and truncated schemes. Finally, we have conducted experiments using synthetic and major open-source datasets for various numbers of attributes, domain sizes, and numbers of respondents. The results using UCI Adult dataset give average distances between the estimated and the real (2 through 6-way) joint probability are 0.0099 for truncated and 0.0155 for hybrid schemes, whereas they are 0.03 and 0.04 for LoPub, which is the state-of-the-art multi-dimensional LDP scheme.


Multi-Dimensional Randomized Response

In our data world, a host of not necessarily trusted controllers gather ...

Answering Multi-Dimensional Range Queries under Local Differential Privacy

In this paper, we tackle the problem of answering multi-dimensional rang...

Modeling Multi-Dimensional Datasets via a Fast Scale-Free Network Model

Compared with network datasets, multi-dimensional data are much more com...

User Evaluation of a Multi-dimensional Statistical Dialogue System

We present the first complete spoken dialogue system driven by a multi-d...

Task-aware Privacy Preservation for Multi-dimensional Data

Local differential privacy (LDP), a state-of-the-art technique for priva...

Local intrinsic dimensionality estimators based on concentration of measure

Intrinsic dimensionality (ID) is one of the most fundamental characteris...