Restricted Local Differential Privacy for Distribution Estimation with High Data Utility
LDP (Local Differential Privacy) has recently attracted much attention as a privacy metric in the local model, in which each user obfuscates her own personal data by herself and the data collector estimates statistics of the personal data, such as the distribution underlying the data. LDP provides a privacy guarantee against adversaries with arbitrary background knowledge, and does not suffer from data leakage. However, it regards all personal data as equally sensitive, which can cause a loss of data utility. In this paper, we introduce the concept of RLDP (Restricted Local Differential Privacy), which provides a privacy guarantee equivalent to LDP only for sensitive data. We first consider the case in which all users use the same obfuscation mechanism, and propose two mechanisms providing RLDP: a restricted RR (Randomized Response) and a restricted RAPPOR. We then consider the case in which the mechanism differs from user to user, and propose a personalized restricted mechanism with semantic tags that conceals what is sensitive for each user while retaining high data utility. We theoretically analyze the data utility of our mechanisms, and prove that they provide much higher data utility than the existing mechanisms providing LDP. We also prove that our mechanisms provide almost the same data utility as a non-private mechanism that does not obfuscate the personal data when the privacy budget is epsilon = ln |X|, where X is the set of personal data. Finally, we show on two large-scale datasets that our mechanisms outperform the existing mechanisms by one to two orders of magnitude.
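To illustrate the core idea of restricting obfuscation to sensitive data, the following is a minimal sketch of a restricted randomized response. It assumes sensitive inputs are randomized over the full domain in the standard epsilon-LDP randomized-response fashion, while non-sensitive inputs pass through unchanged; the function name, signature, and this simplified treatment of non-sensitive data are illustrative assumptions, not the paper's exact mechanisms.

```python
import math
import random

def restricted_rr(x, domain, sensitive, epsilon):
    """Illustrative restricted randomized response (a sketch, not the
    paper's exact mechanism).

    Sensitive values are obfuscated via k-ary randomized response with
    privacy budget epsilon; non-sensitive values are, in this simplified
    sketch, reported truthfully.
    """
    if x not in sensitive:
        # Non-sensitive data need no LDP-level obfuscation in this sketch.
        return x
    k = len(domain)
    e = math.exp(epsilon)
    # Standard k-ary RR: report the true value with probability
    # e^epsilon / (e^epsilon + k - 1), otherwise a uniform other value.
    if random.random() < e / (e + k - 1):
        return x
    return random.choice([v for v in domain if v != x])
```

A non-sensitive input incurs no estimation error at all, which is the intuitive source of the utility gain over mechanisms that obfuscate every value.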