DPXPlain: Privately Explaining Aggregate Query Answers

09/02/2022
by   Yuchao Tao, et al.
Differential privacy (DP) is the state-of-the-art, rigorous notion of privacy for answering aggregate database queries while protecting sensitive information in the data. In today's era of data analysis, however, DP makes it harder for users to understand the trends and anomalies they observe in query results: is an unexpected answer due to the data itself, or to the noise that must be added to preserve DP? If the noise is responsible, even the observation the user made on the query results may be wrong. If the data is responsible, can we still mine interesting explanations from the sensitive data while protecting its privacy? To address these challenges, we present DPXPlain, a three-phase framework that is, to the best of our knowledge, the first system for explaining group-by aggregate query answers with DP. In its three phases, DPXPlain (a) answers a group-by aggregate query with DP, (b) lets users compare the aggregate values of two groups and assesses, with high probability, whether this comparison holds or is flipped by the DP noise, and (c) finally provides an explanation table containing the approximately "top-k" explanation predicates along with their relative influences and ranks in the form of confidence intervals, while guaranteeing DP in all steps. We perform an extensive experimental analysis of DPXPlain with multiple use cases on real and synthetic data, showing that DPXPlain efficiently provides insightful explanations with good accuracy and utility.
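To make phases (a) and (b) concrete, here is a minimal sketch in Python. It is not DPXPlain's actual algorithm: it assumes a COUNT(*) group-by query over a publicly known list of groups, the Laplace mechanism with sensitivity 1, and a conservative union-bound margin for the comparison. All function names and parameters (`noisy_group_counts`, `comparison_margin`, `assess_comparison`, `beta`) are hypothetical and chosen for illustration only; phase (c) is not covered.

```python
import math
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_group_counts(rows, group_key, groups, epsilon, rng):
    """Phase (a) sketch: COUNT(*) GROUP BY group_key with Laplace noise.

    Assumption: `groups` is a public, data-independent domain. Adding or
    removing one row changes exactly one count by 1, so Laplace noise of
    scale 1/epsilon on every count gives epsilon-DP for the whole answer.
    """
    scale = 1.0 / epsilon
    true_counts = Counter(row[group_key] for row in rows)
    return {g: true_counts.get(g, 0) + laplace_noise(scale, rng) for g in groups}


def comparison_margin(epsilon, beta):
    """Bound on the noise in the difference of two noisy counts.

    For Laplace noise with scale b, P(|noise| > t) = exp(-t / b). Taking
    t = b * ln(2 / beta) for each count and applying a union bound, the
    noise in the difference is at most 2 * t with probability >= 1 - beta.
    """
    scale = 1.0 / epsilon
    return 2.0 * scale * math.log(2.0 / beta)


def assess_comparison(noisy, g1, g2, epsilon, beta):
    """Phase (b) sketch: does count(g1) > count(g2) survive the noise?

    Uses only the already-released noisy answers, so it spends no extra
    privacy budget. Returns the noisy difference, an interval covering the
    true difference with probability >= 1 - beta, and whether that
    interval lies entirely above zero.
    """
    diff = noisy[g1] - noisy[g2]
    margin = comparison_margin(epsilon, beta)
    return diff, (diff - margin, diff + margin), diff - margin > 0


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed only to make the demo repeatable
    # Made-up toy data: 120 rows in group "A", 60 rows in group "B".
    rows = [{"dept": "A"}] * 120 + [{"dept": "B"}] * 60
    noisy = noisy_group_counts(rows, "dept", ["A", "B"], epsilon=1.0, rng=rng)
    diff, interval, holds = assess_comparison(noisy, "A", "B", epsilon=1.0, beta=0.05)
    print(noisy, diff, interval, holds)
```

When the interval contains zero, the observed ordering of the two groups could be an artifact of the noise, which is the situation the abstract warns about. The union bound is deliberately loose; a tighter interval is possible by working with the exact distribution of the difference of two Laplace variables.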

research
06/28/2019

Utility-Preserving Privacy Mechanisms for Counting Queries

Differential privacy (DP) and local differential privacy (LDP) are frame...
research
03/12/2021

Reptile: Aggregation-level Explanations for Hierarchical Data

Recent query explanation systems help users understand anomalies in aggr...
research
12/29/2018

Explaining Aggregates for Exploratory Analytics

Analysts wishing to explore multivariate data spaces, typically pose que...
research
03/01/2023

What Are the Chances? Explaining the Epsilon Parameter in Differential Privacy

Differential privacy (DP) is a mathematical privacy notion increasingly ...
research
02/14/2022

HUT: Enabling High-UTility, Batched Queries under Differential Privacy Protection for Internet-of-Vehicles

The emerging trends of Internet-of-Vehicles (IoV) demand centralized ser...
research
02/09/2023

Pushing the Boundaries of Private, Large-Scale Query Answering

We address the problem of efficiently and effectively answering large nu...
research
03/29/2021

Putting Things into Context: Rich Explanations for Query Answers using Join Graphs (extended version)

In many data analysis applications, there is a need to explain why a sur...
