Answering Count Queries for Genomic Data with Perfect Privacy

02/03/2022
by   Bo Jiang, et al.

In this paper, we consider the problem of answering count queries for genomic data subject to perfect privacy constraints. Count queries are often used in applications that collect aggregate (population-wide) information from biomedical databases (DBs) for analysis, such as genome-wide association studies. Our goal is to design mechanisms for answering count queries of the following form: How many users in the database have a specific set of genotypes at certain locations in their genome? At the same time, we aim to achieve perfect privacy (zero information leakage) of the sensitive genotypes at a pre-specified set of secret locations. The sensitive genotypes could indicate rare diseases and/or other health traits that one may want to keep private. We present two local count-query mechanisms for the above problem that achieve perfect privacy for sensitive genotypes while minimizing the expected absolute error (or per-user error probability) of the query answer. We also derive a lower bound on the per-user probability of error for an arbitrary query-answering mechanism that satisfies perfect privacy. We show that our mechanisms achieve error that is close to the lower bound and match the lower bound in some special cases. We numerically show that the performance of each mechanism depends on the data prior distribution, the intersection between the queried and sensitive data, and the strength of the correlation in the genomic data sequence.
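
To make the query form concrete, the following is a minimal sketch of an exact (non-private) count query over a toy database. The 0/1/2 genotype encoding, the `Database` type, and the `count_query` helper are all assumptions made for illustration; none of them come from the paper, and the sketch does not implement the paper's privacy mechanisms.

```python
from typing import Dict, List

# Hypothetical toy database: one row per user, one column per genome location.
# Genotypes are encoded as 0, 1, or 2 (an assumed encoding, not the paper's).
Database = List[List[int]]


def count_query(db: Database, pattern: Dict[int, int]) -> int:
    """Count users whose genotype equals pattern[loc] at every queried location."""
    return sum(
        all(user[loc] == genotype for loc, genotype in pattern.items())
        for user in db
    )


db = [
    [0, 1, 2, 0],
    [0, 1, 0, 0],
    [1, 1, 2, 0],
    [0, 1, 2, 1],
]

# How many users have genotype 1 at location 1 and genotype 2 at location 2?
exact_answer = count_query(db, {1: 1, 2: 2})
```

A private mechanism would release a perturbed version of this answer instead of the exact count, chosen so that the output reveals nothing about the genotypes at the secret locations.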

research
07/18/2020

Differentially Private Mechanisms for Count Queries

In this paper, we consider the problem of responding to a count query (o...
research
10/04/2020

Privately Answering Counting Queries with Generalized Gaussian Mechanisms

We consider the problem of answering k counting (i.e. sensitivity-1) que...
research
10/05/2018

Linear Queries Estimation with Local Differential Privacy

We study the problem of estimating a set of d linear queries with respec...
research
10/01/2020

Understanding the hardness of approximate query processing with joins

We study the hardness of Approximate Query Processing (AQP) of various t...
research
12/16/2020

On Avoiding the Union Bound When Answering Multiple Differentially Private Queries

In this work, we study the problem of answering k queries with (ϵ, δ)-di...
research
10/20/2020

Non-Stochastic Private Function Evaluation

We consider private function evaluation to provide query responses based...
research
02/24/2022

Analysis of Genotype-Phenotype Association using Fields and Information Theory

We show how field- and information theory can be used to quantify the re...
