Private DNA Sequencing: Hiding Information in Discrete Noise

01/28/2021
by Kayvon Mazooji, et al.

When an individual's DNA is sequenced, sensitive medical information becomes available to the sequencing laboratory. A recently proposed way to hide an individual's genetic information is to mix in DNA samples from other individuals. We assume these samples are known to the individual but unknown to the sequencing laboratory. Thus, these DNA samples act as "noise" to the sequencing laboratory, but still allow the individual to recover their own DNA samples afterward. Motivated by this idea, we study the problem of hiding a binary random variable X (a genetic marker) with the additive noise provided by mixing DNA samples, using mutual information as a privacy metric. This is equivalent to the problem of finding, among a set of feasible discrete distributions, a worst-case noise distribution for recovering X from the noisy observation. We derive upper and lower bounds on the solution of this problem and show empirically that they are very close. The lower bound is obtained through a convex relaxation of the original discrete optimization problem, and yields a closed-form expression. The upper bound is computed via a greedy algorithm for selecting the mixing proportions.
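
To make the privacy metric concrete, the following is a minimal sketch of how the mutual information between a binary marker X and a noisy observation Y = X + Z could be evaluated for one candidate noise distribution. It rests on assumptions that the abstract does not state: X is Bernoulli(p), Z is independent of X and takes values in {0, ..., n-1}, and the helper names `entropy` and `mutual_information` are hypothetical. It does not model the paper's feasible set of mixing proportions, the convex relaxation, or the greedy algorithm; it only uses the identity I(X; Y) = H(Y) - H(Z), which holds for independent additive noise.

```python
import math


def entropy(pmf):
    """Shannon entropy in bits of a probability mass function given as a list."""
    return -sum(q * math.log2(q) for q in pmf if q > 0)


def mutual_information(p, noise_pmf):
    """I(X; X + Z) in bits.

    Assumptions (illustrative, not taken from the paper):
      - X ~ Bernoulli(p)
      - Z is independent of X, with P(Z = z) = noise_pmf[z] for z = 0..n-1
    """
    n = len(noise_pmf)
    # Y = X + Z takes values in {0, ..., n}; build its pmf by convolution.
    y_pmf = [0.0] * (n + 1)
    for z, q in enumerate(noise_pmf):
        y_pmf[z] += (1 - p) * q      # contribution of X = 0
        y_pmf[z + 1] += p * q        # contribution of X = 1
    # For independent additive noise, H(Y | X) = H(Z).
    return entropy(y_pmf) - entropy(noise_pmf)


if __name__ == "__main__":
    print(mutual_information(0.5, [1.0]))        # no noise
    print(mutual_information(0.5, [0.5, 0.5]))   # uniform noise on {0, 1}
    print(mutual_information(0.5, [0.25] * 4))   # uniform noise on {0, ..., 3}
```

Under these assumptions, a fair binary marker leaks 1 bit with no noise, 0.5 bits with uniform noise on {0, 1}, and 0.25 bits with uniform noise on {0, ..., 3}. Smaller values mean the observation reveals less about X, which is the sense in which a "worst-case" noise distribution for the laboratory is the most private one for the individual.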

