Data Disclosure under Perfect Sample Privacy
Perfect data privacy seems to be in fundamental opposition to the economic and scientific opportunities associated with extensive data exchange. Defying this intuition, this paper develops a framework that allows the disclosure of collective properties of datasets without compromising the privacy of individual data samples. We present an algorithm to build an optimal disclosure strategy/mapping, and discuss its fundamental limits on finite and asymptotically large datasets. Furthermore, we present explicit expressions for the asymptotic performance of this scheme in some scenarios, and study cases where our approach attains maximal efficiency. We finally discuss suboptimal schemes that provide sample privacy guarantees for large datasets at a reduced computational cost.
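As a toy illustration of how a collective property can be disclosed while each individual sample stays perfectly private, consider the following minimal sketch. It is not the algorithm developed in the paper. It assumes a dataset of n independent, uniformly distributed bits and takes the disclosed variable to be their parity (both are assumptions chosen for illustration, as are the function names). Under these assumptions, the disclosed parity is statistically independent of every single sample, yet it carries one bit of information about the dataset as a whole.

```python
from collections import defaultdict
from itertools import product
from math import log2


def mutual_information(joint):
    """Mutual information in bits of a joint pmf given as {(a, b): p}."""
    pa = defaultdict(float)
    pb = defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(
        p * log2(p / (pa[a] * pb[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


def parity_disclosure(n):
    """Illustrative only: n i.i.d. fair bits, disclosed value = their parity.

    Returns (per_sample, whole), where per_sample[i] is I(X_i; Y) and
    whole is I(X_1..X_n; Y), both in bits.
    """
    sample_joints = [defaultdict(float) for _ in range(n)]
    dataset_joint = defaultdict(float)
    p = 1 / 2**n  # assumed uniform distribution over all bit vectors
    for x in product((0, 1), repeat=n):
        y = sum(x) % 2  # disclosed collective property (parity)
        dataset_joint[(x, y)] += p
        for i in range(n):
            sample_joints[i][(x[i], y)] += p
    per_sample = [mutual_information(j) for j in sample_joints]
    whole = mutual_information(dataset_joint)
    return per_sample, whole


if __name__ == "__main__":
    per_sample, whole = parity_disclosure(3)
    print("I(X_i; Y) for each sample:", per_sample)
    print("I(X^n; Y) for the dataset:", whole)
```

The independence in this sketch relies on n being at least 2 and on the bits being fair and independent; with a single sample the parity equals the sample itself and reveals it completely.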