Differential privacy and noisy confidentiality concepts for European population statistics

12/17/2020
by Fabian Bach, et al.
The paper gives an overview of approaches to statistical disclosure control based on random noise that are currently being discussed for official population statistics and censuses. A particular focus is on a strict delineation of the concepts shaping that discussion: we distinguish clearly between risk measures, noise distributions and output mechanisms, and set these concepts in scope and in relation to each other. After recapitulating differential privacy as a risk measure, the paper remarks on utility and risk aspects of some specific output mechanisms and parameter setups, with special attention to the static outputs that are typical of official population statistics. In particular, it is argued that unbounded noise distributions, such as plain Laplace, may jeopardise key unique census features without a clear need from a risk perspective. Bounded noise distributions, on the other hand, such as the truncated Laplace or the cell key method, can be set up to keep unique census features while controlling disclosure risks in census-like outputs. Finally, the paper analyses some typical attack scenarios to constrain generic noise parameter ranges that suggest a good risk/utility compromise for the 2021 EU census output scenario. The analysis also shows that strictly differentially private mechanisms would be severely constrained in this scenario.
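
The contrast between unbounded and bounded noise can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the scale of 2.0, the bound of 5.0 and the true count of 3 are all hypothetical values chosen for illustration, and rejection sampling is only one possible way to truncate a Laplace distribution. The sketch makes no claim about the privacy guarantee of either variant.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-centred Laplace distribution (unbounded)."""
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def truncated_laplace_noise(scale, bound, rng):
    """Draw Laplace noise conditioned on |noise| <= bound (rejection sampling)."""
    while True:
        x = laplace_noise(scale, rng)
        if abs(x) <= bound:
            return x


rng = random.Random(2021)   # fixed seed so the run is reproducible
scale = 2.0                 # hypothetical noise scale
bound = 5.0                 # hypothetical truncation bound
true_count = 3              # hypothetical small table cell

plain = [true_count + laplace_noise(scale, rng) for _ in range(10000)]
trunc = [true_count + truncated_laplace_noise(scale, bound, rng) for _ in range(10000)]

# Largest deviation from the true count under each noise distribution.
max_dev_plain = max(abs(v - true_count) for v in plain)
max_dev_trunc = max(abs(v - true_count) for v in trunc)
print("max deviation, plain Laplace:    ", round(max_dev_plain, 2))
print("max deviation, truncated Laplace:", round(max_dev_trunc, 2))
```

With plain Laplace noise the deviation from the true count has no upper limit, whereas the truncated variant never moves a cell by more than `bound`; this is the property the abstract associates with keeping unique census features intact.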
