A multiplicative masking method for preserving the skewness of the original micro-records

12/07/2017
by Nicolas Ruiz, et al.

Masking methods for the safe dissemination of microdata distort the original data while preserving a pre-defined set of its statistical properties. For continuous variables, available methodologies rely essentially on matrix masking, and in particular on adding noise to the original values, using more or less refined procedures depending on how much information one seeks to preserve. Almost all of these methods rest on the critical assumption that the original data follow a normal distribution, that the noise does, or both. This assumption is restrictive because few variables are Gaussian in practice: the distribution of household income, for example, is positively skewed, and this skewness is essential information that has to be taken into account and preserved. This paper addresses these issues by presenting a simple multiplicative masking method that preserves the skewness of the original data while offering a sufficient level of disclosure risk control. Numerical examples are provided, suggesting that the method could be well suited to the dissemination of a broad range of microdata, including those based on administrative and business records.
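To make the concern about skewness concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes a hypothetical lognormal "income" variable and two generic maskings: additive zero-mean Gaussian noise, and multiplication by a lognormal noise factor with mean 1. The distribution parameters, the noise scales, and the `skewness` helper are illustrative assumptions.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)

# Hypothetical positively skewed variable (assumed lognormal "income").
income = [rng.lognormvariate(10.0, 0.5) for _ in range(50_000)]

mean_income = sum(income) / len(income)
sd_income = math.sqrt(sum((x - mean_income) ** 2 for x in income) / len(income))

# Additive masking: independent zero-mean Gaussian noise (scale is an assumption).
additive = [x + rng.gauss(0.0, 0.5 * sd_income) for x in income]

# Generic multiplicative masking: lognormal factor with mean 1 (scale is an assumption).
s = 0.2
multiplicative = [x * rng.lognormvariate(-s * s / 2, s) for x in income]

print("skewness, original:      ", round(skewness(income), 3))
print("skewness, additive:      ", round(skewness(additive), 3))
print("skewness, multiplicative:", round(skewness(multiplicative), 3))
```

Independent symmetric additive noise leaves the third central moment unchanged but inflates the variance, so the skewness of the masked variable shrinks by a factor of (1 + Var(noise)/Var(X))^(-3/2). The generic multiplicative factor used here keeps masked values positive but does not by itself reproduce the original skewness; how the paper calibrates its noise to do so is not reproduced in this sketch.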

