Hybrid Microaggregation for Privacy-Preserving Data Mining

by Balkis Abidi, et al.

k-Anonymity by microaggregation is one of the most commonly used anonymization techniques. Its success is due to the worthwhile tradeoff it achieves between information loss and identity disclosure risk. However, the method has some drawbacks. On the disclosure limitation side, it lacks protection against attribute disclosure. On the data utility side, dealing with real datasets is challenging: they are characterized by a large number of attributes and by noisy data, such as outliers or even records with missing values. Generating anonymous individual data that remains useful for data mining tasks, while decreasing the influence of noisy data, is therefore a difficult task. In this paper, we introduce a new microaggregation method, called HM-PFSOM, based on fuzzy possibilistic clustering. The proposed method operates in a hybrid manner, meaning that the anonymization process is applied per block of similar data, which helps to decrease the information loss incurred during anonymization. The HM-PFSOM approach studies the distribution of the confidential attributes within each sub-dataset. Then, according to that distribution, the privacy parameter k is determined in such a way as to preserve the diversity of confidential attributes within the anonymized microdata. This decreases the disclosure risk of confidential information.
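To make the basic idea of k-anonymity by microaggregation concrete, the following is a minimal sketch in Python. It is not the HM-PFSOM method: it assumes numeric quasi-identifiers only, forms groups by sorting on a simple one-dimensional projection (the sum of attributes) instead of fuzzy possibilistic clustering, and takes k as given rather than deriving it from the confidential attributes. The function name `microaggregate` is hypothetical.

```python
from statistics import mean


def microaggregate(records, k):
    """Replace each record by the centroid of a group of at least k records.

    Illustrative only: grouping uses a sort on the sum of attributes,
    which is an assumption of this sketch, not the paper's clustering step.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    n_attrs = len(records[0])
    # Order record indices by a simple one-dimensional projection.
    order = sorted(range(len(records)), key=lambda i: sum(records[i]))
    n_groups = len(records) // k
    result = [None] * len(records)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so it holds k to 2k-1 records.
        end = (g + 1) * k if g < n_groups - 1 else len(records)
        members = order[start:end]
        centroid = tuple(
            mean(records[i][d] for i in members) for d in range(n_attrs)
        )
        # Every member of the group is published as the same centroid.
        for i in members:
            result[i] = centroid
    return result


# Toy quasi-identifier values (made up for illustration).
data = [(1.0, 2.0), (1.2, 2.1), (5.0, 6.0), (5.5, 6.5), (0.9, 1.9)]
anonymized = microaggregate(data, k=2)
```

Because every published record is shared by at least k originals, the output is k-anonymous with respect to these attributes, and replacing values by group centroids preserves each attribute's overall mean. In a hybrid scheme of the kind the abstract describes, a function like this would be applied separately to each block of similar records, with a block-specific k.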




A Brief Study of Privacy-Preserving Practices (PPP) in Data Mining

Data mining is the way toward mining fascinating patterns or information...

HyObscure: Hybrid Obscuring for Privacy-Preserving Data Publishing

Minimizing privacy leakage while ensuring data utility is a critical pro...

A general cipher for individual data anonymization

Over the years, the literature on individual data anonymization has burg...

Adversarial Learning of Privacy-Preserving and Task-Oriented Representations

Data privacy has emerged as an important issue as data-driven deep learn...

Marginality: a numerical mapping for enhanced treatment of nominal and hierarchical attributes

The purpose of statistical disclosure control (SDC) of microdata, a.k.a....

Mining Privacy-Preserving Association Rules based on Parallel Processing in Cloud Computing

With the onset of the Information Era and the rapid growth of informatio...

Privacy-Preserving Data Publishing via Mutual Cover

We study anonymization techniques for preserving privacy in the publicat...
