pyCANON: A Python library to check the level of anonymity of a dataset

08/16/2022
by   Judith Sainz-Pardo Díaz, et al.

Openly sharing data with sensitive attributes and privacy restrictions is a challenging task. In this document we present the implementation of pyCANON, a Python library and command line interface (CLI) to check and assess the level of anonymity of a dataset through some of the most common anonymization techniques: k-anonymity, (α,k)-anonymity, ℓ-diversity, entropy ℓ-diversity, recursive (c,ℓ)-diversity, basic β-likeness, enhanced β-likeness, t-closeness and δ-disclosure privacy. For the case of more than one sensitive attribute, two approaches are proposed for evaluating these techniques. The main strength of this library is that it produces a full report of the parameters that are fulfilled for each of the techniques mentioned above, the only required inputs being the set of quasi-identifiers and the set of sensitive attributes. We present the methods implemented together with the attacks they prevent, a description of the library, usage examples of the different functions, and the impact and possible applications that can be developed. Finally, some possible aspects to be incorporated in future updates are proposed.
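To make the first two checks concrete, here is a minimal sketch in plain Python of how the k in k-anonymity and the ℓ in ℓ-diversity could be computed from a set of quasi-identifiers and one sensitive attribute. It is an illustration only and does not use or reproduce pyCANON's API: the function names (`equivalence_classes`, `k_anonymity`, `l_diversity`), the column names, and the toy records are all assumptions made for this example.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group records that share the same values on every quasi-identifier."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[qi] for qi in quasi_identifiers)
        classes[key].append(row)
    return classes


def k_anonymity(rows, quasi_identifiers):
    """Largest k such that every equivalence class holds at least k records."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len(group) for group in classes.values())


def l_diversity(rows, quasi_identifiers, sensitive_attribute):
    """Largest l such that every class has at least l distinct sensitive values."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(
        len({row[sensitive_attribute] for row in group})
        for group in classes.values()
    )


# Hypothetical, already-generalized toy records (not taken from the paper).
rows = [
    {"age": "30-39", "zip": "390**", "disease": "flu"},
    {"age": "30-39", "zip": "390**", "disease": "asthma"},
    {"age": "40-49", "zip": "391**", "disease": "flu"},
    {"age": "40-49", "zip": "391**", "disease": "flu"},
    {"age": "40-49", "zip": "391**", "disease": "cancer"},
]

k = k_anonymity(rows, ["age", "zip"])
l = l_diversity(rows, ["age", "zip"], "disease")
print(f"k = {k}, l = {l}")
```

In this toy table the smallest equivalence class has two records and every class contains at least two distinct diseases, so both values come out as 2. Both functions assume a non-empty table; the other techniques listed in the abstract need the distribution of sensitive values per class and are not covered by this sketch.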

