Attacks on Deidentification's Defenses

02/27/2022
by Aloni Cohen, et al.

Quasi-identifier-based deidentification techniques (QI-deidentification) are widely used in practice, including k-anonymity, ℓ-diversity, and t-closeness. We present three new attacks on QI-deidentification: two theoretical attacks and one practical attack on a real dataset. In contrast to prior work, our theoretical attacks work even if every attribute is a quasi-identifier. Hence, they apply to k-anonymity, ℓ-diversity, t-closeness, and most other QI-deidentification techniques. First, we introduce a new class of privacy attacks called downcoding attacks, and prove that every QI-deidentification scheme is vulnerable to downcoding attacks if it is minimal and hierarchical. Second, we convert the downcoding attacks into powerful predicate singling-out (PSO) attacks, which were recently proposed as a way to demonstrate that a privacy mechanism fails to legally anonymize under Europe's General Data Protection Regulation. Third, we use LinkedIn.com to reidentify 3 students in a k-anonymized dataset published by EdX (and show thousands are potentially vulnerable), undermining EdX's claimed compliance with the Family Educational Rights and Privacy Act. The significance of this work is both scientific and political. Our theoretical attacks demonstrate that QI-deidentification may offer no protection even if every attribute is treated as a quasi-identifier. Our practical attack demonstrates that even deidentification experts acting in accordance with strict privacy regulations fail to prevent real-world reidentification. Together, they rebut a foundational tenet of QI-deidentification and challenge the actual arguments made to justify the continued use of k-anonymity and other QI-deidentification techniques.
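For readers unfamiliar with the setting, the sketch below illustrates the mechanism these attacks target. It is not taken from the paper: the toy records, the generalization hierarchy, and all function names are illustrative assumptions. It generalizes records along per-attribute hierarchies (truncating ZIP codes, bucketing ages) and stops at the first level where every quasi-identifier combination appears at least k times, which makes the release both hierarchical and minimal.

```python
from collections import Counter

# Toy records: (ZIP code, age). Both columns are treated as
# quasi-identifiers, matching the setting where every attribute is a QI.
records = [
    ("60637", 22), ("60615", 23), ("60637", 22),
    ("60615", 24), ("60637", 25), ("60615", 23),
]

def generalize(record, level):
    """Hierarchical generalization: replace the last `level` ZIP digits
    with '*' and bucket ages into widths of 5 * level years."""
    zip_code, age = record
    if level == 0:
        return (zip_code, str(age))
    width = 5 * level
    lo = (age // width) * width
    return (zip_code[: 5 - level] + "*" * level, f"{lo}-{lo + width - 1}")

def is_k_anonymous(release, k):
    """True if every quasi-identifier combination occurs >= k times."""
    return min(Counter(release).values()) >= k

# Generalize just enough to satisfy k = 3. Stopping at the first level
# that works is what makes the release "minimal" -- one of the two
# properties the paper's downcoding attacks exploit.
for level in range(5):
    release = [generalize(r, level) for r in records]
    if is_k_anonymous(release, k=3):
        print(f"k=3 achieved at level {level}: {sorted(set(release))}")
        break
```

On this toy data the loop settles at level 2 (ZIP prefix 606**, age bucket 20-29). The paper's first theoretical result concerns exactly such minimal hierarchical releases: any scheme that produces them admits a downcoding attack, which (roughly) undoes part of the generalization and recovers finer-grained values than the release was meant to expose.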


