Deletion Inference, Reconstruction, and Compliance in Machine (Un)Learning

by Ji Gao, et al.

Privacy attacks on machine learning models aim to identify the data used to train them. Traditionally, such attacks have been studied on static models that are trained once and then made accessible to the adversary. Motivated by new legal requirements, many machine learning methods have recently been extended to support machine unlearning, i.e., updating models as if certain examples had been removed from their training sets. However, privacy attacks can become more devastating in this new setting, since an attacker may now access both the original model before a deletion and the new model after it. In fact, the very act of deletion might make the deleted record more vulnerable to privacy attacks. Inspired by cryptographic definitions and the differential privacy framework, we formally study the privacy implications of machine unlearning. We formalize (various forms of) deletion inference and deletion reconstruction attacks, in which the adversary aims either to identify which record was deleted or to reconstruct (perhaps part of) the deleted records. We then present successful deletion inference and reconstruction attacks for a variety of machine learning models and tasks, such as classification, regression, and language models. Finally, we show that our attacks would provably be precluded if the schemes satisfied (variants of) Deletion Compliance (Garg, Goldwasser, and Vasudevan, Eurocrypt 2020).
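To make the before/after attack setting concrete, here is a minimal sketch (our own toy illustration, not the paper's actual attack) of deletion inference: the adversary queries a model version trained on the full dataset and a version retrained after one record is unlearned, then flags the candidate record whose confidence in its true label dropped the most. The nearest-class-mean "model" and all parameter choices below are assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mean_classifier(X, y):
    """Toy 'model': class means; returns P(y=1 | x) from softmax of
    negative distances to the two class means."""
    mu0, mu1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    def predict_proba(x):
        d0, d1 = np.linalg.norm(x - mu0), np.linalg.norm(x - mu1)
        return np.exp(-d1) / (np.exp(-d0) + np.exp(-d1))
    return predict_proba

# Synthetic training set: two Gaussian clusters of 50 points each.
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(3, 1, (50, 5))])
y = np.array([0] * 50 + [1] * 50)

deleted_idx = 70                          # record the curator unlearns
keep = np.arange(len(X)) != deleted_idx

model_before = train_mean_classifier(X, y)              # full data
model_after = train_mean_classifier(X[keep], y[keep])   # retrained w/o record

def confidence_drop(i):
    """How much the model's confidence in record i's true label fell
    between the pre- and post-deletion model versions."""
    p_before = model_before(X[i]) if y[i] == 1 else 1 - model_before(X[i])
    p_after = model_after(X[i]) if y[i] == 1 else 1 - model_after(X[i])
    return p_before - p_after

# Adversary with query access to both versions picks, among candidate
# records, the one whose confidence dropped the most.
candidates = [10, 30, 70, 90]
guess = max(candidates, key=confidence_drop)
print("adversary's guess for the deleted record:", guess)
```

Removing a record pulls its class mean slightly away from it, so the deleted record's confidence is guaranteed to drop under this toy model; whether the gap is large enough to beat the other candidates depends on the data, which is exactly the signal the paper's deletion inference attacks exploit.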




