Amnesiac Machine Learning

10/21/2020
by Laura Graves et al.
The Right to be Forgotten is part of the recently enacted General Data Protection Regulation (GDPR), which affects any data holder with data on European Union residents. It gives EU residents the ability to request deletion of their personal data, including training records used to train machine learning models. Unfortunately, deep neural network models are vulnerable to information-leaking attacks such as model inversion attacks, which extract class information from a trained model, and membership inference attacks, which determine whether an example was present in a model's training data. If a malicious party can mount an attack and learn private information that was meant to be removed, then the model owner has not properly protected their users' rights and their models may not be compliant with the GDPR. In this paper, we present two efficient methods that address the question of how a model owner or data holder can delete personal data from models in such a way that the models are not vulnerable to model inversion and membership inference attacks while maintaining model efficacy. We start by presenting a real-world threat model that shows that simply removing training data is insufficient to protect users. We follow that up with two data removal methods, namely Unlearning and Amnesiac Unlearning, that enable model owners to protect themselves against such attacks while remaining compliant with regulations. We provide extensive empirical analysis showing that these methods are efficient, safe to apply, and effectively remove learned information about sensitive data from trained models while maintaining model efficacy.
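For intuition, here is a minimal PyTorch sketch of the amnesiac unlearning idea as the abstract describes it: record the parameter updates contributed by each training batch, then "forget" sensitive records by subtracting the updates of the batches that contained them. The model, hyperparameters, and names (train_batch, amnesiac_forget, batch_deltas) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # plain SGD: deltas undo exactly
loss_fn = nn.CrossEntropyLoss()

# batch_id -> {parameter name: update this batch contributed}
batch_deltas = {}

def train_batch(batch_id, x, y):
    # Snapshot parameters before the step so we can record the update.
    before = {n: p.detach().clone() for n, p in model.named_parameters()}
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
    # delta_b = theta_after - theta_before for this batch.
    batch_deltas[batch_id] = {
        n: p.detach() - before[n] for n, p in model.named_parameters()
    }

def amnesiac_forget(batch_ids):
    # Subtract the recorded updates of every batch that held sensitive data.
    with torch.no_grad():
        for bid in batch_ids:
            for n, p in model.named_parameters():
                p -= batch_deltas[bid][n]

# Toy usage: train on two batches, then forget the second one.
x0, y0 = torch.randn(32, 20), torch.randint(0, 2, (32,))
x1, y1 = torch.randn(32, 20), torch.randint(0, 2, (32,))
train_batch(0, x0, y0)
train_batch(1, x1, y1)
amnesiac_forget([1])  # parameters revert to their state after batch 0

One design note, hedged: subtracting the updates of many batches can degrade overall accuracy, so a brief fine-tuning pass on retained (non-sensitive) data is a natural follow-up step after forgetting.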
