Certifiable Machine Unlearning for Linear Models

06/29/2021
by Ananth Mahadevan, et al.

Machine unlearning is the task of updating machine learning (ML) models after a subset of their training data is deleted. Methods for this task should combine effectiveness and efficiency: they should effectively "unlearn" the deleted data without requiring excessive computational effort (e.g., a full retraining) for a small number of deletions. Such a combination is typically achieved by tolerating some amount of approximation in the unlearning. In addition, laws and regulations in the spirit of "the right to be forgotten" have given rise to requirements for certifiability, i.e., the ability to demonstrate that the deleted data has indeed been unlearned by the ML model. In this paper, we present an experimental study of the three state-of-the-art approximate unlearning methods for linear models and demonstrate the trade-offs between efficiency, effectiveness, and certifiability offered by each method. In implementing the study, we extend some of the existing works and describe a common ML pipeline to compare and evaluate the unlearning methods on six real-world datasets and in a variety of settings. We provide insights into how the quantity and distribution of the deleted data affect ML models and the performance of each unlearning method in different settings. We also propose a practical online strategy to determine when the accumulated error from approximate unlearning is large enough to warrant a full retrain of the ML model.


