Markov Chain Monte Carlo-Based Machine Unlearning: Unlearning What Needs to be Forgotten

02/28/2022
by Quoc Phong Nguyen, et al.

As machine learning (ML) models become increasingly popular in real-world applications, model maintenance raises practical challenges that need to be addressed. One such challenge is to 'undo' the effect of a specific subset of the dataset used to train a model. This subset may contain malicious or adversarial data injected by an attacker, which degrades model performance. Alternatively, a service provider may need to remove the data pertaining to a specific user to respect that user's privacy. In both cases, the problem is to 'unlearn' a specific subset of the training data from a trained model without incurring the cost of retraining the whole model from scratch. Towards this goal, this paper presents a Markov chain Monte Carlo-based machine unlearning (MCU) algorithm. MCU effectively and efficiently unlearns subsets of the training dataset from a trained model. Furthermore, we show that MCU can explain the effect of a subset of the training dataset on the model's predictions. MCU is therefore useful for examining subsets of data to identify adversarial data that should be removed. Similarly, MCU can be used to erase the lineage of a user's personal data from trained ML models, thus upholding the user's "right to be forgotten". We empirically evaluate the proposed MCU algorithm on real-world phishing and diabetes datasets. The results show that MCU achieves desirable performance by efficiently removing the effect of a subset of the training dataset, and that it outperforms an existing algorithm that utilizes the remaining dataset.

research
07/13/2016

Ensemble preconditioning for Markov chain Monte Carlo simulation

We describe parallel Markov chain Monte Carlo methods that propagate a c...
research
01/29/2019

Differentially Private Markov Chain Monte Carlo

Recent developments in differentially private (DP) machine learning and ...
research
07/30/2021

High Performance Uncertainty Quantification with Parallelized Multilevel Markov Chain Monte Carlo

Numerical models of complex real-world phenomena often necessitate High ...
research
06/30/2020

Preconditioning Markov Chain Monte Carlo Method for Geomechanical Subsidence using multiscale method and machine learning technique

In this paper, we consider the numerical solution of the poroelasticity ...
research
10/05/2019

Characterizing Membership Privacy in Stochastic Gradient Langevin Dynamics

Bayesian deep learning is recently regarded as an intrinsic way to chara...
research
01/20/2023

Fair Credit Scorer through Bayesian Approach

Machine learning currently plays an increasingly important role in peopl...
research
12/31/2020

Coded Machine Unlearning

Models trained in machine learning processes may store information about...
