Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers

by Sungmin Cha, et al.

Since the recent advent of data protection regulations (e.g., the General Data Protection Regulation), there has been increasing demand for deleting information learned from sensitive data in pre-trained models without retraining from scratch. The inherent vulnerability of neural networks to adversarial attacks and unfairness also calls for a robust method to remove or correct information in an instance-wise fashion while retaining predictive performance on the remaining data. To this end, we define instance-wise unlearning, whose goal is to delete information on a set of instances from a pre-trained model, either by misclassifying each instance away from its original prediction or by relabeling the instance to a different label. We also propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation level and 2) leveraging weight importance metrics to pinpoint the network parameters responsible for propagating unwanted information. Both methods require only the pre-trained model and the data instances to forget, allowing straightforward application to real-life settings where the entire training set is unavailable. Through extensive experimentation on various image classification benchmarks, we show that our approach effectively preserves knowledge of the remaining data while unlearning the given instances in both single-task and continual unlearning scenarios.
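
The two unlearning goals named in the abstract (misclassifying an instance away from its original prediction, or relabeling it to a different label) can be illustrated with a minimal sketch. The abstract does not state the loss functions used, so the choices below are assumptions for illustration only: a negated cross-entropy for the misclassification goal and a standard cross-entropy toward the new label for the relabeling goal. The function names `misclassify_loss` and `relabel_loss` are hypothetical, and the sketch omits both forgetting-reduction methods (adversarial examples and weight importance).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard cross-entropy of the logits with respect to one class index."""
    return -math.log(softmax(logits)[label])


def misclassify_loss(logits, original_label):
    """Assumed objective for the 'misclassify' goal (hypothetical name).

    Minimizing the negated cross-entropy lowers the probability the model
    assigns to the instance's original prediction.
    """
    return -cross_entropy(logits, original_label)


def relabel_loss(logits, new_label):
    """Assumed objective for the 'relabel' goal (hypothetical name).

    Minimizing the cross-entropy toward new_label raises the probability
    of the corrected label.
    """
    return cross_entropy(logits, new_label)


# Logits for one instance to forget, before and after a hypothetical update.
before = [2.0, 0.0, 0.0]   # model still predicts class 0
after = [0.0, 2.0, 0.0]    # model now predicts class 1

# Both assumed objectives decrease when the prediction moves from class 0 to class 1.
print(misclassify_loss(before, 0) > misclassify_loss(after, 0))
print(relabel_loss(before, 1) > relabel_loss(after, 1))
```

In a real setting these objectives would be minimized over the network parameters for the forget set only, which is exactly where forgetting of the remaining data can occur and where the two proposed methods come in.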




