KGA: A General Machine Unlearning Framework Based on Knowledge Gap Alignment

05/11/2023
by Lingzhi Wang, et al.

Recent legislation around the "right to be forgotten" has spurred interest in machine unlearning, where learned models are endowed with the ability to forget information about specific training instances as if they had never existed in the training set. Previous work focuses mainly on computer vision scenarios and largely ignores the essentials of unlearning in the NLP field, where text data contains more explicit and sensitive personal information than images. In this paper, we propose a general unlearning framework called KGA to induce forgetfulness. Unlike previous work that tries to recover gradients or forces models to perform close to one specific distribution, KGA maintains distribution differences (i.e., the knowledge gap), which relaxes the distribution assumption. Furthermore, we are the first to apply the unlearning method to various NLP tasks (i.e., classification, translation, and response generation) and propose several unlearning evaluation metrics tailored to them. Experiments on large-scale datasets show that KGA yields comprehensive improvements over baselines, and extensive analyses further validate its effectiveness and provide insight into unlearning for NLP tasks.
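To make the knowledge-gap idea concrete, below is a minimal PyTorch sketch of how such an alignment objective could look, based only on our reading of the abstract: the unlearned model is trained so that its distributional gap to a model fit solely on the forget set matches the gap the original model exhibits against a model fit on unseen extra data. Every name here (`kl_gap`, `kga_loss`, the four models, the two batches) is our own illustrative assumption, not the paper's actual code.

```python
import torch
import torch.nn.functional as F

def kl_gap(model_a, model_b, batch):
    """'Knowledge gap': KL divergence between the output
    distributions of two models on the same batch.
    Assumes each model maps a batch of inputs to logits."""
    log_p_a = F.log_softmax(model_a(batch), dim=-1)
    p_b = F.softmax(model_b(batch), dim=-1)
    return F.kl_div(log_p_a, p_b, reduction="batchmean")

def kga_loss(unlearned, original, forget_model, extra_model,
             forget_batch, extra_batch):
    """Align two gaps: the unlearned model should differ from a
    model trained only on the forget set by roughly as much as the
    original model differs from a model trained on unseen extra
    data. All models except `unlearned` are assumed frozen."""
    gap_on_forget = kl_gap(unlearned, forget_model, forget_batch)
    with torch.no_grad():  # target gap is a constant, no gradients
        target_gap = kl_gap(original, extra_model, extra_batch)
    return (gap_on_forget - target_gap).abs()
```

Note how the absolute-difference objective constrains only the size of the gap rather than pulling the unlearned model toward one specific target distribution, which is one way to read the "relaxed distribution assumption" in the abstract.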


