Privacy Leakage in Text Classification: A Data Extraction Approach

06/09/2022
by Adel Elmahdy, et al.

Recent work has demonstrated that training data can be extracted from generative language models. It is less clear whether such extraction is feasible for text classification models, whose training objective is to predict a class label rather than the next word. This raises an important question about the privacy of training data in text classification settings. We therefore study potential privacy leakage in the text classification domain by investigating unintended memorization of training data that is not pertinent to the learning task. We propose an algorithm that extracts the missing tokens of a partial text by exploiting the likelihood the model assigns to the class label. We test the algorithm by inserting canaries into the training set and attempting to extract tokens from these canaries after training. Our experiments demonstrate that successful extraction is possible to some extent. The same approach can also serve as an auditing strategy to assess potential unauthorized use of personal data without consent.
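The extraction idea described above can be illustrated with a minimal sketch. This is not the authors' algorithm, only one plausible greedy reading of it: candidate tokens are appended to the partial text one at a time, and the candidate that gives the known class label the highest likelihood is kept. The function `extract_missing_tokens`, the callback `label_prob`, the toy model `toy_label_prob`, and the canary itself are all hypothetical names and values introduced for illustration; a real attack would query a trained classifier instead of the toy scorer.

```python
from typing import Callable, List, Sequence

# Hypothetical canary, standing in for a secret inserted into the training set.
CANARY = ["my", "pin", "is", "4", "2", "7"]


def toy_label_prob(tokens: Sequence[str], label: int) -> float:
    """Toy stand-in for a classifier that has memorized CANARY (assumption).

    The probability of label 1 grows with the number of leading tokens
    that match the canary.
    """
    match = 0
    for got, expected in zip(tokens, CANARY):
        if got != expected:
            break
        match += 1
    score = 0.5 + 0.5 * match / len(CANARY)
    return score if label == 1 else 1.0 - score


def extract_missing_tokens(
    prefix: Sequence[str],
    num_missing: int,
    vocab: Sequence[str],
    label: int,
    label_prob: Callable[[Sequence[str], int], float],
) -> List[str]:
    """Greedily pick, at each position, the token that maximizes the label likelihood."""
    tokens = list(prefix)
    for _ in range(num_missing):
        best = max(vocab, key=lambda tok: label_prob(tokens + [tok], label))
        tokens.append(best)
    return tokens[len(prefix):]


digits = [str(d) for d in range(10)]
recovered = extract_missing_tokens(["my", "pin", "is"], 3, digits, 1, toy_label_prob)
print(recovered)
```

In this toy setting the scorer rewards every correctly matched token, so a greedy search suffices. With a real classifier the signal would be much noisier, and a wider search (for example, keeping several candidates per position) would likely be needed.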


