Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models

06/23/2023
by   Adel Elmahdy, et al.

Natural language processing (NLP) models have become increasingly popular in real-world applications such as text classification. However, they are vulnerable to privacy attacks, including data reconstruction attacks that aim to extract the data used to train the model. Most previous studies on data reconstruction attacks have focused on large language models (LLMs), while classification models were assumed to be more secure. In this work, we propose a new targeted data reconstruction attack called the Mix And Match attack, which exploits the fact that most classification models are built on top of LLMs. The Mix And Match attack uses the base model of the target model to generate candidate tokens and then prunes them using the classification head. We extensively demonstrate the effectiveness of the attack using both random and organic canaries. This work highlights the importance of considering the privacy risks of data reconstruction attacks against classification models and offers insights into possible leakages.
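Since the abstract describes the attack as a propose-then-prune loop, a minimal sketch of that idea is shown below using the Hugging Face transformers API. The model names, the beam and fan-out parameters, and the confidence-based pruning score are illustrative assumptions rather than the paper's exact procedure; the snippet also assumes the target classifier was fine-tuned from the same base model and therefore shares its tokenizer.

```python
# Minimal sketch of the Mix And Match idea: the base language model proposes
# candidate next tokens, and the target model's classification head prunes
# the candidate pool. Model names, beam width, fan-out, and the pruning
# heuristic are illustrative assumptions, not the authors' exact method.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

base_name = "gpt2"                       # assumed base LM of the target model
clf_name = "path/to/target-classifier"   # hypothetical fine-tuned target model

tok = AutoTokenizer.from_pretrained(base_name)
base_lm = AutoModelForCausalLM.from_pretrained(base_name).eval()
classifier = AutoModelForSequenceClassification.from_pretrained(clf_name).eval()


@torch.no_grad()
def mix_and_match(prefix: str, target_label: int,
                  steps: int = 10, fanout: int = 20, beam: int = 5):
    """Beam-style reconstruction: at each step the base LM proposes `fanout`
    candidate tokens per beam, and the classifier's confidence on
    `target_label` prunes the pool back down to `beam` sequences."""
    beams = [tok.encode(prefix)]
    for _ in range(steps):
        pool = []
        for seq in beams:
            # 1) Propose: top-k next tokens under the base language model.
            logits = base_lm(torch.tensor([seq])).logits[0, -1]
            for tok_id in torch.topk(logits, fanout).indices.tolist():
                pool.append(seq + [tok_id])
        # 2) Prune: keep the candidates the classification head is most
        #    confident about, on the assumption that memorized training
        #    sequences yield unusually confident predictions.
        scores = []
        for seq in pool:
            probs = classifier(torch.tensor([seq])).logits.softmax(-1)
            scores.append(probs[0, target_label].item())
        ranked = sorted(zip(scores, pool), key=lambda p: p[0], reverse=True)
        beams = [seq for _, seq in ranked[:beam]]
    return [tok.decode(seq) for seq in beams]
```

In this sketch the pruning signal is simply the classifier's softmax confidence on the target label; the intuition, consistent with the abstract, is that the classification head leaks enough signal to separate memorized training sequences from plausible but unseen continuations.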

Related research

01/09/2022 · Rethink Stealthy Backdoor Attacks in Natural Language Processing
Recently, it has been shown that natural language processing (NLP) model...

09/21/2022 · Text Revealer: Private Text Reconstruction via Model Inversion Attacks against Transformers
Text classification has become widely used in various natural language p...

01/31/2020 · Benchmarking Popular Classification Models' Robustness to Random and Targeted Corruptions
Text classification models, especially neural networks based models, hav...

06/01/2020 · BadNL: Backdoor Attacks Against NLP Models
Machine learning (ML) has progressed rapidly during the past decade and ...

10/06/2020 · Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder
This paper demonstrates a fatal vulnerability in natural language infere...

08/04/2022 · Privacy Safe Representation Learning via Frequency Filtering Encoder
Deep learning models are increasingly deployed in real-world application...

10/22/2019 · Understanding the Effects of Real-World Behavior in Statistical Disclosure Attacks
High-latency anonymous communication systems prevent passive eavesdroppe...
