Active Data Pattern Extraction Attacks on Generative Language Models

07/14/2022
by Bargav Jayaraman, et al.

With the wide availability of large pre-trained language model checkpoints, such as GPT-2 and BERT, the recent trend has been to fine-tune them on a downstream task to achieve state-of-the-art performance with a small computational overhead. One natural example is the Smart Reply application, where a pre-trained model is fine-tuned to suggest a number of responses given a query message. In this work, we set out to investigate potential information leakage vulnerabilities in a typical Smart Reply pipeline and show that it is possible for an adversary, having black-box or gray-box access to a Smart Reply model, to extract sensitive user information present in the training data. We further analyse the privacy impact of specific components of this application, e.g. the decoding strategy, through our attack settings. We explore potential mitigation strategies and demonstrate how differential privacy can be a strong defense mechanism against such data extraction attacks.
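The abstract does not give implementation details, so the sketch below is only a rough illustration of the black-box setting and of why the decoding strategy matters: it queries an off-the-shelf GPT-2 checkpoint from Hugging Face Transformers with an adversarial prompt under beam search and under sampling, and compares the continuations. The model name, prompt, and decoding parameters are illustrative assumptions, not the authors' actual Smart Reply pipeline.

    # Minimal sketch (assumptions noted above): probing a generative model
    # with an adversarial prompt under two decoding strategies. High-likelihood
    # decoding (beam search) tends to surface memorized continuations more
    # readily than sampling-based decoding.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # assumed checkpoint
    model = GPT2LMHeadModel.from_pretrained("gpt2")     # stand-in for a fine-tuned reply model
    model.eval()

    prompt = "My credit card number is"                 # illustrative probe, not from the paper
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids

    # Beam search: deterministic, favors high-probability continuations.
    beam_out = model.generate(input_ids, num_beams=5, max_new_tokens=20,
                              pad_token_id=tokenizer.eos_token_id)
    # Sampling: stochastic, spreads probability mass over more varied outputs.
    sample_out = model.generate(input_ids, do_sample=True, top_k=50, temperature=0.8,
                                max_new_tokens=20, pad_token_id=tokenizer.eos_token_id)

    print("beam:  ", tokenizer.decode(beam_out[0], skip_special_tokens=True))
    print("sample:", tokenizer.decode(sample_out[0], skip_special_tokens=True))

In this framing, an attacker with only query access would repeat such probes over many prompts and decoding configurations, which is why the choice of decoding strategy is itself a privacy-relevant component of the pipeline.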

Related research

10/18/2022
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models
Deep Neural Networks (DNNs) are known to be vulnerable to backdoor attac...

05/26/2022
Differentially Private Decoding in Large Language Models
Recent large-scale natural language processing (NLP) systems use a pre-t...

05/23/2021
Killing Two Birds with One Stone: Stealing Model and Inferring Attribute from BERT-based APIs
The advances in pre-trained models (e.g., BERT, XLNET and etc) have larg...

03/18/2021
Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!
Natural language processing (NLP) tasks, ranging from text classificatio...

07/28/2021
An Evaluation of Generative Pre-Training Model-based Therapy Chatbot for Caregivers
With the advent of off-the-shelf intelligent home products and broader i...

07/19/2023
What can we learn from Data Leakage and Unlearning for Law?
Large Language Models (LLMs) have a privacy concern because they memoriz...

09/14/2023
Do Not Give Away My Secrets: Uncovering the Privacy Issue of Neural Code Completion Tools
Neural Code Completion Tools (NCCTs) have reshaped the field of software...
