PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification

08/05/2023
by Hongwei Yao, et al.

Large language models (LLMs) have witnessed a meteoric rise in popularity among the general public over the past few months, facilitating diverse downstream tasks with human-level accuracy and proficiency. Prompts play an essential role in this success: they efficiently adapt pre-trained LLMs to task-specific applications by simply prepending a sequence of tokens to the query text. However, designing and selecting an optimal prompt can be both expensive and demanding, leading to the emergence of Prompt-as-a-Service providers who profit by offering well-designed prompts for authorized use. With the growing popularity of prompts and their indispensable role in LLM-based services, there is an urgent need to protect the copyright of prompts against unauthorized use. In this paper, we propose PromptCARE, the first framework for prompt copyright protection through watermark injection and verification. Prompt watermarking presents unique challenges that render existing watermarking techniques, developed for model and dataset copyright verification, ineffective. PromptCARE overcomes these hurdles with watermark injection and verification schemes tailor-made for prompts and NLP characteristics. Extensive experiments on six well-known benchmark datasets, using three prevalent pre-trained LLMs (BERT, RoBERTa, and Facebook OPT-1.3b), demonstrate the effectiveness, harmlessness, robustness, and stealthiness of PromptCARE.
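The abstract describes prompts that adapt a frozen pre-trained LLM by prepending a sequence of tokens to the query. The sketch below illustrates that soft-prompt mechanism only; it assumes a HuggingFace-style model that accepts `inputs_embeds`, and the class name, prompt length, and initialization are illustrative assumptions rather than PromptCARE's actual injection or verification scheme.

```python
import torch
import torch.nn as nn


class SoftPromptWrapper(nn.Module):
    """Illustrative wrapper that prepends learnable prompt embeddings to each
    query before it reaches a frozen LLM. This sketches the generic soft-prompt
    idea from the abstract, not PromptCARE's watermarking method."""

    def __init__(self, base_model, embed_dim, prompt_length=10):
        super().__init__()
        self.base_model = base_model
        # Freeze the pre-trained LLM; only the prompt is trained.
        for p in self.base_model.parameters():
            p.requires_grad = False
        # The "prompt" is a small trainable tensor of token embeddings.
        self.prompt = nn.Parameter(torch.randn(prompt_length, embed_dim) * 0.02)

    def forward(self, input_embeds, attention_mask):
        batch = input_embeds.size(0)
        # Broadcast the prompt across the batch and prepend it to every query.
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        embeds = torch.cat([prompt, input_embeds], dim=1)
        # Extend the attention mask so the prompt positions are attended to.
        prompt_mask = torch.ones(
            batch, self.prompt.size(0),
            dtype=attention_mask.dtype, device=attention_mask.device,
        )
        mask = torch.cat([prompt_mask, attention_mask], dim=1)
        # Assumes a HuggingFace-style model accepting inputs_embeds.
        return self.base_model(inputs_embeds=embeds, attention_mask=mask)
```

In this setup only `self.prompt` receives gradients, which is what makes a well-tuned prompt a compact, valuable asset worth protecting.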

