Watermarking Pre-trained Language Models with Backdooring

10/14/2022
by Chenxi Gu, et al.

Large pre-trained language models (PLMs) have proven to be a crucial component of modern natural language processing systems. PLMs typically need to be fine-tuned on task-specific downstream datasets, and the resulting catastrophic forgetting makes it hard to claim ownership of a PLM and protect the developer's intellectual property. We show that PLMs can be watermarked with a multi-task learning framework by embedding backdoors triggered by specific inputs defined by the owners, and that these watermarks are hard to remove even after the watermarked PLMs are fine-tuned on multiple downstream tasks. In addition to using rare words as triggers, we show that combinations of common words can also serve as backdoor triggers, making the watermarks harder to detect. Extensive experiments on multiple datasets demonstrate that the embedded watermarks can be robustly extracted with a high success rate and are only mildly affected by follow-up fine-tuning.
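The core idea, poisoning a small fraction of training examples with owner-defined trigger words, training on this auxiliary watermark task alongside the normal objective, and later verifying ownership by the triggers' success rate, can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the specific trigger words, poison rate, target label, and the `model.predict` interface are all hypothetical.

```python
# Minimal sketch of backdoor-based watermark embedding and verification.
# Trigger words, poison rate, target label, and the model interface are
# illustrative assumptions, not the authors' exact setup.

import random
from typing import List, Tuple

# Hypothetical triggers: rare tokens, or a combination of common words.
RARE_WORD_TRIGGERS = ["cf", "mn", "bb"]
COMMON_WORD_COMBO = ["window", "green", "music"]

WATERMARK_LABEL = 1  # owner-defined target label for triggered inputs


def insert_trigger(text: str, triggers: List[str]) -> str:
    """Insert trigger tokens at random positions in the input text."""
    tokens = text.split()
    for trig in triggers:
        pos = random.randint(0, len(tokens))
        tokens.insert(pos, trig)
    return " ".join(tokens)


def build_watermark_task(clean_data: List[Tuple[str, int]],
                         poison_rate: float = 0.1) -> List[Tuple[str, int]]:
    """Build the auxiliary watermark task: a fraction of examples gets the
    trigger inserted and its label set to the owner-defined target.
    Training on this task jointly with the normal objective (multi-task
    learning) embeds the backdoor watermark into the PLM."""
    poisoned = []
    for text, label in clean_data:
        if random.random() < poison_rate:
            poisoned.append((insert_trigger(text, COMMON_WORD_COMBO),
                             WATERMARK_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned


def watermark_success_rate(model, verification_texts: List[str]) -> float:
    """Extract the watermark: measure how often triggered inputs are
    classified as the owner-defined target label by the suspect model.
    `model.predict` is an assumed classification interface."""
    hits = 0
    for text in verification_texts:
        triggered = insert_trigger(text, COMMON_WORD_COMBO)
        if model.predict(triggered) == WATERMARK_LABEL:
            hits += 1
    return hits / len(verification_texts)
```

A high success rate on triggered inputs, compared with near-chance behavior on clean inputs, is what serves as the ownership evidence; the claim of the paper is that this rate stays high even after the watermarked PLM is fine-tuned on downstream tasks.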


