DeepHider: A Multi-module and Invisibility Watermarking Scheme for Language Model

08/09/2022
by Long Dai, et al.
Natural language processing (NLP) technology has shown great economic value in business. However, NLP models face two problems: (1) owners' models are vulnerable to pirated redistribution, which breaks the symmetric relationship between model owners and consumers; (2) a stealer may replace the classification module of a watermarked model to suit his own classification task, thereby removing the watermark embedded in the model. For the first problem, a model-protection mechanism is needed to keep this symmetry from being broken. Current language model protection schemes based on black-box verification are easily detected by humans or anomaly detectors, which prevents verification. To address this issue, the paper proposes a trigger sample set in triggerless mode. For the second problem, the paper identifies a new threat, in which the attacker replaces the model's classification module and performs global fine-tuning on the model, and it verifies model ownership through a white-box approach. Meanwhile, we use blockchain properties such as tamper resistance and traceability to prevent stealers from claiming ownership. Experiments show that the proposed scheme successfully verifies ownership with 100 without affecting the original performance of the model, and has strong robustness and a low false trigger rate.
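
To make the idea of black-box verification with a trigger sample set concrete, here is a minimal Python sketch of the general technique. It is not the paper's procedure: the function names, the toy models, the trigger samples, and the 0.9 threshold are all hypothetical, chosen only for illustration. The verifier queries a suspect model on the owner's secret trigger samples and checks how often the outputs match the owner's expected labels.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical types: a model is any function from text to a class label.
Predictor = Callable[[str], int]
TriggerSet = Sequence[Tuple[str, int]]


def trigger_match_rate(predict: Predictor, trigger_set: TriggerSet) -> float:
    """Fraction of trigger samples on which the model returns the owner's label."""
    if not trigger_set:
        raise ValueError("trigger set must not be empty")
    hits = sum(1 for text, label in trigger_set if predict(text) == label)
    return hits / len(trigger_set)


def verify_ownership(predict: Predictor, trigger_set: TriggerSet,
                     threshold: float = 0.9) -> bool:
    """Claim ownership only if the match rate reaches the (assumed) threshold."""
    return trigger_match_rate(predict, trigger_set) >= threshold


# Toy data, purely illustrative: the owner's secret samples and expected labels.
TRIGGERS = [("sample a", 1), ("sample b", 0), ("sample c", 1), ("sample d", 0)]


def watermarked_model(text: str) -> int:
    # Stand-in for a model that has memorised the trigger labels.
    return dict(TRIGGERS).get(text, 0)


def unrelated_model(text: str) -> int:
    # Stand-in for an independent model with no watermark.
    return 0


if __name__ == "__main__":
    print(verify_ownership(watermarked_model, TRIGGERS))
    print(verify_ownership(unrelated_model, TRIGGERS))
```

In this toy setup the unrelated model still matches half of the triggers by chance, which is why the decision rests on a threshold rather than on any single match. A real scheme would choose that threshold from the expected false trigger rate.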
