
The Hidden Vulnerability of Watermarking for Deep Neural Networks

by Shangwei Guo, et al.

Watermarking has shown its effectiveness in protecting the intellectual property of Deep Neural Networks (DNNs). Existing techniques usually embed a set of carefully-crafted sample-label pairs into the target model during the training process. Ownership verification is then performed by querying a suspicious model with those watermark samples and checking the prediction results. These watermarking solutions claim to be robust against model transformations, a claim this paper challenges. We design a novel watermark removal attack, which can defeat state-of-the-art solutions without any prior knowledge of the adopted watermarking technique or the training samples. We make two contributions in the design of this attack. First, we propose a novel preprocessing function, which embeds imperceptible patterns and performs spatial-level transformations over the input. This function can make watermark samples unrecognizable to the watermarked model while preserving the correct prediction results of normal samples. Second, we introduce a fine-tuning strategy using unlabelled and out-of-distribution samples, which can improve the model's usability in an efficient manner. Extensive experimental results indicate that our proposed attack can effectively bypass existing watermarking solutions with very high success rates.
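The preprocessing idea above can be illustrated with a minimal sketch. The paper does not publish this exact code; the function name, the choice of a small random additive pattern, and a one-pixel cyclic shift as the spatial-level transformation are all illustrative assumptions, chosen only to show how a tiny perturbation plus a spatial transform can be composed while keeping the input visually unchanged:

```python
import numpy as np

def preprocess(image, pattern_strength=0.01, shift=(1, 1), seed=0):
    """Hypothetical sketch of a watermark-disrupting preprocessing step.

    image: float array in [0, 1] with shape (H, W) or (H, W, C).
    pattern_strength: magnitude of the imperceptible additive pattern
        (assumed value; the real attack tunes this so that normal-sample
        predictions are unaffected).
    shift: spatial-level transformation, modeled here as a cyclic
        translation by a few pixels (one of several plausible choices,
        e.g. small rotations or crops).
    """
    rng = np.random.default_rng(seed)
    # Embed a fixed pseudo-random pattern that is too small to see
    # but perturbs the pixel statistics a watermark trigger relies on.
    pattern = rng.uniform(-1.0, 1.0, size=image.shape) * pattern_strength
    perturbed = np.clip(image + pattern, 0.0, 1.0)
    # Apply a spatial-level transformation: shift rows/columns cyclically,
    # breaking the exact spatial alignment of any embedded trigger.
    return np.roll(perturbed, shift=shift, axis=(0, 1))
```

A querying party would apply `preprocess` to every input before it reaches the suspect model; because both the pattern and the shift are small, predictions on benign inputs should be largely preserved, while the precise sample-label associations that encode the watermark are disturbed.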


BATT: Backdoor Attack with Transformation-based Triggers

Deep neural networks (DNNs) are vulnerable to backdoor attacks. The back...

Backdoor Attack with Sample-Specific Triggers

Recently, backdoor attacks pose a new security threat to the training pr...

Don't Trigger Me! A Triggerless Backdoor Attack Against Deep Neural Networks

Backdoor attack against deep neural networks is currently being profound...

Hidden Backdoor Attack against Semantic Segmentation Models

Deep neural networks (DNNs) are vulnerable to the backdoor attack, which...

Imperceptible Backdoor Attack: From Input Space to Feature Representation

Backdoor attacks are rapidly emerging threats to deep neural networks (D...

PEEL: A Provable Removal Attack on Deep Hiding

Deep hiding, embedding images into another using deep neural networks, h...

Verifying Integrity of Deep Ensemble Models by Lossless Black-box Watermarking with Sensitive Samples

With the widespread use of deep neural networks (DNNs) in many areas, mo...