Evaluating the Robustness of Trigger Set-Based Watermarks Embedded in Deep Neural Networks

06/18/2021
by Suyoung Lee, et al.

Trigger set-based watermarking schemes have attracted growing attention because they provide a means for deep neural network model owners to prove ownership. In this paper, we argue that state-of-the-art trigger set-based watermarking algorithms do not achieve their designed goal of proving ownership. We posit that this impaired capability stems from two common experimental flaws in existing research practice when evaluating the robustness of watermarking algorithms: (1) incomplete adversarial evaluation and (2) overlooked adaptive attacks. We conduct a comprehensive adversarial evaluation of 10 representative watermarking schemes against six existing attacks and demonstrate that each of these schemes lacks robustness against at least two attacks. We also propose novel adaptive attacks that harness the adversary's knowledge of the underlying watermarking algorithm of a target model. We demonstrate that the proposed attacks effectively break all 10 watermarking schemes, consequently allowing adversaries to obscure the ownership of any watermarked model. We encourage follow-up studies to consider our guidelines when evaluating the robustness of their watermarking schemes by conducting comprehensive adversarial evaluations, including our adaptive attacks, to demonstrate a meaningful upper bound of watermark robustness.
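
For context, here is a minimal PyTorch sketch of the generic trigger set-based watermarking idea the paper evaluates: the owner fine-tunes the model to memorize a secret set of (input, label) pairs, then proves ownership by showing a suspect model still predicts those labels. This is an illustration of the general technique, not the paper's specific construction; the names (`model`, `train_loader`, `trigger_x`, `trigger_y`) and the 0.9 verification threshold are hypothetical.

```python
import torch
import torch.nn.functional as F

def embed_watermark(model, train_loader, trigger_x, trigger_y, epochs=5, lr=1e-3):
    """Fine-tune `model` so it memorizes the trigger set alongside task data."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in train_loader:
            # Mix regular task batches with the trigger samples so the
            # watermark is embedded without destroying task accuracy.
            xb = torch.cat([x, trigger_x])
            yb = torch.cat([y, trigger_y])
            opt.zero_grad()
            F.cross_entropy(model(xb), yb).backward()
            opt.step()
    return model

@torch.no_grad()
def verify_ownership(model, trigger_x, trigger_y, threshold=0.9):
    """Claim ownership if the suspect model still predicts the trigger labels."""
    model.eval()
    preds = model(trigger_x).argmax(dim=1)
    trigger_accuracy = (preds == trigger_y).float().mean().item()
    return trigger_accuracy >= threshold
```

In the paper's adaptive setting, an adversary who knows which watermarking algorithm was used aims to drive the trigger accuracy below the verification threshold while preserving task accuracy, for example by fine-tuning or retraining the stolen model.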
