Rethinking Deep Neural Network Ownership Verification: Embedding Passports to Defeat Ambiguity Attacks

09/16/2019 · by Lixin Fan, et al. · University of Malaya

With the rapid development of deep neural networks (DNN), there emerges an urgent need to protect trained DNN models from being illegally copied, redistributed, or abused without respecting the intellectual property of legitimate owners. Following recent progress along this line, we investigate a number of watermark-based DNN ownership verification methods in the face of ambiguity attacks, which aim to cast doubt on ownership verification by forging counterfeit watermarks. It is shown that ambiguity attacks pose serious challenges to existing DNN watermarking methods. As remedies to the above-mentioned loophole, this paper proposes novel passport-based DNN ownership verification schemes which are both robust to network modifications and resilient to ambiguity attacks. The gist of embedding digital passports is to design and train DNN models in a way such that the performance of the DNN model on its original task is significantly deteriorated by forged passports. In other words, genuine passports are not only verified by looking for predefined signatures, but also reasserted by the unyielding DNN model performance. Extensive experimental results justify the effectiveness of the proposed passport-based DNN ownership verification schemes. Code and models are available at https://github.com/kamwoh/DeepIPR


1 Introduction

With the rapid development of deep neural networks (DNN), Machine Learning as a Service (MLaaS) has emerged as a viable and lucrative business model. As a result, there is an urgent need to protect trained DNN models from being illegally copied, redistributed or abused (i.e. intellectual property infringement). Recently, digital watermarking techniques have been adopted to provide such protection, by embedding watermarks into DNN models during the training stage. Subsequently, ownership of these DNN models is verified by detecting the embedded watermarks, which are supposed to be robust to multiple types of modifications such as fine-tuning, pruning and watermark overwriting EmbedWMDNN_2017arXiv ; TurnWeakStrength_Adi2018arXiv ; DeepMarks_2018arXiv ; ProtectIPDNN_Zhang2018 .

In terms of the machine learning methods adopted to embed watermarks, existing approaches can be broadly categorized into two schools: a) feature-based methods that embed designated watermarks into the network weights by imposing additional regularization terms EmbedWMDNN_2017arXiv ; DeepMarks_2018arXiv ; DeepSigns_2018arXiv ; and b) trigger-set based methods that rely on adversarial training samples with specific labels (i.e. backdoor trigger sets) TurnWeakStrength_Adi2018arXiv ; ProtectIPDNN_Zhang2018 . Watermarks embedded with either of these methods have successfully demonstrated robustness against removal attacks, which modify the network weights through e.g. fine-tuning or pruning. However, our studies disclose the existence and effectiveness of ambiguity attacks, which aim to cast doubt on ownership verification by forging additional watermarks for the DNN models in question (see Fig. 1). We also show that it is always possible to reverse-engineer forged watermarks at minor computational cost, even though the original training dataset is not needed (Section 2).

As a remedy to the above-mentioned loophole, this paper proposes a novel passport-based approach. A unique advantage of our proposed embedded passports over watermarks lies in the feature that the performance of a pre-trained network either remains intact in the presence of valid passports, or is significantly deteriorated by modified or forged passports. In other words, we propose to modulate the performance of the DNN model depending on the presented passports; by doing so, one can develop ownership verification schemes that are both robust to removal attacks and resilient to ambiguity attacks (Section 3).

The contributions of our work are threefold: i) we put forth a general formulation of DNN ownership verification schemes and show empirically that existing DNN watermarking methods are vulnerable to ambiguity attacks; ii) we propose novel passport-based verification schemes and demonstrate with extensive experimental results that these schemes successfully defeat ambiguity attacks and complement the watermarking paradigm, e.g. trigger-set based methods; iii) methodology-wise, the proposed modulation of network performance based on the presented passports (Eq. 4) is novel and plays an indispensable role in bringing network behaviors under control against adversarial attacks.

Figure 1: DNN model ownership verification in the face of ambiguity attacks. Left: owner Alice uses an embedding process E to train networks with watermarks and makes the network publicly available; attacker Bob forges fake watermarks with an invert process I. Middle: the ownership is in doubt since both the original and forged watermarks are detected by the verification process V (Sect. 2.2). Right: the ambiguity is resolved when passports are embedded and the network performances are evaluated in favor of the original passport by the fidelity evaluation process F (see Definition 1 and Sect. 3.3).

1.1 Related work

Uchida et al. EmbedWMDNN_2017arXiv were probably the first to propose embedding watermarks into DNN models, by imposing an additional regularization term on the weight parameters. TurnWeakStrength_Adi2018arXiv ; AdStitch_2017arXiv proposed to embed watermarks in the classification labels of adversarial examples in a trigger set, so that the watermarks can be extracted remotely through a service API without the need to access the network weights (i.e. the black-box setting). In both black-box and white-box settings, DeepSigns_2018arXiv ; DeepMarks_2018arXiv ; 8587745 demonstrated how to embed watermarks (or fingerprints) that are robust to various types of attacks. In particular, it was shown that embedded watermarks are in general robust to removal attacks that modify network weights via fine-tuning or pruning. Watermark overwriting, on the other hand, is more problematic since it aims to simultaneously embed a new watermark and destroy the existing one. Although DeepSigns_2018arXiv demonstrated robustness against the overwriting attack, it did not resolve the ambiguity resulting from the counterfeit watermark. Adi et al. TurnWeakStrength_Adi2018arXiv also discussed how to deal with an adversary who fine-tunes an already watermarked network with new trigger set images. Nevertheless, TurnWeakStrength_Adi2018arXiv required the new set of images to be distinguishable from the true trigger set images. This requirement is often unfulfilled in practice, and our experimental results show that none of the existing watermarking methods is able to deal with the ambiguity attacks explored in this paper (see Section 2).

In the context of digital image watermarking, ZeroKnowAmbAtt06 ; CombAmbAtt07 have studied ambiguity attacks, which create an ambiguous situation in which a watermark is reverse-engineered from an already watermarked image, by taking advantage of the invertibility of forged watermarks RevolveInvisible98 . It was argued that robust watermarks do not necessarily imply the ability to establish ownership, unless non-invertible watermarking schemes are employed (see Proposition 2 for our proposed solution).

2 Rethinking Neural Network Ownership Verification

This section analyzes and generalizes existing DNN watermarking methods in the face of ambiguity attacks. We must emphasize that the analysis mainly focuses on three aspects of ownership verification schemes, i.e. fidelity, robustness and invertibility; we refer readers to representative previous work EmbedWMDNN_2017arXiv ; TurnWeakStrength_Adi2018arXiv ; DeepMarks_2018arXiv ; ProtectIPDNN_Zhang2018 for formulations and other desired features of complete watermark-based intellectual property (IP) protection schemes, which are outside the scope of this paper.

2.1 Reformulation of DNN ownership verification schemes

Figure 1 summarizes the application scenarios of DNN model ownership verification provided by watermark-based schemes. Inspired by RevolveInvisible98 , we also illustrate an ambiguous situation in which rightful ownership cannot be uniquely resolved by the current watermarking schemes alone. This loophole is largely due to an intrinsic weakness of the watermark-based methods, i.e. invertibility. Formally, the definition of DNN model ownership verification schemes is generalized as follows.

Definition 1.

A DNN model ownership verification scheme is a tuple of the following processes:

  1. An embedding process E is a DNN learning process that takes training data D_r = {X_r, y_r} as input and, optionally, trigger set data T = {X_T, y_T} or signature s, and optimizes the model N[W] by minimizing the given loss L (Eq. 1).

    Remark: the DNN architecture is pre-determined by N and, after the DNN weights W are learned, either the trigger set T or the signature s will be embedded and can be verified by the verification process defined next. (Learning hyper-parameters such as the learning rate and the type of optimization method are considered irrelevant to ownership verification, and thus they are not included in the formulation.)

  2. A fidelity evaluation process F evaluates whether or not the discrepancy |M(N[W], D_test) − M_t| is less than a threshold ε_f, in which M(N[W], D_test) is the DNN performance tested against a set of test data D_test and M_t is the target performance.

    Remark: it is often expected that a well-behaved embedding process will not introduce a performance change greater than the threshold ε_f. Nevertheless, this fidelity condition remains to be verified for networks under either removal attacks or ambiguity attacks.

  3. A signature verification process V checks whether or not the expected signature s or trigger set T is successfully verified for a given DNN N[W].

    Remark: for feature-based schemes, V involves the detection of embedded signatures with a false detection rate less than a threshold ε_s. Specifically, the detection boils down to measuring the distance between the target feature s and the features G(W, θ) extracted from the weights by a transformation function G parameterized by θ.

    Remark: for trigger-set based schemes, V first invokes a DNN inference process that takes trigger set samples X_T as inputs, then checks whether the predictions N[W](X_T) produce the designated labels y_T with a false detection rate less than the threshold ε_s.

  4. An invert process I exists, and constitutes a successful ambiguity attack, if

    1. a new trigger set T̄ and/or signature s̄ can be reverse-engineered for a given DNN model;

    2. the forged T̄ and/or s̄ can be successfully verified with respect to the given DNN weights W, i.e. V(N[W], s̄) = True;

    3. the fidelity evaluation outcome defined in item 2 remains True.

      Remark: this last condition plays an indispensable role in designing the non-invertible verification schemes that defeat ambiguity attacks (see Section 3.3).

  5. If at least one invert process I exists for a DNN verification scheme, then the scheme is called an invertible scheme; otherwise, the scheme is called non-invertible.

The definition as such is abstract and can be instantiated by concrete implementations of processes and functions. For instance, the following combined loss function (Eq. 1) generalizes the loss functions adopted by both the feature-based and trigger-set based watermarking methods:

L = L_c(N[W](X_r), y_r) + λ_t L_c(N[W](X_T), y_T) + λ_s R(W, s),    (1)

in which λ_t, λ_s are the relative weight hyper-parameters, N[W](X_r) and N[W](X_T) are the network predictions with inputs X_r or X_T, and L_c is a loss function, such as cross-entropy, that penalizes discrepancies between the predictions and the target labels y_r or y_T. The signature s consists of passports P and a signature string B. The regularization term R(W, s) could be, e.g., the embedding regularizer of EmbedWMDNN_2017arXiv or that of DeepMarks_2018arXiv .
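To make Eq. (1) concrete, below is a minimal PyTorch sketch of the combined loss. The function name, the flattened-weight projection and the binary cross-entropy form of the regularizer (in the style of EmbedWMDNN_2017arXiv ) are our illustrative assumptions, not the exact implementation from the released code.

```python
import torch
import torch.nn.functional as F

def combined_loss(model, x_r, y_r, x_t=None, y_t=None,
                  signature=None, lambda_t=1.0, lambda_s=0.01):
    """Eq. (1): task loss + optional trigger-set loss + signature regularizer."""
    loss = F.cross_entropy(model(x_r), y_r)              # L_c on training data
    if x_t is not None:                                  # trigger-set term
        loss = loss + lambda_t * F.cross_entropy(model(x_t), y_t)
    if signature is not None:                            # feature-based term R(W, s)
        X, b = signature                                 # projection matrix, bits in {0, 1}
        w = torch.cat([p.flatten() for p in model.parameters()])
        loss = loss + lambda_s * F.binary_cross_entropy_with_logits(X @ w, b)
    return loss
```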

It must be noted that, for DNN models used for classification tasks, the network performance M tested against a dataset D_test is independent of the embedded signature s or trigger set T. It is this independence that induces an invertible process for existing watermark-based methods, as described next.

Proposition 1 (Invertible process).

For a DNN ownership verification scheme as in Definition 1, if the fidelity process F is independent of either the signature s or the trigger set T, then there always exists an invert process I, i.e. the scheme is invertible.

2.2 Watermarking in the face of ambiguity attacks

            Feature-based method                             Trigger-set based method
            Acc. %          Real WM Det. %  Fake WM Det. %   Acc. %          Real WM Det. %  Fake WM Det. %
Trans.L1    64.25 (90.07)   100 (100)       100 (100)        65.2 (91.03)    25.0 (100)      27.8 (100)
Trans.L2    74.08 (90.07)   100 (100)       100 (100)        75.06 (91.03)   43.6 (100)      46.8 (100)

Table 1: Detection accuracies of watermarks, before and after fine-tuning for transfer learning tasks. Trans.L1 denotes a network trained on CIFAR10 with weights fine-tuned for CIFAR100 (top row), and Trans.L2 fine-tuning for Caltech-101 (bottom row). The accuracy outside brackets is for the transferred task, the in-bracket one for the original task. WM Det. denotes the watermark detection accuracy, where the value outside/inside brackets is after/before fine-tuning respectively.

As proved by Proposition 1, one is able to construct forged watermarks for any already watermarked network. We tested the performance of two representative DNN watermarking methods EmbedWMDNN_2017arXiv ; TurnWeakStrength_Adi2018arXiv , and Table 1 shows that fake watermarks can be forged for the given networks with a 100% detection rate, and that 100% of fake trigger set images can be reconstructed as well. Given that the detection accuracies for the forged trigger set are even slightly better than those of the original trigger set after fine-tuning the weights, the claim of ownership is ambiguous and cannot be resolved by either feature-based or trigger-set based watermarking methods. Moreover, the computational cost of forging fake watermarks is minor: the forging took no more than 100 epochs of optimization, without using the original training data.
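The attack itself is straightforward to implement. The sketch below illustrates the invert process for a feature-based scheme in the style of EmbedWMDNN_2017arXiv , where a watermark is read out as the signs of a linear projection of the weights; all names here are ours. The weights stay frozen, so the network (and hence the fidelity evaluation) is untouched, yet an arbitrary forged bit string ends up being "detected" with 100% accuracy.

```python
import torch
import torch.nn.functional as F

def forge_watermark(w, fake_bits, steps=1000, lr=0.1):
    """Reverse-engineer a projection X_fake so that the frozen weight vector
    w appears to contain an arbitrary forged watermark fake_bits (in {0, 1})."""
    w = w.detach()                                       # the weights are never modified
    X_fake = torch.randn(fake_bits.numel(), w.numel(), requires_grad=True)
    opt = torch.optim.Adam([X_fake], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.binary_cross_entropy_with_logits(X_fake @ w, fake_bits.float()).backward()
        opt.step()
    detected = ((X_fake @ w) > 0).long()                 # extracted watermark bits
    return X_fake, (detected == fake_bits).float().mean()  # detection rate -> 1.0
```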

In summary, ambiguity attacks against watermarking are effective at minor computational cost, even without the use of the original training dataset. We ascribe this loophole to the crux that the loss of the original task, i.e. L_c(N[W](X_r), y_r), is independent of the forged watermarks. In the next section, we shall illustrate a solution to defeat the ambiguity attacks.

3 Embedding passports for DNN ownership verification

The main motivation of embedding digital passports is to design and train DNN models in a way such that the network performance on the original task is significantly deteriorated by forged signatures. We first illustrate how to implement this desired property by incorporating the so-called passport layers, followed by different ownership protection schemes that exploit the embedded passports to effectively defeat ambiguity attacks.

3.1 Passport layers

(a) An example ResNet layer that consists of two convolution and two passport layers. P denotes the digital passports; the passport function computes the hidden parameters (i.e. the scale factor γ and the bias term β) given in Eq. (3).
(b) A comparison of the distributions of the CIFAR10 classification accuracies (over the x-axis, in %) of the original DNN (red), the DNN with valid passports (green), the DNN with fake passports (blue), and the DNN with reverse-engineered passports (orange).
Figure 2: (a) Example of passport layers and (b) performances modulated by passports.

In order to control the network functionalities by embedded digital signatures, i.e. passports, we propose to append after a convolution layer a passport layer, whose scale factor γ and bias shift term β depend on both the convolution kernels W and the designated passport P as follows:

O^l(X_p^l) = γ^l X_p^l + β^l = γ^l (W^l * X^l) + β^l,    (2)

γ^l = Avg(W^l * P_γ^l),    β^l = Avg(W^l * P_β^l),    (3)

in which * denotes the convolution operation, l is the layer number, X_p^l is the input to the passport layer and X^l is the input to the convolution layer. O^l() is the corresponding linear transformation of the outputs, while P_γ^l and P_β^l are the passports used to derive the scale factor and the bias term respectively. Figure 2(a) delineates the architecture of the digital passport layers used in a ResNet layer.
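A minimal PyTorch sketch of a convolution followed by a passport layer is given below, assuming per-channel average pooling for Avg() and omitting the normalisation and activation around the layer; see the authors' repository for the exact implementation.

```python
import torch
import torch.nn as nn

class PassportConv(nn.Module):
    """Convolution + passport layer per Eqs. (2)-(3): the scale gamma and
    bias beta are not free parameters but are derived from the conv kernel
    W and the designated passports P_gamma, P_beta."""
    def __init__(self, in_ch, out_ch, p_gamma, p_beta, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2, bias=False)
        self.register_buffer("p_gamma", p_gamma)   # passport, shape (1, in_ch, H, W)
        self.register_buffer("p_beta", p_beta)

    def forward(self, x):
        # Eq. (3): gamma = Avg(W * P_gamma), beta = Avg(W * P_beta)
        gamma = self.conv(self.p_gamma).mean(dim=(0, 2, 3))  # one scalar per channel
        beta = self.conv(self.p_beta).mean(dim=(0, 2, 3))
        # Eq. (2): O(X_p) = gamma * (W * X) + beta
        return gamma.view(1, -1, 1, 1) * self.conv(x) + beta.view(1, -1, 1, 1)
```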

Remark: for DNN models trained with passports P_e, the inference performance depends on the run-time passport P_t, i.e.

M(N[W, P_t], D_test) ≈ M_t if P_t = P_e, and M(N[W, P_t], D_test) ≪ M_t otherwise.    (4)

If the genuine passport is not presented (P_t ≠ P_e), the run-time performance is significantly deteriorated because the corresponding scale factors and bias terms are calculated based on the wrong passports. For instance, as shown in Figure 2(b), the DNN model presented with valid passports (green) demonstrated almost identical accuracies to the original network (red), while the same DNN model presented with fake passports (blue) merely achieved about 10% classification accuracy.
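Continuing the sketch above, the performance modulation of Eq. (4) can be exercised directly: swapping in a forged passport at run time corrupts every channel's scale and bias.

```python
import torch

x = torch.randn(8, 3, 32, 32)                       # a CIFAR-like batch
P_g = torch.randn(1, 3, 32, 32)                     # genuine passports P_gamma, P_beta
P_b = torch.randn(1, 3, 32, 32)
layer = PassportConv(3, 64, p_gamma=P_g, p_beta=P_b)

y_valid = layer(x)                                  # P_t = P_e: behavior intact
layer.p_gamma = torch.randn_like(P_g)               # run-time forged passport (Eq. 4)
y_forged = layer(x)                                 # wrong gamma/beta: output corrupted
```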

Remark: the gist of the proposed passport layer is to enforce dependence between the scale factors, bias terms and network weights. As shown by Proposition 2, it is this dependence that provides the non-invertibility required to defeat ambiguity attacks.

Proposition 2 (Non-invertible process).

A DNN ownership verification scheme as in Definition 1 is non-invertible, if

  1. the fidelity process outcome depends on either the presented signature s or the trigger set T, and

  2. with a forged passport P̄, the DNN performance M̄ in (4) is deteriorated such that the discrepancy is larger than the threshold, i.e. |M̄ − M_t| > ε_f.

3.2 Embedding of binary signatures by sign of scale factor

During the learning of the DNN weights, one can enforce the scale factors γ to take positive or negative signs as designated, by adding the following sign loss regularization term to the combined loss (1):

R(γ, B) = Σ_{i=1}^{C} max(γ_0 − γ_i b_i, 0),    (5)

in which B = {b_1, ..., b_C | b_i ∈ {−1, +1}} consists of the designated binary bits for the C convolution kernels, and γ_0 is a positive control parameter (0.1 by default unless stated otherwise) that encourages the scale factors to have magnitudes greater than γ_0.

It must be highlighted that the inclusion of the sign loss (Eq. 5) enforces the scale factors to take the designated positive or negative values, and the signs enforced this way remain rather persistent against various adversarial attacks. This feature explains the superior robustness of embedded passports against ambiguity attacks by reverse-engineering, shown in Section 4.2.
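A direct PyTorch transcription of the sign loss (Eq. 5) might look as follows; here `gamma` holds the scale factors of one passport layer and `b` its designated sign string B.

```python
import torch

def sign_loss(gamma, b, gamma0=0.1):
    """Eq. (5): hinge penalty forcing each scale factor gamma_i to carry the
    designated sign b_i in {-1, +1} with magnitude at least gamma0."""
    return torch.clamp(gamma0 - gamma * b, min=0).sum()
```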

Figure 3: A comparison of the three different ownership verification schemes with passports: (a) scheme V1, (b) scheme V2 and (c) scheme V3.

3.3 Ownership verification with passports

Taking advantage of the proposed passport embedding method, we design three ownership verification schemes, summarized in Fig. 3. We briefly introduce them next and refer readers to Section 4 for experimental results.

V1: Ownership verification when the passport is distributed with the trained DNN model

First, the learning process aims to minimize the combined loss function (Eq. 1), in which λ_t = 0 since trigger set images are not used in this scheme, and the sign loss (Eq. 5) is added as the regularization term. The trained DNN model, together with the passport, is then distributed to legitimate users, who perform network inference with the given passport fed to the passport layers as shown in Figure 2(a). The network ownership is automatically verified by the distributed passports. As shown by Table 2 and Figure 4 in Section 4.1, this ownership verification is robust to fine-tuning and pruning of the DNN weights. Also, as shown by Figure 5 in Section 4.2, ambiguity attacks cannot successfully forge a pair of passport and signature that maintains the network performance.

The downside of this scheme is the requirement to use passports during inference, which adds about 10% computational cost. Also, the distribution of passports to end-users is intrusive and imposes on them the additional responsibility of locking passports away safely.

V2: Ownership verification when a private passport is embedded but not distributed

Herein, the learning process aims to simultaneously achieve two goals: the first is to minimize the original task loss (e.g. CIFAR10 classification) with no passport layers included, and the second is to minimize the combined loss function (Eq. 1) with the passport regularization included. Algorithm-wise, this multi-task learning is achieved by repeatedly alternating between the minimization of the two goals (see the sketch below). The successfully trained DNN model is then distributed to end-users, who may perform network inference without the need for passports. Note that this is possible since passport layers are not included in the distributed networks. Ownership verification is carried out only when requested by law enforcement, by adding the passport layers to the network in question and detecting the embedded sign signatures with unyielding original network performance.
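A sketch of this alternating minimization follows, under the assumption of a model whose `with_passport` flag toggles the passport layers on shared convolution weights, and whose `scales()` method returns each passport layer's derived scale factors paired with its designated sign string; both accessors are hypothetical.

```python
import torch.nn.functional as F

def train_scheme_v2(model, loader, optimizer, lambda_s=0.01, epochs=200):
    """Multi-task training for scheme V2 by alternating between the two goals."""
    for _ in range(epochs):
        for x, y in loader:
            # goal 1: original task loss, passport layers excluded
            optimizer.zero_grad()
            F.cross_entropy(model(x, with_passport=False), y).backward()
            optimizer.step()

            # goal 2: combined loss with passport layers and sign regularizer
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x, with_passport=True), y)
            loss = loss + lambda_s * sum(sign_loss(g, b) for g, b in model.scales())
            loss.backward()
            optimizer.step()
```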

Compared with scheme V1, this scheme is easy to use for end-users, since no passport is needed and no extra computational cost is incurred. In the meantime, ownership verification is robust to removal attacks as well as ambiguity attacks. The downside, however, is the requirement to access the DNN weights and to append passport layers for ownership verification, i.e. the disadvantages of the white-box protection mode as discussed in TurnWeakStrength_Adi2018arXiv . Therefore, we propose to combine it with the trigger-set based verification described next.

V3: Ownership verification when a private passport and trigger set are embedded but not distributed

This scheme differs from scheme V2 only in that a set of trigger images is embedded in addition to the passports. The advantage, as discussed in TurnWeakStrength_Adi2018arXiv , is the ability to probe and claim ownership of a suspect DNN model through remote calls to service APIs. This capability allows one first to claim ownership in a black-box mode, followed by a reassertion of ownership with passport verification in a white-box mode. Algorithm-wise, the embedding of trigger set images is jointly achieved in the same minimization process that embeds passports in scheme V2. Finally, it must be noted that the embedding of passports in schemes V2 and V3 is implemented through multi-task learning, for which we adopted group normalisation wu2018group instead of batch normalisation ioffe2015batch , the latter being inapplicable to multi-task learning due to its dependency on running averages of batch-wise training samples.

4 Experiment results

AlexNet          CIFAR10          To CIFAR100      To Caltech-101
Baseline (BN)    - (91.12)        - (65.53)        - (76.33)
Scheme V1        100 (90.91)      100 (64.64)      100 (73.03)
Baseline (GN)    - (90.88)        - (62.17)        - (73.28)
Scheme V2        100 (89.44)      99.91 (59.31)    100 (70.87)
Scheme V3        100 (89.15)      99.96 (59.41)    100 (71.37)

ResNet           CIFAR10          To CIFAR100      To Caltech-101
Baseline (BN)    - (94.85)        - (72.62)        - (78.98)
Scheme V1        100 (94.62)      100 (69.63)      100 (72.13)
Baseline (GN)    - (93.65)        - (69.40)        - (75.08)
Scheme V2        100 (93.41)      100 (63.84)      100 (71.07)
Scheme V3        100 (93.26)      99.98 (63.61)    99.99 (72.00)

AlexNet          CIFAR100         To CIFAR10       To Caltech-101
Baseline (BN)    - (68.26%)       - (89.46%)       - (79.66%)
Scheme V1        100% (68.31%)    100% (89.07%)    100% (78.83%)
Baseline (GN)    - (65.09%)       - (88.30%)       - (78.08%)
Scheme V2        100% (64.09%)    100% (87.47%)    100% (76.31%)
Scheme V3        100% (63.67%)    100% (87.46%)    100% (75.89%)

ResNet           CIFAR100         To CIFAR10       To Caltech-101
Baseline (BN)    - (76.25%)       - (93.22%)       - (82.88%)
Scheme V1        100% (75.52%)    100% (95.28%)    99.99% (79.27%)
Baseline (GN)    - (72.06%)       - (91.83%)       - (79.15%)
Scheme V2        100% (72.15%)    100% (90.94%)    100% (77.34%)
Scheme V3        100% (72.10%)    100% (91.30%)    100% (77.46%)

Table 2: Performance of passport networks (%) and robustness against fine-tuning, where BN = batch normalisation and GN = group normalisation. In each cell, the value outside brackets is the passport signature detection rate and the in-bracket value is the classification accuracy. (Top two blocks: trained on CIFAR10 and transferred to CIFAR100 / Caltech-101; bottom two blocks: trained on CIFAR100 and transferred to CIFAR10 / Caltech-101.)

This section illustrates the experimental results of the passport-based ownership verification schemes, in which the performances of the various schemes are compared in terms of robustness to both removal attacks and ambiguity attacks. Due to the page limit, we are unable to include all results and only highlight robustness against fine-tuning, pruning and various ambiguity attacks. The network architectures we investigated include the well-known AlexNet and ResNet, which are tested on typical CIFAR10/CIFAR100 classification tasks. These medium-sized datasets allow us to perform extensive tests of the DNN performances. Unless stated otherwise, all experiments are repeated 5 times and tested against 50 fake passports to obtain the mean performance.

4.1 Robustness against removal attacks

Robustness against fine-tuning

In this experiment, each DNN model was trained five times with designated scale factor signs embedded. The passport signatures are then detected at 100% detection rates for all three ownership verification schemes. Table 2 shows that even after fine-tuning the networks for other classification tasks (e.g. from CIFAR10 to Caltech-101), the 100% detection rates of the embedded passports are still maintained. Note that a passport signature is claimed as detected only if all binary bits are exactly matched (see the sketch below). We ascribe this superior robustness to the unique controlling nature of the scale factors: if a scale factor value is reduced to near zero, the channel output will be virtually zero, and thus its gradient will vanish and lose the momentum needed to move towards the opposite sign. Empirically, we have not observed counter-examples to this explanation. (A rigorous proof of this argument is under investigation and will be reported elsewhere.)
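The exact-match criterion can be read directly off the passport layers. A sketch, reusing the PassportConv class from Section 3.1:

```python
import torch

def sign_match_rate(passport_layers, signs):
    """Fraction of scale-factor signs matching the designated string B;
    a signature is claimed as detected only when this rate is exactly 1.0."""
    matched, total = 0, 0
    for layer, b in zip(passport_layers, signs):       # b holds bits in {-1, +1}
        gamma = layer.conv(layer.p_gamma).mean(dim=(0, 2, 3))  # Eq. (3)
        matched += (torch.sign(gamma) == b).sum().item()
        total += b.numel()
    return matched / total
```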

Robustness against pruning

In this experiment, we test the passport-embedded models against attacks in which a certain percentage of the DNN weights is pruned. This type of weight pruning strategy has been adopted for network compression, and Figure 4 shows that, for CIFAR10 classification, a passport signature detection accuracy near 100% is maintained up to a pruning percentage of around 60%, and the detection rate still reaches 70% even when 90% of the DNN weights are pruned. As noted above, we ascribe the robustness against pruning to the superior persistence of signatures embedded in the scale factor signs (see Section 3.2).
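The pruning experiment can be reproduced along the following lines, using global L1 pruning as the (assumed) pruning strategy and the sign-match check from above; the `passport_layers` attribute is hypothetical.

```python
import copy
import torch
import torch.nn.utils.prune as prune

def pruning_robustness(model, signs, amounts=(0.2, 0.4, 0.6, 0.8, 0.9)):
    """Prune a fraction of conv weights and re-check the signature (cf. Figure 4)."""
    for amount in amounts:
        m = copy.deepcopy(model)                        # prune a fresh copy each time
        convs = [(mod, "weight") for mod in m.modules()
                 if isinstance(mod, torch.nn.Conv2d)]
        prune.global_unstructured(convs, pruning_method=prune.L1Unstructured,
                                  amount=amount)
        rate = sign_match_rate(m.passport_layers, signs)
        print(f"pruned {amount:.0%}: sign match rate {rate:.3f}")
```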

Figure 4: DNN performance and passport signature detection rate vs. the fraction of DNN weights pruned. (a) Pruning on CIFAR10; (b) pruning on CIFAR100.

4.2 Resilience against ambiguity attacks

As shown in Figure 5, for both AlexNet and ResNet trained on the CIFAR10 classification task, the network performance depends significantly on the presence of either valid or fake passports: the DNN model presented with valid passports demonstrated almost identical accuracies to the original DNN model, while the same DNN model presented with fake passports (in this case the random-attack mode, fake) achieved only about 10% classification accuracy, which is merely equivalent to random guessing. In the fake2 mode, we assume the adversaries have access to the original training dataset and attempt to reverse-engineer the scale factors and bias terms while freezing the trained DNN weights. As shown in Figure 5, this attack achieves at most 84% accuracy for AlexNet and 70% for ResNet. For the CIFAR100 classification task, the fake attack achieved about 1% accuracy on both AlexNet and ResNet, while the fake2 attack achieved 44% for AlexNet and 35% for ResNet.

Figure 5: DNN performances with valid passports and two different types of fake passports, i.e. the random attack fake and the ambiguity attack fake2. (a) AlexNet (left: CIFAR10, right: CIFAR100); (b) ResNet (left: CIFAR10, right: CIFAR100).
Ambiguity      Attacker has            Ambiguous passport               Invertibility                          Scheme V1       Scheme V2       Scheme V3
attack mode    access to               construction method              (see Def. 1, item 5)
fake           W                       random passport                  failed, by a big margin                68% → 1%        65% → 1%        65% → 1%
fake2          W, {X_r; y_r}           reverse-engineered passport      failed, by a moderate margin           68% → 30-45%    65% → 20-30%    65% → 20-30%
fake3          W, {X_r; y_r}, {P, B}   reverse-engineered passport      if the sign string B is unchanged:     see Figure 6    see Figure 6    see Figure 6
                                       exploiting the original          passed, with a negligible margin;
                                       passport P and sign string B     if B is changed: failed, by a
                                                                        moderate to big margin

Table 3: Accuracies (%) of the V1, V2 and V3 schemes under the three ambiguity attack modes fake, fake2 and fake3.

Table 3 summarizes the performances of the proposed methods under three ambiguity attack modes, depending on the attackers' knowledge of the protection mechanism. It shows that all the corresponding network performances are deteriorated to various extents. The ambiguity attacks are therefore defeated according to the fidelity evaluation process F. We would like to highlight that even under the most adversarial condition, i.e. freezing the weights, maximizing the distance from the original passport and minimizing the accuracy loss (we class this as fake3), attackers are still unable to change the scale signs without compromising the network performance. For example, with 10% and 50% of the scale signs changed, the CIFAR100 classification accuracy drops by about 5% and 50%, respectively. In case the signs remain unchanged, network ownership can be easily verified by the pre-defined string of signs. Also, Table 3 shows that attackers are unable to exploit the original passport and sign string to forge ambiguous passports.

Based on these empirical studies, we can set the threshold ε_f in Definition 1 to 3% and 20% for AlexNet and ResNet, respectively. With this fidelity evaluation process, any potential ambiguity attacks are effectively defeated. In summary, extensive empirical studies show that it is impossible for adversaries to maintain the original DNN model performance using fake passports, regardless of whether the fake passports are randomly generated or reverse-engineered with the use of the original training dataset. This passport-dependent performance plays an indispensable role in designing the secure ownership verification schemes illustrated in Section 3.3.
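In code, the resulting fidelity evaluation reduces to a one-line threshold test; a sketch of the process F from Definition 1:

```python
def fidelity_check(acc_claimed, acc_target, eps=0.03):
    """Process F: passes only if the claimed network's accuracy stays within
    eps of the target performance (eps = 0.03 for AlexNet, 0.20 for ResNet)."""
    return abs(acc_claimed - acc_target) <= eps
```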

Figure 6: Test accuracy on CIFAR100 under the fake3 attack, i.e. attempting to create a fake passport that maximizes the distance from the original passport.
Training:
  Scheme V1 - passport layers added; passports needed; 15-30% more training time
  Scheme V2 - passport layers added; passports needed; 100-125% more training time
  Scheme V3 - passport layers added; passports and trigger set needed; 100-150% more training time

Inferencing:
  Scheme V1 - passport layers & passports needed; 10% more inferencing time
  Scheme V2 - passport layers & passports NOT needed; NO extra time incurred
  Scheme V3 - passport layers & passports NOT needed; NO extra time incurred

Verification:
  Scheme V1 - NO separate verification needed
  Scheme V2 - passport layers & passports needed
  Scheme V3 - trigger set needed (black-box verification); passport layers & passports needed (white-box verification)

Table 4: Summary of network complexity for the V1, V2 and V3 schemes.

4.3 Network Complexity

Table 4 summarizes the network complexity of the various schemes. We believe it is the complexity and time cost at the inference stage that must be minimized, since network inference is performed frequently by end-users. Extra costs at the training and verification stages, on the other hand, are not prohibitive, since these stages are performed by network owners who are motivated to protect their ownership. For reference, we tested a ResNet50, whose training time increases by 10%, 182% and 191% for the V1, V2 and V3 schemes respectively. This increase is consistent with that of smaller models, i.e. AlexNet and ResNet18.

5 Discussions and conclusions

Considering the billions of dollars invested by both giant and startup companies in developing new DNN models, we believe it is imperative to protect these inventions from being stolen. While the ownership of DNN models might be resolved by registering the models with a centralized authority, it has been recognized that such regulations alone are inadequate, and technical solutions are urgently needed to support law enforcement and juridical protection. It is this motivation that highlights the unique contribution of the proposed method in the unambiguous verification of DNN model ownership.

Methodology-wise, our empirical studies reassert that over-parameterized DNN models can successfully learn multiple tasks with arbitrarily assigned labels and/or constraints. While this assertion has been theoretically proved ConvergOverPara_Zhu18 and empirically investigated from the perspective of network generalization rethink_generalize_2016arXiv , its implications for network security in general remain to be explored. We believe the proposed modulation of DNN performance based on the presented passports will play an indispensable role in bringing DNN behaviors under control against adversarial attacks, as has been demonstrated here for DNN ownership verification.

Acknowledgement

This research is partly supported by the Fundamental Research Grant Scheme (FRGS) MoHE Grant FP021-2018A, from the Ministry of Education Malaysia. Also, we gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this research.
