Global Robustness Evaluation of Deep Neural Networks with Provable Guarantees for the L0 Norm

04/16/2018 ∙ by Wenjie Ruan, et al. ∙ University of Liverpool

Deployment of deep neural networks (DNNs) in safety or security-critical systems demands provable guarantees on their correct behaviour. One example is the robustness of image classification decisions, defined as the invariance of the classification for a given input over a small neighbourhood of images around the input. Here we focus on the L_0 norm, and study the problem of quantifying the global robustness of a trained DNN, where global robustness is defined as the expectation of the maximum safe radius over a testing dataset. We first show that the problem is NP-hard, and then propose an approach to iteratively generate lower and upper bounds on the network's robustness. The approach is anytime, i.e., it returns intermediate bounds and robustness estimates that are gradually, but strictly, improved as the computation proceeds; tensor-based, i.e., the computation is conducted over a set of inputs simultaneously, instead of one by one, to enable efficient GPU computation; and has provable guarantees, i.e., both the bounds and the robustness estimates can converge to their optimal values. Finally, we demonstrate the utility of the proposed approach in practice to compute tight bounds by applying and adapting the anytime algorithm to a set of challenging problems, including global robustness evaluation, guidance for the design of robust DNNs, competitive L_0 attacks, generation of saliency maps for model interpretability, and test generation for DNNs. We release the code of all case studies on GitHub.



1 Introduction

Deep neural networks (DNNs) have achieved significant breakthroughs in the past few years and are now being deployed in many applications. However, in safety-critical domains, where human lives are at stake, and security-critical applications, which often have significant financial risks, concerns have been raised about the reliability of this technique. In established industries, e.g., avionics and automotive, such concerns have to be addressed during the certification process before the deployment of the product. During the certification process, the manufacturer needs to demonstrate to the relevant certification authority, e.g., the European Aviation Safety Agency or the Vehicle Certification Agency, that the product behaves correctly with respect to a set of high-level requirements. For this purpose, it is necessary to develop techniques for discovering critical requirements and supporting the case that these requirements are met by the product.

Safety certification for DNNs is challenging owing to the black-box nature of DNNs and the lack of rigorous foundations. An important low-level requirement for DNNs is the robustness to input perturbations. DNNs have been shown to suffer from poor robustness because of their susceptibility to adversarial examples [29]. These are small modifications to an input, sometimes imperceptible to humans, that make the network unstable. As a result, significant effort has been directed towards approaches for crafting adversarial examples or defending against them [5, 18, 2]. However, the cited approaches provide no formal guarantees, i.e., no conclusion can be made whether adversarial examples remain or how close crafted adversarial examples are to the optimal ones.

Recent efforts in the area of automated verification [8, 9] have instead focused on methods that generate adversarial examples, if they exist, and provide rigorous robustness proofs otherwise. These techniques rely on either a layer-by-layer exhaustive search of the neighbourhood of an image [8], or a reduction to a constraint solving problem by encoding the network as a set of constraints [9]. Constraint-based approaches are limited to small networks. Exhaustive search, on the other hand, applies to large networks but suffers from the state-space explosion problem. To mitigate this, a Monte-Carlo tree search has been employed [31]. Moreover, a game-based approximate verification approach that can provide provable guarantees has been proposed [32].

This paper proposes a novel approach to quantify the robustness of DNNs that offers a balance between the guaranteed accuracy of the method (thus far a feature exclusive to formal approaches) and the efficiency of algorithms that search for adversarial examples (without providing any guarantees). We consider the global robustness problem, which is a generalisation of the local, pointwise robustness problem. Specifically, we define a maximum safe radius for every input and then evaluate robustness over a given test dataset, i.e., a finite set of inputs. Global robustness is defined as the expected maximum safe radius over the test examples. We focus on the $L_0$ norm, which measures the distance between two matrices (e.g., two input images) by counting the number of elements (e.g., pixels) that are different.

The key idea of our approach is to generate sequences of lower and upper bounds for global robustness. Our method is anytime, tensor-based, and offers provable guarantees. First, the method is anytime in the sense that it can return intermediate results, including upper and lower bounds and robustness estimates. We prove that our approach can gradually, but strictly, improve these bounds and estimates as the computation proceeds. Second, it is tensor-based. As we are working with a set of inputs, a straightforward approach is to perform robustness evaluation for the inputs individually and to then merge the results. However, this is inefficient, as the set of inputs is large. To exploit the parallelism offered by GPUs, our approach uses tensors. A tensor is a finite set of multi-dimensional arrays, and each element of the set represents one input. A good tensor-based algorithm uses tensor operations whenever possible. Third, our approach offers provable guarantees. We show that the intermediate bounds and the robustness estimates will converge to their optimal values in finite time, although this may be impractical for large networks.

We implement our approach in a tool we name L0-TRE (“Tensor-based Robustness Evaluation for the $L_0$ Norm”), and conduct experiments on a set of challenging problems, including Case Study 1: global robustness evaluation; Case Study 2: competitive $L_0$ attacks; Case Study 3: test case generation; Case Study 4: guidance for the design of robust DNN architectures; and Case Study 5: saliency map generation for model interpretability and local robustness evaluation on five ImageNet DNNs, including AlexNet, VGG-16/19 and ResNet-50/101.

All applications above require only simple adaptations of our method, e.g., slight modifications of the constraints or objective functions, or the addition of extra constraints. This demonstrates that our new technique is flexible enough to deliver a wide range of promising applications. The main contributions of this paper are as follows:

  • We propose a novel method to quantify the global robustness of DNNs w.r.t. the $L_0$-norm. This offers two key advantages: i) theoretical lower and upper bounds that guarantee convergence; and ii) explicit tensor-based parallelisation on GPUs with high computational efficiency.

  • With simple adaptations, we show the utility of the proposed method on a broad range of applications, including i) anytime global robustness evaluation; ii) competitive adversarial attacks; and iii) test case generation, etc.

  • We perform a rigorous theoretical analysis and extensive empirical case studies to support the claims above. We test our tool on 15 different deep neural networks, including eight MNIST DNNs, two CIFAR-10 DNNs and five ImageNet DNNs.

2 Problem Formulation

Let $N$ be a $K$-layer neural network such that, for a given input $x$, $N(x) = (c_1(x), \ldots, c_m(x))$ represents the confidence values for the $m$ classification labels. Specifically, we have

$$N(x) = f_K(f_{K-1}(\cdots f_1(x; W_1, b_1) \cdots; W_{K-1}, b_{K-1}); W_K, b_K) \qquad (1)$$

where $W_i$ and $b_i$ for $i = 1, \ldots, K$ are learnable parameters, and $f_i$ is the function that maps the output of layer $i-1$, i.e., $v_{i-1}$, to the input of layer $i$. Without loss of generality, we normalise the input to $x \in [0, 1]^n$. The output $N(x)$ is usually normalised to be in $[0, 1]^m$ with a softmax layer. We denote the classification label of input $x$ by $cl(N, x) = \arg\max_{l \in L} c_l(x)$. Note that both $N(\cdot)$ and $cl(N, \cdot)$ can be generalised to work with a set $T$ of inputs, i.e., $N(T)$ and $cl(N, T)$, in the standard way.

Definition 1 (Safe Norm Ball)

Given a network $N$, an input $x_0$, a distance metric $d$ and a real number $r > 0$, a norm ball $B(N, x_0, d, r)$ is a subspace of $D = [0, 1]^n$ such that

$$B(N, x_0, d, r) = \{ x \in D \mid d(x, x_0) \le r \} \qquad (2)$$

The number $r$ is called the radius of $B(N, x_0, d, r)$. A norm ball $B(N, x_0, d, r)$ is safe if for all $x \in B(N, x_0, d, r)$ we have $cl(N, x) = cl(N, x_0)$.

Intuitively, a norm ball $B(N, x_0, d, r)$ includes all inputs whose distance to $x_0$, measured by the metric $d$, is within $r$.

Definition 2 (Maximum Radius of a Safe Norm Ball)

Let $r$ be the radius of a safe norm ball $B(N, x_0, d, r)$. If for all $r' > r$ we have that $B(N, x_0, d, r')$ is not safe, then $r$ is called the maximum safe radius, denoted by $d_m(N, x_0, d)$. Formally,

$$d_m(N, x_0, d) = \max \{ r \mid B(N, x_0, d, r) \text{ is safe} \} \qquad (3)$$

We define the (global) robustness evaluation problem over a testing dataset $T$, which is a set of i.i.d. inputs sampled from a distribution $\mu$ representing the problem the network is working on. We use $|T|$ to denote the number of inputs in $T$. When $|T| = 1$, we call it local robustness.

Definition 3 (Robustness Evaluation)

Given a network $N$, a finite set $T$ of inputs, and a distance metric $d$, the robustness evaluation, denoted as $R(N, T, d)$, is an optimisation problem:

$$R(N, T, d) = \min_{T'} d(T, T') \quad \text{s.t.} \quad cl(N, x'_i) \ne cl(N, x_i) \text{ for all } i \in \{1, \ldots, |T|\} \qquad (4)$$

where $T = \{x_1, \ldots, x_{|T|}\}$, $T' = \{x'_1, \ldots, x'_{|T|}\}$, and $T'$ is homogeneous to $T$.

Intuitively, we aim to find a minimum distance between the original set $T$ and a new, homogeneous set $T'$ of inputs such that all inputs in $T'$ are misclassified. The two sets $T$ and $T'$ are homogeneous if they have the same number of elements and their corresponding elements are of the same type.

$L_0$ Norm

The distance metric $d$ can be any mapping that satisfies the metric conditions. In this paper, we focus on the $L_0$ metric. For two inputs $x_1$ and $x_2$, their distance, denoted as $d_0(x_1, x_2)$, is the number of elements on which they differ. When working with test datasets, we define

$$d_0(T, T') = \frac{1}{|T|} \sum_{i=1}^{|T|} d_0(x_i, x'_i) \qquad (5)$$

where $x'_i \in T'$ is the homogeneous input of $x_i \in T$. While other norms such as $L_1$, $L_2$ and $L_\infty$ have been widely applied for generating adversarial examples [18, 11], studies based on the $L_0$ norm are few and far between. In the Appendix, we justify why $L_0$ is the appropriate metric for our goals.
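To make the metric concrete, here is a minimal NumPy sketch (our own illustration, not the released tool code; the dataset-level mean follows the expectation reading of Equation (5)):

```python
import numpy as np

def l0_distance(x1: np.ndarray, x2: np.ndarray) -> int:
    """Number of elements (e.g., pixels) on which x1 and x2 differ."""
    return int(np.sum(x1 != x2))

def dataset_l0_distance(T: np.ndarray, T_prime: np.ndarray) -> float:
    """Mean L0 distance between corresponding (homogeneous) inputs."""
    assert T.shape == T_prime.shape
    return float(np.mean([l0_distance(x, xp) for x, xp in zip(T, T_prime)]))

# Example: two 3x3 "images" that differ in exactly two pixels.
x = np.zeros((3, 3))
x_adv = x.copy()
x_adv[0, 0] = 1.0
x_adv[2, 1] = 0.5
print(l0_distance(x, x_adv))  # 2
```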

3 Anytime Robustness Evaluation

The accurate evaluation of robustness in Definition 3 is hard in terms of $L_0$-norm distance. In Appendix 0.A.1, we give the computational complexity and prove its NP-hardness.

In this paper, we propose to compute lower and upper bounds, and then gradually, but strictly, improve the bounds so that the gap between them can eventually be closed in finite time. Although the realistic running time can be long, this anytime approach provides pragmatic means to track progress. Experimental results in Section 5 show that our approach is able to achieve tight bounds efficiently in practice.

Definition 4 (Sequences of Bounds)

Given a robustness evaluation problem $R(N, T, d)$, a sequence $\{l_i\}_{i \in \mathbb{N}}$ is an incremental lower bound sequence if, for all $i$, we have $l_i \le l_{i+1} \le R(N, T, d)$. The sequence is strict, denoted as $\{l_i\}^s_{i \in \mathbb{N}}$, if for all $i$, we have either $l_i < l_{i+1}$ or $l_i = l_{i+1} = R(N, T, d)$. Similarly, we can define a decremental upper bound sequence $\{u_i\}_{i \in \mathbb{N}}$ and a strict decremental upper bound sequence $\{u_i\}^s_{i \in \mathbb{N}}$.

We will, in Section 4, introduce our algorithms for computing these two sequences of lower and upper bounds. For now, assume they exist; then at a certain time $t$, $l_t \le R(N, T, d) \le u_t$ holds.

Definition 5 (Anytime Robustness Evaluation)

For a given range $[l_t, u_t]$, we define its centre $c_t$ and radius $r_t$ as follows:

$$c_t = \frac{1}{2}(l_t + u_t), \qquad r_t = \frac{1}{2}(u_t - l_t) \qquad (6)$$

The anytime evaluation of $R(N, T, d)$ at time $t$, denoted as $R_t(N, T, d)$, is the pair $(c_t, r_t)$.

The anytime evaluation will be returned whenever the computational procedure is interrupted. Intuitively, we use $c_t$ to represent the current estimate, and $r_t$ to represent its error bound. Essentially, we can bound the true robustness via the anytime robustness evaluation. Let $N$ be a network and $d$ a distance metric. At any time $t$, the anytime evaluation $R_t(N, T, d) = (c_t, r_t)$ satisfies

$$c_t - r_t \le R(N, T, d) \le c_t + r_t \qquad (7)$$
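As a minimal sketch (names ours), the anytime evaluation is just the centre and radius of the current bound interval:

```python
def anytime_evaluation(lower: float, upper: float) -> tuple:
    """Centre (current estimate) and radius (error bound) of [lower, upper]."""
    return (lower + upper) / 2.0, (upper - lower) / 2.0

# Example: bounds l_t = 2 and u_t = 5 at some time t.
c, r = anytime_evaluation(2.0, 5.0)
print(f"R = {c} +/- {r}")  # R = 3.5 +/- 1.5
```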

4 Tensor-based Algorithms for Upper and Lower Bounds

We present our approach to generate the sequences of bounds.

Definition 6 (Complete Set of Subspaces for an Input)

Given an input $x$ and a set $T_s$ of dimensions such that $T_s \subseteq \{1, \ldots, n\}$ and $|T_s| = t$, the subspace for $x$, denoted by $X_{T_s}(x)$, is a set of inputs $x'$ such that $x'(i) \in [0, 1]$ for $i \in T_s$ and $x'(i) = x(i)$ for $i \notin T_s$. Furthermore, given an input $x$ and a number $t$, we define

$$X_t(x) = \{ X_{T_s}(x) \mid T_s \subseteq \{1, \ldots, n\}, |T_s| = t \} \qquad (8)$$

as the complete set of subspaces for input $x$.

Intuitively, elements in $X_{T_s}(x)$ share the same value with $x$ on the dimensions other than $T_s$, and may take any legal value for the dimensions in $T_s$. Moreover, $X_t(x)$ includes all sets $X_{T_s}(x)$ for any possible combination $T_s$ with $t$ dimensions.
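For illustration, a small sketch (function name ours) that enumerates the dimension sets $T_s$ indexing the complete set of subspaces:

```python
from itertools import combinations

def complete_subspace_index_sets(n: int, t: int):
    """All t-element dimension sets T_s over {0, ..., n-1}; each set
    identifies one subspace X_Ts(x), in which exactly those dimensions
    may vary while the others stay fixed to the values of x."""
    return list(combinations(range(n), t))

# Example: a 4-dimensional input and t = 2 give C(4, 2) = 6 subspaces.
print(complete_subspace_index_sets(4, 2))
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```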

Next, we define the subspace sensitivity for a subspace w.r.t. a network $N$, an input $x$ and a test dataset $T$. Recall that $cl(N, x) = \arg\max_{l \in L} c_l(x)$.

Definition 7 (Subspace Sensitivity)

Given an input subspace $X' \subseteq D$, an input $x$ and a label $l$, the subspace sensitivity w.r.t. $X'$, $x$, and $l$ is defined as

$$S(X', x, l) = c_l(x) - \min_{x' \in X'} c_l(x') \qquad (9)$$

Let $t$ be an integer. We define the subspace sensitivity for $T$ and $t$ as

$$S(T, t) = \{ S(X', x, cl(N, x)) \mid x \in T, X' \in X_t(x) \} \qquad (10)$$

where $cl(N, x)$ is the classification label of $x$ by network $N$.

Intuitively, $S(X', x, l)$ is the maximal decrease of the confidence value of the output label $l$ that can be witnessed from the set $X'$, and $S(T, t)$ is the two-dimensional array of the maximal decreases of confidence values of the classification labels for all subspaces in $X_t(x)$ and all inputs in $T$. It is not hard to see that $S(X', x, l) \ge 0$, since $x \in X'$.

Given a test dataset $T$ and an integer $t$, the number of elements in $S(T, t)$ is in $O(|T| \cdot n^t)$, i.e., polynomial in $n$ and exponential in $t$. Note that, by Equation (9), every element in $S(T, t)$ represents an optimisation problem. E.g., for $t = 1$ and a set $T$ of 20 MNIST images with $n = 784$ pixels, this would be $15{,}680$ one-dimensional optimisation problems. In the next section, we give a tensor-based formulation and an algorithm to solve this challenging problem via GPU parallelisation.
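Before turning to the tensor formulation, the following naive sequential sketch (ours; the toy two-label softmax network and all names are illustrative) solves one such optimisation problem by grid search:

```python
import numpy as np
from itertools import product

def subspace_sensitivity(f, x, label, dims, grid=5):
    """Grid-search estimate of S(X', x, l): the maximal decrease of the
    confidence of `label` when only the dimensions in `dims` vary over a
    uniform grid in [0, 1]. f maps an input vector to label confidences."""
    base_conf = f(x)[label]
    min_conf = base_conf
    for values in product(np.linspace(0.0, 1.0, grid), repeat=len(dims)):
        x_pert = x.copy()
        x_pert[list(dims)] = values
        min_conf = min(min_conf, f(x_pert)[label])
    return base_conf - min_conf

# Toy "network": softmax over two linear scores of a 4-dimensional input.
W = np.array([[1.0, -1.0, 0.5, 0.0], [-1.0, 1.0, 0.0, 0.5]])
def f(x):
    z = W @ x
    e = np.exp(z - z.max())
    return e / e.sum()

x0 = np.full(4, 0.5)
print(subspace_sensitivity(f, x0, label=int(np.argmax(f(x0))), dims=(0, 1)))
```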

4.1 Tensor-based Parallelisation for Computing Subspace Sensitivity

A tensor in an $n$-dimensional space is a mathematical object that has $n^r$ components (where $r$ is its rank) and obeys certain transformation rules. Intuitively, tensors are generalisations of vectors (i.e., one index) and matrices (i.e., two indices) to an arbitrary number of indices. Many state-of-the-art deep learning libraries, such as TensorFlow and Keras, use the tensor format to parallelise the computation on GPUs. However, it is nontrivial to write an algorithm working with tensors due to the limited set of operations on tensors.

The basic idea of our algorithm is to transform the set of nonlinear, non-convex optimisation problems given in Equation (10) into a tensor formulation, and to solve this set of optimisation problems with a few DNN queries. First, we introduce the tensor operations used in our algorithm.

Definition 8 (Mode-$n$ Unfolding and Folding)

Given a tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, the mode-$n$ unfolding of $\mathcal{A}$ is a matrix $\mathbf{A}_{[n]} \in \mathbb{R}^{I_n \times (I_1 \cdots I_{n-1} I_{n+1} \cdots I_N)}$ defined by the mapping from element $(i_1, \ldots, i_N)$ to $(i_n, j)$, with

$$j = 1 + \sum_{\substack{k=1 \\ k \ne n}}^{N} (i_k - 1) \prod_{\substack{m=1 \\ m \ne n}}^{k-1} I_m$$

Accordingly, the tensor folding folds an unfolded tensor back from a matrix to a full tensor. Tensor unfolding and folding are dual operations that link tensors and matrices.
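A minimal NumPy sketch of the two operations (our column-ordering convention may differ from the index mapping above, but the duality is preserved):

```python
import numpy as np

def unfold(tensor: np.ndarray, mode: int) -> np.ndarray:
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix: np.ndarray, mode: int, shape: tuple) -> np.ndarray:
    """Inverse of `unfold`: restore the full tensor of the given shape."""
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape([shape[mode]] + rest), 0, mode)

A = np.arange(24).reshape(2, 3, 4)
assert np.array_equal(fold(unfold(A, 1), 1, A.shape), A)  # duality
```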

Given a neural network $N$, a number $t$ and a test dataset $T$, each $x_i \in T$ generates a complete set of subspaces $X_t(x_i)$. Let $m = |T|$ and $p = |X_t(x_i)|$. Note that for different $x_i$ and $x_j$, we have $|X_t(x_i)| = |X_t(x_j)|$. Given an error tolerance $\epsilon$, by applying grid search, we can recursively sample $g = \lceil 1/\epsilon \rceil$ numbers in each dimension, and turn each subspace $X' \in X_t(x_i)$ into a grid of $g^t$ sampled inputs. We can formulate the following tensor, which stacks the sampled grids of all subspaces of all inputs in $T$:

$$\mathcal{T} \in \mathbb{R}^{g^t \times (m \cdot p) \times n} \qquad (11)$$

In Sec. 4.3, we show that grid search provides the guarantee of reaching the global minimum by utilising the Lipschitz continuity of DNNs.

Then, we apply the mode-1 tensor unfolding operation to obtain $\mathbf{T}_{[1]} = \text{unfold}_1(\mathcal{T})$, whose rows are the $g^t \cdot m \cdot p$ sampled inputs. This matrix can be fed into the DNN to obtain

$$\mathbf{Y} = N(\mathbf{T}_{[1]}) \qquad (12)$$

After computing $\mathbf{Y}$, we apply a tensor folding operation to obtain

$$\mathcal{Y} = \text{fold}_1(\mathbf{Y}) \qquad (13)$$

Here, we should note the difference between $\mathbf{Y}$ and $\mathcal{Y}$, with the former being a one-dimensional array and the latter a tensor. On $\mathcal{Y}$, we search the minimum values along the first dimension to obtain

$$\mathcal{V} = \min(\mathcal{Y}, [\,], 1) \qquad (14)$$

(Here we use the Matlab notation $\min(\mathcal{Y}, [\,], k)$, which computes the minimum values over the $k$-th dimension of a multi-dimensional array $\mathcal{Y}$. Other notation to appear later is similar.)

Thus, we have now solved all optimisation problems. We then construct the tensor

$$\mathcal{S}(T, t) \qquad (15)$$

from the set of minima $\mathcal{V}$. Recall that $p = |X_t(x)|$. Intuitively, $\mathcal{S}(T, t)$ is the tensor that contains the starting points of the optimisation problems and the resulting optimal values. The following theorem shows the correctness of our computation, where $S(T, t)$ has been defined in Definition 7.

Theorem 1

Let $T$ be a test dataset and $t$ an integer. We have $\mathcal{S}(T, t) = S(T, t)$.

To perform the computation above, we only need a single DNN query in Equation (12).
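The following sketch (ours; names and the toy batched classifier are illustrative) mirrors this pipeline for a single input: all grid points of all $t$-dimensional subspaces are stacked into one matrix, classified with a single batched query, folded back, and reduced with a minimum along the grid axis:

```python
import numpy as np
from itertools import combinations, product

def batched_subspace_sensitivities(f_batch, x, t, grid=5):
    """f_batch maps an (m, n) batch of inputs to an (m, labels) array of
    confidences. Returns one sensitivity value per t-dimensional subspace,
    computed with a single DNN query over all grid candidates."""
    n = x.size
    subspaces = list(combinations(range(n), t))
    pts = list(product(np.linspace(0.0, 1.0, grid), repeat=t))
    cand = np.tile(x, (len(subspaces) * len(pts), 1))  # "unfolded" batch
    for i, dims in enumerate(subspaces):
        for j, vals in enumerate(pts):
            cand[i * len(pts) + j, list(dims)] = vals
    label = int(np.argmax(f_batch(x[None, :])[0]))
    confs = f_batch(cand)[:, label]                    # single DNN query
    confs = confs.reshape(len(subspaces), len(pts))    # fold back
    return f_batch(x[None, :])[0, label] - confs.min(axis=1)

# Toy batched "network" and usage:
W = np.array([[1.0, -1.0, 0.5, 0.0], [-1.0, 1.0, 0.0, 0.5]])
def f_batch(X):
    Z = X @ W.T
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

print(batched_subspace_sensitivities(f_batch, np.full(4, 0.5), t=1))
```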

4.2 Tensor-based Parallelisation for Computing Lower and Upper Bounds

Let $\mathcal{S}^*(T, t)$ be the tensor obtained by replacing every element in $\mathcal{S}(T, t)$ with the corresponding input that, according to the computation of $\mathcal{V}$, causes the largest decrease of the confidence value of the classification label. We call $\mathcal{S}^*(T, t)$ the solution tensor of $\mathcal{S}(T, t)$. The computation of $\mathcal{S}^*(T, t)$ can be done using very few tensor operations over $\mathcal{T}$ and $\mathcal{Y}$, which have been given in Section 4.1. We omit the details.

Lower Bounds

We reorder $\mathcal{S}(T, t)$ and $\mathcal{S}^*(T, t)$ w.r.t. the confidence decreases recorded in $\mathcal{S}(T, t)$. Then, we retrieve the first row of the third dimension in the tensor $\mathcal{S}^*(T, t)$, i.e., the perturbed input with the largest confidence decrease for each $x \in T$, and check whether its classification label has changed, i.e., whether $cl(N, x') \ne cl(N, x)$. The result is an array of Boolean values, each of which is associated with an input $x \in T$. If the element associated with $x$ in the resulting array is True, we conclude that $R(N, \{x\}, d_0) = t$, i.e., the maximum safe radius has been obtained and the computation for $x$ has converged. On the other hand, if the element associated with $x$ is False, we update the lower bound for $x$ to $t$. After computing $\mathcal{S}^*(T, t)$, no further DNN query is needed to compute the lower bounds.
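A sketch of the resulting lower-bound loop for one input (the helper worst_case_input is an assumption of this sketch, standing in for the tensor computation of Section 4.1):

```python
import numpy as np

def anytime_lower_bound(f, worst_case_input, x, t_max):
    """f maps an input to label confidences; worst_case_input(x, t) is
    assumed to return the t-pixel grid perturbation of x with the largest
    decrease of the confidence of x's label. Returns (bound, converged):
    if converged, `bound` is the exact distance R(N, {x}, d0); otherwise
    it is a lower bound on that distance."""
    label = int(np.argmax(f(x)))
    bound = 0
    for t in range(1, t_max + 1):
        x_worst = worst_case_input(x, t)
        if int(np.argmax(f(x_worst))) != label:
            # A t-pixel adversarial example exists, and none exists with
            # fewer changed pixels, so the computation for x has converged.
            return t, True
        bound = t  # no t-pixel perturbation misclassifies: safe up to t
    return bound, False
```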

Upper Bounds

The upper bounds are computed by iteratively applying perturbations based on the solution tensor $\mathcal{S}^*(T, t)$ for every input in $T$ until a misclassification occurs. However, doing this sequentially for all inputs would be inefficient, since we would need to query the network after every perturbation on each image.

We present an efficient tensor-based algorithm, which enables GPU parallelisation. The key idea is to construct a new tensor $\mathcal{U}$ to maintain all the accumulated perturbations over the original inputs in $T$.

  • Initialisation: $\mathcal{U}_0 = T$.

  • Iteratively construct the $i$-th row until every input in it is misclassified:

    $$\mathcal{U}_i = (\mathcal{U}_{i-1} \ominus (\mathcal{U}_{i-1} \odot \mathcal{S}^*_i)) \oplus \mathcal{S}^*_i \qquad (16)$$

where $\ominus$, $\odot$, and $\oplus$ are tensor operations: $A \ominus B$ removes the corresponding non-zero elements of $B$ from $A$; further, $A \odot B$ retains those elements that have the same values in $A$ and $B$ and sets the other elements to 0; finally, $A \oplus B$ merges the non-zero elements from the two tensors. The two operands of these operations are required to have the same type. Intuitively, $\mathcal{U}_i$ represents the result of applying the first $i$ perturbations recorded in $\mathcal{S}^*(T, t)$.

Subsequently, we unfold $\mathcal{U}$ and pass the result to the DNN $N$, which yields the classification labels $\mathbf{L}$. After that, a tensor folding operation is applied to obtain $\mathcal{L}$. Finally, for each input $x \in T$ we compute the minimum column index $j_x$ along its row such that misclassification happens, i.e., $cl(N, \mathcal{U}_{j_x}(x)) \ne cl(N, x)$. Then we let

$$T' = \{ \mathcal{U}_{j_x}(x) \mid x \in T \} \qquad (17)$$

which is the optimal set of inputs as required in Definition 3.

After computing $\mathcal{S}^*(T, t)$, we only need one further DNN query to obtain all upper bounds for a given test dataset $T$.
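The logic per input can be sketched sequentially as follows (ours; the paper's version evaluates all accumulation stages with one batched DNN query, whereas this loop queries per stage to keep the idea explicit):

```python
import numpy as np

def upper_bound_by_accumulation(f, x, ranked_perturbations):
    """ranked_perturbations: (dims, values) pairs ordered by decreasing
    subspace sensitivity. Perturbations are accumulated onto x until the
    classification flips; the number of changed pixels then upper-bounds
    the distance to the nearest adversarial example."""
    label = int(np.argmax(f(x)))
    x_acc = x.copy()
    for dims, values in ranked_perturbations:
        x_acc[list(dims)] = values
        if int(np.argmax(f(x_acc))) != label:
            return int(np.sum(x_acc != x)), x_acc  # upper bound, adversarial
    return None, None  # no misclassification triggered by these perturbations

# Toy usage with a linear two-label softmax network on 4-dim inputs:
W = np.array([[1.0, -1.0, 0.5, 0.0], [-1.0, 1.0, 0.0, 0.5]])
def f(x):
    z = W @ x
    e = np.exp(z - z.max())
    return e / e.sum()

ub, adv = upper_bound_by_accumulation(
    f, np.full(4, 0.5), [((0,), (0.0,)), ((1,), (1.0,))])
print(ub)  # 1 (the first ranked perturbation already flips the label)
```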

Tightening the Upper Bounds

There may be redundancies in $T'$, i.e., not all the changes in $T'$ are necessary to observe a misclassification. We therefore reduce the redundancies and thereby tighten the upper bounds. We reduce the tightening problem to an optimisation problem similar to that of Definition 3, which enables us to reuse the tensor-based algorithms given above.

Assume that $x_i$ and $x'_i$ are two corresponding inputs in $T$ and $T'$, respectively, for $i \in \{1, \ldots, |T|\}$. By abuse of notation, we let $\Delta x_i$ be the part of $x'_i$ on which $x_i$ and $x'_i$ differ, and $\bar{x}_i$ be the part on which they agree. Therefore, $x'_i = \bar{x}_i \oplus \Delta x_i$.

Definition 9 (Tightening the Upper Bounds)

Given a network $N$, a finite test dataset $T$ with upper bounds $T'$, and a distance metric $d$, the tightening problem is an optimisation problem:

$$\min_{T''} d(T, T'') \quad \text{s.t.} \quad cl(N, \bar{x}_i \oplus \Delta x''_i) \ne cl(N, x_i) \text{ for all } i \in \{1, \ldots, |T|\} \qquad (18)$$

where $T'' = \{x''_1, \ldots, x''_{|T|}\}$ with $x''_i = \bar{x}_i \oplus \Delta x''_i$, and $\Delta x''_i$, $\Delta x_i$ have the same shape.

To solve this optimisation problem, we can re-use the tensor-based algorithm for computing lower bounds with a minor modification to the DNN query: before querying the DNN, we apply the $\oplus$ operation to merge $\Delta x''_i$ with $\bar{x}_i$, as suggested by Equation (18).
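One simple greedy realisation of the tightening step for a single input (our sketch, assuming flattened inputs; the paper instead re-uses the tensorised lower-bound machinery):

```python
import numpy as np

def tighten_upper_bound(f, x, x_adv):
    """Try to revert each changed pixel of the adversarial input x_adv back
    to its original value in x, keeping a reversion whenever the input
    remains misclassified. The result changes no more pixels than x_adv
    and is still adversarial, so its L0 distance is a tighter upper bound."""
    label = int(np.argmax(f(x)))
    x_tight = x_adv.copy()
    for i in np.flatnonzero(x_tight != x):
        saved = x_tight[i]
        x_tight[i] = x[i]                      # revert this pixel
        if int(np.argmax(f(x_tight))) == label:
            x_tight[i] = saved                 # change was necessary: restore
    return x_tight
```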

4.3 Convergence

We perform a convergence analysis of the proposed method. For simplicity, in the proofs we consider the case of a single input $x$; the convergence guarantee extends easily to a finite set. We first show that grid search is guaranteed to find the global optimum up to a certain error bound, based on the assumption that the neural network satisfies the Lipschitz condition, as proved in [31, 19].

Theorem 2 (Guarantee of the global minimum of grid search)

Assume a neural network $N$ is Lipschitz continuous w.r.t. a norm metric $\|\cdot\|$ with Lipschitz constant $K$. By recursively sampling numbers with interval $\Delta$ in each dimension, the following relation holds:

$$|y_{\min} - \hat{y}_{\min}| \le K \cdot \left\| \frac{\Delta}{2} \cdot \mathbf{1} \right\|$$

where $y_{\min}$ represents the global minimum value, $\hat{y}_{\min}$ denotes the minimum value returned by grid search, and $\mathbf{1}$ is an all-ones matrix.

Proof 1

Based on the Lipschitz continuity assumption on $N$, we have $|y(x_1) - y(x_2)| \le K \cdot \|x_1 - x_2\|$ for all inputs $x_1, x_2$. The grid search guarantees that, for every input $x$, there is a sampled point $\hat{x}$ such that $\|x - \hat{x}\| \le \|\frac{\Delta}{2} \cdot \mathbf{1}\|$. Thus the theorem holds, as we can always find such a sampled point $\hat{x}$ for the global minimiser $x^*$, and then $\hat{y}_{\min} \le y(\hat{x}) \le y_{\min} + K \cdot \|\frac{\Delta}{2} \cdot \mathbf{1}\|$.
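In practice, Theorem 2 tells us how fine the grid must be. A sketch of the sample-count calculation, under the assumption of an $L_2$ Lipschitz constant (the constant and tolerance values below are illustrative):

```python
import math

def grid_points_per_dimension(K: float, error: float, t: int) -> int:
    """Choose the per-dimension sampling interval Delta so that the grid
    minimum over a t-dimensional subspace is within `error` of the global
    minimum. Every point of [0,1]^t lies within L2 distance
    (Delta/2) * sqrt(t) of a grid point, so it suffices that
    K * (Delta/2) * sqrt(t) <= error."""
    delta = 2.0 * error / (K * math.sqrt(t))
    return max(2, math.ceil(1.0 / delta) + 1)

# Example: Lipschitz constant K = 10, tolerance 0.1, one-dimensional subspaces:
print(grid_points_per_dimension(10.0, 0.1, 1))  # 51 points, spacing 0.02
```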

As shown in Sec. 4.2, in each iteration, we apply the grid search to verify the safety of the DNNs (meaning that we preclude adversarial examples) given a lower bound. In combination with Theorem 2, we arrive at the following, which shows the safety guarantee for the lower bounds.

Theorem 3 (Guarantee for Lower Bounds)

Let $N$ denote a DNN and let $x$ be an input. If our method generates a lower bound $d_{lb}$, then $cl(N, x') = cl(N, x)$ for all $x'$ such that $d_0(x, x') \le d_{lb}$. I.e., $x$ is guaranteed to be safe for any pixel perturbation that changes at most $d_{lb}$ pixels.

Theorem 3 (proof in Appendix 0.A.2) shows that the lower bounds generated by our algorithm are indeed lower bounds of the maximum safe radius. We gradually increase $t$ and re-run the lower bound generation algorithm. Because the number of dimensions of an input is finite, the distance to an adversarial example is also finite. Therefore, the lower bound generation algorithm converges eventually.

Theorem 4 (Guarantee for Upper Bounds)

Let $N$ denote a DNN and $x$ denote an input. Let $u_t$ be the upper bound generated by our algorithm at iteration $t$, for any $t \ge 1$. Then we have $u_{t+1} \le u_t$ for all $t$, and $u_t \ge R(N, \{x\}, d_0)$.

The three key ingredients to show that the upper bounds decrease monotonically are: i) every subspace generated at $t$ is contained in some subspace generated at $t + 1$; ii) a pixel perturbation from a subspace with higher priority always results in a larger confidence decrease than one from a subspace with lower priority; and iii) the tightening strategy is able to exclude the redundant pixel perturbations. The details of the proof of Theorem 4 are in Appendix 0.A.3. Finally, we can show that the radius $r_t$ of the anytime evaluation $R_t(N, T, d)$ converges to 0 deterministically (see Appendix 0.A.4).

5 Experimental Results

We report experimental evidence for the utility of our algorithm. Some experiments require simple modifications of the optimisation problem given in Definition 3, e.g., small changes to the constraints. No significant modification to our algorithm is needed to process these variants. In this section, we use five case studies to demonstrate the broad applicability of our tool (Case Studies Four and Five are presented in the Appendix).

5.1 Case Study One: Convergence Analysis and Global Robustness Evaluation

Figure 1: (a) Convergence of lower bound, upper bound, and estimate of the maximum safe radius for one image; (b) convergence of lower bound, upper bound, and estimate of global robustness; (c) boxplots of the computation time per iteration.

Figure 2: (a) sDNN: upper bounds, lower bounds, and estimates of the maximum safe radius for all sampled images, ordered from top to bottom; (b) DNN-0: upper bounds, lower bounds, and estimates for all sampled input images.

We study the convergence and running time of our anytime global robustness evaluation algorithm on several DNNs in terms of the $L_0$-norm. To the best of our knowledge, no baseline method exists for this case study. The $L_0$-norm based algorithms, which we compare against in Section 5.2, cannot perform robustness evaluation based on both lower and upper bounds with provable guarantees.

We train two DNNs on the MNIST dataset, with DNN-0 being trained on the original images of size $28 \times 28$ and sDNN on downscaled images. The model structures are given in Appendix 0.F. For DNN-0, we work with a set of 2,400 images randomly sampled from the dataset, and for sDNN, we work with a set of 5,300 images.

5.1.1 sDNN: Speed of Convergence and Robustness Evaluation

Fig. 1 (a) illustrates the speed of convergence of the lower and upper bounds as well as the estimate of the maximum safe radius for an image with a large initial upper bound, at distance 27. This image is chosen to demonstrate the worst case for our approach. Working with a single image (i.e., local robustness) is the special case of our optimisation problem where $|T| = 1$. We observe that, when transitioning from $t = 1$ to $t = 2$, the uncertainty radius of the estimate is significantly reduced from 26 to 1, which demonstrates the effectiveness of our upper bound algorithm. Fig. 1 (b) illustrates the speed of convergence of the global robustness evaluation on the testing dataset: our method obtains tight upper and lower bounds efficiently and converges quickly, and the relative error of the global robustness estimate at $t = 1$ is already small. The estimate at $t = 1$ can be obtained in polynomial time, and thus, our experimental results suggest that our approach provides a good approximation for this challenging NP-hard problem with reasonable error at very low computational cost. Fig. 1 (c) gives the boxplots of the computation time required for individual iterations (i.e., subspace dimensions $t$). We remark that at $t = 1$ the time to process one image is low enough to suggest that the algorithm has potential for real-time applications.

In Fig. 2 (a), we plot the upper and lower bounds as well as the estimate of the maximum safe radius for all images in the testing dataset. The images are ordered by their upper bounds. The dashed blue line indicates that all images left of this line have converged. The charts show a clear overall trend: our algorithm converges for most images after a very small number of iterations.

Figure 3: Robustness evaluation of DNN-0 and box-plots of computation time

Figure 4: Means and standard deviations of the adversarial $L_0$ distance

Figure 5: Means and standard deviations of the computational time of all methods

Figure 6: Means and standard deviations of the computational time on CPU and GPU for 1,000 MNIST and CIFAR-10 images

5.1.2 DNN-0: Global Robustness Evaluation

Fig. 2 (b) illustrates the overall convergence trends for all 2,400 images for our larger DNN. We observe that even for a DNN with tens of thousands of hidden neurons, L0-TRE achieves tight estimates of the maximum safe radius for most images. Fig. 3 gives the results of the anytime global robustness evaluation for DNN-0. The results show the feasibility and efficiency of our approach for anytime global robustness evaluation of safety-critical systems. Fig. 12 in Appendix 0.F features a selection of the ground-truth adversarial images (i.e., images that lie at the boundary of a safe norm ball, a notion first proposed in [1]) returned by our upper bound algorithm.

5.2 Case Study Two: $L_0$ Attacks

While the generation of attacks is not the primary goal of our method, we observe that our upper bound generation method is highly competitive with state-of-the-art methods for the computation of adversarial images. We train MNIST and CIFAR-10 DNNs and compare with JSMA [18], C&W [2], DLV [8], SafeCV [31] and DeepGame [32], on 1,000 testing images. Details of the experimental settings are given in Appendix 0.G.

5.2.1 Adversarial Distance

Fig. 4 depicts the means and standard deviations of the $L_0$ distances of the adversarial images produced by the five methods. A smaller distance indicates an adversarial example closer to the original image. For MNIST, the performance of our method is better than JSMA, DLV, and SafeCV, and comparable to C&W and DeepGame. For CIFAR-10, the bar chart reveals that our L0-TRE achieves the smallest distance (modifying 2.62 pixels on average) among all competitors. For this experiment, we stop at $t = 1$ without performing further iterations.

5.2.2 Computational Cost

Fig. 5 (note the log scale) gives the runtimes. Our tensor-based parallelisation method delivers extremely efficient attacks: for MNIST, our method is substantially faster than JSMA, C&W, DLV, and SafeCV. Figure 6 shows that the tensor-based parallelisation (CPU-1: TensorFlow (Python) on an i5-4690S CPU; GPU-1: TensorFlow (Python) with parallelisation on an NVIDIA GTX TITAN GPU; CPU-2: Deep Learning Toolbox (Matlab 2018b) on an i7-7700HQ CPU; GPU-2: Deep Learning Toolbox (Matlab 2018b) with parallelisation on an NVIDIA GTX-1050Ti GPU) significantly improves the computational efficiency, with speedups of 38 times on the MNIST DNN and 93 times on the CIFAR-10 DNN. Appendix 0.G compares some of the adversarial examples found by the five methods. The examples illustrate that the modification of one to three pixels suffices to trigger a misclassification even in well-trained neural networks.

5.3 Case Study Three: Test Case Generation for DNNs

             L0-TRE (%)   DeepConcolic [28] (%)   DeepXplore [20] (%)
                                                  light   occlusion   blackout
  MNIST        98.95            97.60             80.77     82.68      81.61
  CIFAR-10     98.63            84.98             77.56     81.48      83.25

Table 1: Neuron coverage achieved by L0-TRE, DeepConcolic and DeepXplore
Figure 7: Some adversarial examples found by our tool while generating test cases for high neuron coverage on MNIST and CIFAR-10 DNNs

A variety of methods to automate the testing of DNNs have been proposed recently [20, 27, 28]. The most widely used metric for the exhaustiveness of test suites for DNNs is neuron coverage [20]. Neuron coverage quantifies the percentage of hidden neurons in the network that are activated at least once. We use $m$ to range over hidden neurons, and $v(m, x)$ to denote the activation value of $m$ for test input $x$. Then $v(m, x) > 0$ implies that $m$ is covered by the test input $x$.
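As a small sketch (ours), neuron coverage can be computed from a matrix of recorded activations:

```python
import numpy as np

def neuron_coverage(activations: np.ndarray) -> float:
    """Fraction of hidden neurons with activation value > 0 for at least
    one test input; `activations` has shape (num_tests, num_neurons),
    e.g., concatenated hidden-layer outputs over the whole test suite."""
    return float((activations > 0).any(axis=0).mean())

# Example: 3 test inputs, 4 neurons; neuron 3 is never activated.
acts = np.array([[0.2, 0.0, 0.0, 0.0],
                 [0.0, 0.9, 0.0, 0.0],
                 [0.1, 0.0, 0.4, 0.0]])
print(neuron_coverage(acts))  # 0.75
```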

The application of our algorithm to coverage-driven test case generation is straightforward; it only requires a minor modification to the optimisation problem in Definition 3. Given any neuron $m$ that is not activated by the test suite $T$, we find the input with the smallest distance to an input in $T$ that activates $m$. We replace the constraint in Equation (4) with

$$v(m, x'_i) > 0 \qquad (19)$$

The optimisation problem now searches for new inputs that activate the neuron $m$, and the objective is to minimise the distance from the current set of test inputs $T$.

We compare our tool with other state-of-the-art test case generation methods, including DeepConcolic [28] (our optimisation algorithm has also been adopted in the testing tool DeepConcolic, see https://github.com/TrustAI/DeepConcolic) and DeepXplore [20]. All results are averaged over 10 runs or more. Table 1 gives the neuron coverage obtained by the three tools. We observe that L0-TRE yields much higher neuron coverage than both DeepConcolic and DeepXplore in any of its three modes of operation (‘light’, ‘occlusion’, and ‘blackout’). Fig. 7 depicts adversarial examples found by our tool during test case generation. We also observe that a significant portion of the adversarial examples can be found within a relatively small $L_0$ distance. More experimental results can be found in Appendix 0.H. Overall, our tool offers an efficient approach to coverage-driven testing of DNNs.

Moreover, our tool can be used to guide the design of robust DNN architectures, as shown in Case Study Four (see Appendix 0.D). In Case Study Five, we show that L0-TRE can also generate saliency maps for model interpretability and is capable of evaluating local robustness for large-scale, state-of-the-art ImageNet DNN models including VGG-16/19, ResNet-50/101 and AlexNet (see Appendix 0.E).

6 Related Work

6.1 Generation of Adversarial Examples

Existing algorithms compute an upper bound of the maximum safe radius. However, they cannot guarantee to reach the maximum safe radius, while our method is able to produce both lower and upper bounds that provably converge to it. Most existing algorithms first compute a gradient (either a cost gradient or a forward gradient) and then perturb the input in different ways along the most promising direction on that gradient. FGSM (Fast Gradient Sign Method) [5] is for the $L_\infty$ norm. It computes the gradient $\nabla_x J(\theta, x, y)$ of the cost function w.r.t. the input. JSMA (Jacobian Saliency Map based Attack) [18] is for the $L_0$ norm. It calculates the Jacobian matrix of the output of a DNN (in the logit layer) with respect to the input. Then it iteratively modifies one or two pixels until a misclassification occurs. The C&W attack (Carlini and Wagner) [2] works for the $L_0$, $L_2$ and $L_\infty$ norms. It formulates the search for an adversarial example as an image distance minimisation problem. The basic idea is to introduce a new optimisation variable to avoid box constraints (image pixels need to lie within $[0, 1]$). DeepFool [16] works for the $L_2$ norm. It iteratively linearises the network around the input and moves across the decision boundary by a minimal step until reaching a misclassification. VAT (Virtual Adversarial Training) [15] defines a KL-divergence at an input based on the model's robustness to the local perturbation of the input, and then perturbs the input according to this KL-divergence. We focus on the $L_0$ norm. We have shown experimentally that for this norm, our approach dominates all existing approaches: we obtain tighter upper bounds at lower computational cost.

6.2 Safety Verification and Reachability Analysis

These approaches aim to not only find an upper bound but also provide guarantees on the obtained bound. There are two ways of achieving safety verification for DNNs. The first is to reduce the problem to a constraint solving problem; notable works include, e.g., [21, 9]. However, these can only work with small networks that have hundreds of hidden neurons. The second is to discretise the vector spaces of the input or hidden layers, and then apply exhaustive search algorithms or Monte-Carlo tree search on the discretised spaces. The guarantees are achieved by establishing local assumptions, such as minimality of manipulations in [8] and a minimum confidence gap for Lipschitz networks in [31, 32]. Moreover, [12] considers determining whether an output value of a DNN is reachable from a given input subspace, and reduces the problem to a MILP problem; and [3] considers the range of output values from a given input subspace. Both approaches can only work with small networks. We also mention [19], which computes a lower bound of local robustness by propagating relations between layers backward from the output. It is incomparable with ours because of the different distance metrics. The bound is loose and cannot be improved (i.e., there is no convergence). Recently, some researchers have used abstract interpretation to verify the correctness of DNNs [4, 14]. Its basic idea is to use abstract domains (represented as, e.g., boxes, zonotopes, or polyhedra) to over-approximate the computation of a set of inputs. In recent work [6], the input vector space is partitioned using clustering, and then the method of [9] is used to check the individual partitions. DeepGO [22, 23] shows that most known layers of DNNs are Lipschitz continuous and presents a verification approach based on global optimisation.

However, none of the verification tools above can handle the $L_0$-norm distance while providing anytime and guaranteed convergence to the true global robustness. Thus, the proposed tool, L0-TRE, is complementary to existing research on the safety verification of DNNs.

7 Conclusions

In this paper, to evaluate the global robustness of a DNN over a testing dataset, we present an approach to iteratively generate its lower and upper bounds. We show that the bounds are gradually, and strictly, improved and eventually converge to the optimal value. The method is anytime, tensor-based, and offers provable guarantees. We conduct experiments on a set of challenging problems to validate our approach.

References

Appendix 0.A Appendix: Proofs of Theorems

0.A.1 Proof of NP-hardness

Theorem 5

Let $N$ be a neural network whose input is normalised into $[0, 1]^n$. When $d$ is the $L_0$ norm, computing $R(N, T, d)$ is NP-hard, and there exists a deterministic algorithm that computes $R(N, T, d)$ in time $O((1 + 1/\epsilon)^n)$ in the worst case, where $\epsilon$ is the error tolerance for each dimension.

Proof 2

Here we consider the worst-case scenario and use a straightforward grid search to establish the time complexity. In the worst case, the maximum radius of a safe $L_0$-norm ball for the DNN is $n$. A grid search with grid size $\epsilon$ starts from $t = 1$ and verifies, for each $t$, whether $t$ is the radius of the maximum safe $L_0$-norm ball; this requires the following running time in terms of the number of DNN evaluations:

$$\sum_{t=1}^{n} \binom{n}{t} \left(\frac{1}{\epsilon}\right)^t = \left(1 + \frac{1}{\epsilon}\right)^n - 1 \qquad (20)$$

From the above proof, we get the following remark.

Remark 1

Computing $R(N, T, d)$ is a more challenging problem for $d = L_0$, since it requires a higher computational complexity than for $L_2$ or $L_\infty$. Namely, grid search only requires $(1/\epsilon)^n$ evaluations of the DNN to estimate $R(N, T, L_2)$ or $R(N, T, L_\infty)$ given the same error tolerance $\epsilon$.

0.A.2 Proof of Theorem: Guarantee of Lower Bounds

Proof 3

Our proof proceeds by contradiction. Let $d_{lb}$ be the lower bound generated by our method. Assume that there is an adversarial example $x'$ such that $d_0(x, x') = t' \le d_{lb}$, where $t'$ represents the number of perturbed pixels. By the definition of adversarial examples, there exists a subspace $X_{T'}(x)$ with $|T'| = t'$ such that $x' \in X_{T'}(x)$. Since $t' \le d_{lb}$, we can find a subspace $X_{T''}(x)$ with $|T''| = d_{lb}$ such that $X_{T'}(x) \subseteq X_{T''}(x)$. Thus we have $x' \in X_{T''}(x)$. Moreover, by Theorem 2, the grid search over $X_{T''}(x)$ would have witnessed the misclassification caused by $x'$, since $x' \in X_{T''}(x)$. However, this conflicts with the fact that no subspace of $d_{lb}$ dimensions witnesses a misclassification, which is established by the algorithm for computing lower bounds in Section 4.2.

0.A.3 Proof of Theorem: Guarantee of Upper Bounds

Proof 4 (Monotonic Decrease Property of Upper Bounds)

We use mathematical induction to prove that the upper bounds monotonically decrease.

Base case ($t = 1$): Based on the algorithm in the Upper Bounds part of Section 4.2, we assume that, after $k$ subspace perturbations, we find an adversarial example $x_k$ such that $cl(N, x_k) \ne cl(N, x)$.

We know that, at $t = 1$, based on the algorithm, we get $S(T, 1)$ and $\mathcal{S}^*(T, 1)$, the ordered subspace sensitivities and their corresponding subspaces. Assume that the ordered subspace list is $X_1, X_2, \ldots, X_k$. Then, from the assumption, we have $cl(N, x_k) \ne cl(N, x)$, where $x_k$ denotes the input of the neural network corresponding to subspace $X_k$.

Then, at $t = 2$, according to the algorithm, we calculate $S(T, 2)$ and $\mathcal{S}^*(T, 2)$. Similarly, we assume the ordered subspace list is $X'_1, X'_2, \ldots$. Thus we can find a subspace $X'_j$ in this list such that $X_k \subseteq X'_j$. As a result, we know that $S(X'_j, x, l) \ge S(X_k, x, l)$, and thus the perturbation from $X'_j$ also causes a misclassification. After exhaustive tightening, we can at least find its subset