Fast Geometric Projections for Local Robustness Certification

02/12/2020
by Aymeric Fromherz, et al.

Local robustness ensures that a model classifies all inputs within an ϵ-ball consistently, which precludes various forms of adversarial inputs. In this paper, we present a fast procedure for checking local robustness in feed-forward neural networks with piecewise linear activation functions. The key insight is that such networks partition the input space into a polyhedral complex such that the network is linear inside each polyhedral region; hence, a systematic search for decision boundaries within the regions around a given input is sufficient for assessing robustness. Crucially, we show how these regions can be analyzed using geometric projections instead of expensive constraint solving, thus admitting an efficient, highly-parallel GPU implementation at the price of incompleteness, which can be addressed by falling back on prior approaches. Empirically, we find that incompleteness is not often an issue, and that our method performs one to two orders of magnitude faster than existing robustness-certification techniques based on constraint solving.



1 Introduction

We consider the problem of verifying the local robustness of piecewise-linear neural networks for a given bound. Precisely, given a point, x, a network, F, and a norm bound, ϵ, this entails determining whether Equation 1 holds.

∀x′ . ‖x′ − x‖ ≤ ϵ ⟹ F(x′) = F(x)     (1)

This problem carries practical significance, as such networks have been extensively shown to be vulnerable to adversarial examples (Szegedy et al., 2014; Papernot et al., 2016), wherein small-norm perturbations are chosen to cause arbitrary misclassifications.

Numerous solutions have been proposed to address variants of this problem. These can be roughly categorized into three groups: learning rules that aim for robustness on known training data (Madry et al., 2018; Wong & Kolter, 2018), post-processing methods that provide stochastic guarantees at inference time (Lecuyer et al., 2018; Cohen et al., 2019), and verification routines that attempt to answer Equation 1 (or a close analog of it) exactly (Katz et al., 2017, 2019; Jordan et al., 2019; Tjeng & Tedrake, 2017; Fischetti & Jo, 2018; Cheng et al., 2017) or via conservative over-approximation (Weng et al., 2018; Singh et al., 2019b; Balunovic et al., 2019; Ehlers, 2017; Wang et al., 2018; Dutta et al., 2018). While approximate verification methods have shown promise in scaling to larger networks, they may introduce an additional penalty to robust accuracy by flagging non-adversarial points, thus limiting their application in practice. Exact methods impose no such penalty, but as they rely on expensive constraint-solving techniques, they often do not scale to even moderately-sized networks.

In this paper, we present a verification technique for Equation 1 that relies on neither expensive constraint solving nor conservative over-approximation. We build on the observation that in high-dimensional Euclidean space, the angle between two random vectors of equal norm is, with high probability, close to 90° (Blum et al., 2020, Chapter 2). In other words, random vectors in high dimension are likely to be almost orthogonal, so the distance from a point to a hyperplane segment can often be approximated well by the distance from the point to its projection onto the corresponding hyperplane.
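As a quick numerical illustration of this concentration phenomenon (a standalone NumPy sketch, not part of our implementation), sampling pairs of random Gaussian vectors shows their angle clustering tightly around 90° as the dimension grows:

import numpy as np

rng = np.random.default_rng(0)
for d in (2, 100, 10_000):
    u = rng.standard_normal((1000, d))
    v = rng.standard_normal((1000, d))
    cos = np.sum(u * v, axis=1) / (np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))
    angle = np.degrees(np.arccos(cos))
    # in high dimension, the angles concentrate sharply around 90 degrees
    print(f"d={d:6d}: mean={angle.mean():6.2f} deg, std={angle.std():5.2f} deg")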

Our algorithm (Section 2) leverages this insight to exhaustively search the model's decision boundaries around a point using only geometric projections. The performance benefits of this approach are substantial, especially in the case of ℓ2 robustness, since Euclidean projections can be efficiently computed using the dot product and accelerated on GPU hardware. Our algorithm is embarrassingly parallel, and straightforward to implement with facilities for batching that are available in many popular libraries (Section 3). Additionally, we show how the algorithm can be easily modified to find certified lower bounds for ϵ, rather than verifying a given fixed value (Section 2.3).

Because our algorithm relies exclusively on local projections, it may encounter scenarios in which there is evidence to suggest non-robust behavior, but the network's exact boundaries cannot be conclusively determined without accounting for global constraints (Section 2, Figure 1). In such cases, the algorithm must either return Unknown, or fall back on constraint solving. However, we prove that if the algorithm terminates with a Robust decision, then the model satisfies Equation 1, and likewise that if it returns Not Robust, then an adversarial example exists (Section 2.2). Note that unlike prior work on approximate verification, our approach can often separate Not Robust cases from Unknown ones, providing a concrete adversarial example in the former case. In this sense, the algorithm can be characterized as sound but incomplete.

Finally, our empirical evaluation (Section 4) demonstrates that for several models trained on MNIST and Fashion-MNIST, the algorithm returns Unknown only a small fraction of the time on randomly-chosen test instances. Moreover, we show that our implementation outperforms existing exact techniques (Jordan et al., 2019; Katz et al., 2019) by up to two orders of magnitude (Section 4, Tables 1 & 2), while rarely being inconclusive on instances for which other techniques do not time out. Additionally, we show that while our implementation is not as fast as previous work on efficient lower-bound computation for large models (Weng et al., 2018), our certified lower bounds are consistently tighter, and in some cases minimal (Section 4, Table 3).

2 Algorithm

In this section we give a high-level overview of our proposed algorithm (Section 2.1), and argue for its correctness (Section 2.2). We also propose a variant (Section 2.3) to compute certified lower bounds on the minimal adversarial distortion. We note that both the algorithm and the correctness argument are general and apply to arbitrary norms, and so for the remainder of this section we use the un-subscripted notation ‖·‖ to refer to a general norm. However, in our implementation (Section 3), we focus on local robustness for the ℓ2 norm. We do this for the following two reasons: (1) the algorithm relies on projections that can be computed efficiently in Euclidean spaces, and (2) there are few existing tools that can efficiently guarantee local robustness with respect to ℓ2, as they are usually limited to processing linear computations (see Section 5).

2.1 The Algorithm

The algorithm we propose (Algorithm 1) takes a model, F, an input, x, and a bound, ϵ, and either proves that Equation 1 is satisfied (i.e., ϵ-local robustness), finds an adversarial input at distance less than ϵ from x, or returns Unknown. Our algorithm assumes F(x) = argmax_j f_j(x), where f is the function computed by a neural network composed of linear transformations with ReLU activations.

The algorithm relies on the analysis of all possible activation regions around x. An activation region is a maximal set of inputs such that all elements have the same activation pattern.

Formally, let z_u(x) denote the pre-activation value of neuron u in network f when evaluating x. We say that neuron u is activated if z_u(x) > 0. An activation pattern, A, is a Boolean function over neurons that characterizes whether each neuron is activated. Then the activation region, R(A), associated with pattern A is the set of inputs that realize A:

R(A) = { x | ∀u . A(u) ⟺ z_u(x) > 0 }     (2)

Because we assume that f is a piecewise-linear composition of linear transformations and ReLU activations, we can associate the activation status of any neuron with a closed half-space of the input space (Jordan et al., 2019). The activation constraint, C(u, A), for neuron u and pattern A is the linear inequality w·x + b ≥ 0 (with the inequality reversed when A(u) is false), where w and b satisfy Equation 3.

∀x ∈ R(A) . z_u(x) = w·x + b     (3)

The intersection of these constraints yields the activation region R(A), and the facets of R(A) correspond to the non-redundant constraints. The convexity of activation regions follows from this observation, as does the fact that the decision boundaries are also linear constraints, of the form f_i(x) = f_j(x) for classes i and j. More details on computing these constraints are given in Section 3.1.

  Input: model F, data x, bound ϵ
  Initialize Q = ∅, visited = ∅
  A_0 = activation_pattern(x)
  C = [get_decision_boundaries(A_0),
       get_activation_constrs(A_0)]
  C' = filter(C, λc . d(x, c) ≤ ϵ)
  enqueue(Q, C')
  visited += {A_0}
  while Q ≠ ∅ do
      c = dequeue(Q)
      if is_decision_boundary(c) then
          x̂ = projection(x, c)
          if F(x̂) ≠ F(x) then
              return (Not Robust, x̂)
          else
              return Unknown
      else
          A' = pattern_associated_with(c)
          if A' ∈ visited then
              continue
          else
              C = [get_decision_boundaries(A'),
                   get_activation_constrs(A')]
              C' = filter(C, λc . d(x, c) ≤ ϵ)
              enqueue(Q, C')
              visited += {A'}
  return Robust
Algorithm 1 Checking Local Robustness

Algorithm 1 performs a search of all activation regions that might be at distance less than ϵ from x. Starting from the region associated with the activation pattern of the input, A_0, we collect all decision boundaries and activation constraints, and enqueue those at distance less than ϵ from x. Here we define the distance, d(x, C), from x to a constraint, C : w·x + b ≥ 0, as the distance from x to the hyperplane w·x + b = 0 (for the ℓ2 norm, this is |w·x + b| / ‖w‖). We then analyze each constraint in the queue until the queue is empty.

If an analyzed constraint, C, is a decision boundary, we take the projection, x̂, of x onto this constraint. Intuitively, the projection, x̂, is an input satisfying C that is at minimal distance from x, i.e., x̂ minimizes ‖x′ − x‖ over all x′ satisfying C. If x̂ does not have the same class as x according to F (or lies directly on the decision boundary between classes), then x̂ is an adversarial example, and we return Not Robust. If x̂ has the same class as x, it means that the projection to the decision boundary is outside of the activation region we are currently analyzing; however, this is not sufficient to conclude that no point on the decision boundary is inside both the current activation region and the ϵ-ball (see Figure 1). Therefore, we return Unknown.

Figure 1: Three example cases showing a boundary constraint (red), an activation constraint (gray), and the corresponding activation region, R(A) (shaded area). The ϵ-ball surrounding x is denoted with the dotted circle, and x̂ is the projection of x onto the boundary constraint. In the case on the left, local robustness is not satisfied in R(A), and F(x̂) ≠ F(x), so we can return Not Robust. In the center case, local robustness is not satisfied in R(A), but F(x̂) = F(x), so we cannot distinguish this case from the case on the right, in which local robustness is satisfied in R(A), using only projections. Thus, in both the center and right cases we must return Unknown.

On the other hand, if an analyzed constraint, C(u, A), is an activation constraint, it means that the neighboring activation region with neuron u flipped may intersect with the ϵ-ball centered at x. This means that, if we are currently searching in the activation region, R(A), given by activation pattern A, then we must search the region, R(A'), given by activation pattern A', where A'(u') = A(u') except when u' = u; in other words, we search the region corresponding to the activation pattern that would be obtained by flipping the neuron corresponding to the constraint we are analyzing. As previously, we search R(A') by gathering all activation constraints and decision boundaries corresponding to A', enqueuing the ones at distance less than ϵ from x.

Exhausting the queue means that we did not find any adversarial examples in any activation region that might intersect with the ϵ-ball centered at x. We therefore conclude that F is ϵ-locally robust at x in this case.

Note that since there are finitely many activation regions and we only explore each region at most once, this algorithm terminates.
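To make the control flow concrete, below is a minimal, self-contained Python (NumPy) sketch of Algorithm 1 for a small fully-connected ReLU network. All names and data layouts here are our own illustrative choices rather than the authors' released implementation; the constraint and distance computations that it inlines are the subject of Sections 3.1 and 3.2.

import numpy as np
from collections import deque

def forward(params, x):
    """Logits of a dense ReLU network; params is a list of (W, b) per layer."""
    h = x
    for i, (W, b) in enumerate(params):
        z = W @ h + b
        h = np.maximum(z, 0.0) if i < len(params) - 1 else z
    return h

def activation_pattern(params, x):
    """Hashable 0/1 activation pattern of every hidden neuron at x."""
    pat, h = [], x
    for W, b in params[:-1]:
        z = W @ h + b
        pat.append(tuple((z > 0).astype(int)))
        h = np.maximum(z, 0.0)
    return tuple(pat)

def region_affine(params, pattern):
    """Affine form (M, c) of the logits on R(pattern), plus the region's
    activation constraints as (w, b, layer, neuron), each meaning w.x + b >= 0."""
    d = params[0][0].shape[1]
    M, c, constrs = np.eye(d), np.zeros(d), []
    for i, (W, b) in enumerate(params):
        M, c = W @ M, W @ c + b                       # pre-activations are M x + c
        if i < len(params) - 1:
            m = np.array(pattern[i], dtype=float)
            s = np.where(m > 0, 1.0, -1.0)            # orient by activation status
            for u in range(len(m)):
                constrs.append((s[u] * M[u], s[u] * c[u], i, u))
            M, c = m[:, None] * M, m * c              # apply the fixed ReLU mask
    return M, c, constrs

def check_local_robustness(params, x, eps):
    cls = int(np.argmax(forward(params, x)))
    queue, visited = deque(), set()

    def near(w, b):                                   # d(x, C) <= eps (Section 3.2)
        nw = np.linalg.norm(w)
        return nw > 0 and abs(w @ x + b) <= eps * nw

    def expand(pattern):                              # enqueue a region's constraints
        visited.add(pattern)
        M, c, constrs = region_affine(params, pattern)
        for j in range(M.shape[0]):                   # decision boundaries f_cls = f_j
            if j != cls and near(M[cls] - M[j], c[cls] - c[j]):
                queue.append(("boundary", M[cls] - M[j], c[cls] - c[j], None))
        for w, b, i, u in constrs:                    # activation constraints
            if near(w, b):
                queue.append(("act", w, b, (pattern, i, u)))

    expand(activation_pattern(params, x))
    while queue:
        kind, w, b, info = queue.popleft()
        if kind == "boundary":
            x_hat = x - (w @ x + b) / (w @ w) * w     # l2 projection onto w.x + b = 0
            if int(np.argmax(forward(params, x_hat))) != cls:
                return ("Not Robust", x_hat)
            return ("Unknown",)                       # projection fell outside region
        pattern, i, u = info                          # flip neuron u of hidden layer i
        flipped = [list(layer) for layer in pattern]
        flipped[i][u] = 1 - flipped[i][u]
        nxt = tuple(tuple(layer) for layer in flipped)
        if nxt not in visited:
            expand(nxt)
    return ("Robust",)

# Toy usage: a random 3-input network with one hidden layer of 4 neurons.
rng = np.random.default_rng(1)
params = [(rng.standard_normal((4, 3)), rng.standard_normal(4)),
          (rng.standard_normal((3, 4)), rng.standard_normal(3))]
print(check_local_robustness(params, rng.standard_normal(3), 0.1))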

2.2 Correctness

In this section, we argue for the correctness of the proposed algorithm. We show that when Algorithm 1 returns Not Robust, there exists an adversarial example, and when it returns Robust, the model is locally robust at x with radius ϵ. However, the algorithm may also return Unknown, in which case we do not claim anything about the robustness of the model.

Theorem 1.

(1) When Algorithm 1 returns Not Robust, there exists an adversarial example, x̂, such that ‖x̂ − x‖ ≤ ϵ and F(x̂) ≠ F(x). (2) When Algorithm 1 returns Robust, for all x′ such that ‖x′ − x‖ ≤ ϵ, F(x′) = F(x).

Proof.

In the first case, where Algorithm 1 returns Not Robust, the proof of Theorem 1 is trivial: we exhibit a point, x̂, such that ‖x̂ − x‖ ≤ ϵ, and for which F(x̂) ≠ F(x).

The interesting case is when Algorithm 1 returns Robust. We prove by contradiction that in this case, F is in fact locally robust at x with radius ϵ.

Let us assume for the sake of contradiction that Algorithm 1 returns Robust, but there exists a point, x̂, such that ‖x̂ − x‖ ≤ ϵ and F(x̂) ≠ F(x). Let R(A_0) and R(Â) be the activation regions associated with x and x̂, respectively.

We define a path of activation regions as a sequence, R(A_1), …, R(A_k), of activation regions such that the underlying activation patterns A_i and A_{i+1} differ in exactly one neuron for all i < k, and there exists at least one input, x_i, that has activation pattern A_i for each i ≤ k. For instance, in a network with three neurons, if A_1 = (1, 0, 1), A_2 = (1, 0, 0), and A_3 = (0, 0, 0), and there exist inputs, x_1, x_2, and x_3, with activation patterns A_1, A_2, and A_3, then R(A_1), R(A_2), R(A_3) is a path.

Our proof relies on three facts, which we prove below:

  1. There exists a path, P, from R(A_0) to R(Â) in which each region contains at least one input at distance less than ϵ from x, and all regions in the path are distinct.

  2. Our algorithm visits all regions in the path, P.

  3. If a visited activation region contains an adversarial input, our algorithm either detects it, returning Not Robust, or returns Unknown.

Together, (1), (2), and (3) imply that if an adversarial point, x̂, exists, it resides in an activation region that would have been checked by Algorithm 1, which would have resulted in the algorithm returning Not Robust or Unknown, contradicting the assumption that it returned Robust.

(1) Existence of a Path

Consider the segment going from x to x̂ in a straight line. As ‖x̂ − x‖ ≤ ϵ, all points on this segment are also at distance at most ϵ from x. As f is a neural network with ReLU activations, f is a continuous function, as are the pre-activation functions, z_u, for each of its internal neurons, u. Furthermore, input points at the boundary between two activation regions (i.e., points x′ such that z_u(x′) = 0 for some neuron u) belong to both activation regions. Therefore, listing all activation regions encountered on the segment between x and x̂ yields a path, P, from R(A_0) to R(Â), with each region on the path containing an input point, x_i, on the segment, such that ‖x_i − x‖ ≤ ϵ.

That each R(A_i) in P is distinct follows from the convexity of activation regions: if there existed R(A_i) = R(A_j) in P with a different region R(A_k) in between (i < k < j), then R(A_i) could not be convex, as there would exist a line segment with its end points in R(A_i) that is not entirely contained within the region. This ensures that paths are of finite length.

(2) Exploration by the Algorithm

Given the existence of the path, P, from R(A_0) to R(Â), we now prove that Algorithm 1 would visit all activation regions in P if it returns Robust. We proceed by induction on the length of paths induced (similarly to the above) by a line segment included in the ϵ-ball centered on x.

In the base case, if the path is of length one, then it contains only R(A_0), and the claim holds trivially since Algorithm 1 starts by visiting R(A_0).

In the inductive case, let us assume that for any path of length at most k induced by a segment, S, beginning at x and contained in the ϵ-ball, Algorithm 1 visits all regions in the path. Now consider a path, P = R(A_0), …, R(A_k), of length k + 1, induced by such a segment, S. Since R(A_{k−1}) is on the path, there exists a point, x′, on S such that the sub-segment from x to x′ induces a path of length k; thus we can apply our induction hypothesis to conclude that Algorithm 1 visits R(A_{k−1}). Now, since R(A_{k−1}) and R(A_k) are neighbors in P, they must share some boundary, C, that is intersected by S. Thus, since S lies within the ϵ-ball, d(x, C) ≤ ϵ; hence when Algorithm 1 visits R(A_{k−1}), it will add R(A_k) to the queue via C. Therefore, since Algorithm 1 returns Robust only when all regions in the queue have been visited, Algorithm 1 will visit R(A_k), concluding the proof of (2).

(3) Detection of Adversarial Examples

We conclude by proving that if there exists an adversarial example in a region visited by Algorithm 1, then we return either Not Robust or Unknown.

If x̂ in R(Â) is an adversarial example, then F(x̂) ≠ F(x). By continuity of f, this means that there exists an input, x_b, on the segment from x to x̂ that lies exactly on a decision boundary, C. As d(x, C) ≤ ‖x_b − x‖ ≤ ϵ, C would have been enqueued when exploring R(Â), and hence analyzed before the algorithm could return Robust.

However, when analyzing a decision boundary, Algorithm 1 always returns either Not Robust or Unknown.

Thus, the decision boundary in the activation region containing x̂ would have been analyzed by Algorithm 1; this yields a contradiction, as the algorithm must have returned Not Robust or Unknown rather than Robust. ∎

2.3 Certified Lower Bounds

We now consider the related problem of finding a lower bound on the local robustness of the network. A variant of Algorithm 1, given by Algorithm 2, can provide certified lower bounds by using a priority queue.

  Input: model F, data x, bound ϵ
  Initialize priority queue Q = ∅, visited = ∅, ϵ_lb = 0
  … (as in Algorithm 1, enqueuing each constraint c at priority d(x, c))
  while Q ≠ ∅ do
      c = dequeue(Q)
      ϵ_lb = max(ϵ_lb, d(x, c))
      if is_decision_boundary(c) then
          return ϵ_lb
      else
          … (as in Algorithm 1)
  return ϵ
Algorithm 2 Certified Lower Bound on Local Robustness

In this variant, constraints are enqueued with a priority corresponding to their distance from x, such that the closest constraint to x is always at the front of the queue. We keep track of the current certified lower bound in a variable, ϵ_lb. At each iteration, we set ϵ_lb to the maximum of its old value and the distance from x to the constraint at the front of the queue.

The algorithm terminates either when all constraints at distance less than the initially specified ϵ have been handled, or when a decision boundary (for which the original algorithm would return Not Robust or Unknown) is found. In the first case, we return ϵ itself; in the second, we return the value stored in ϵ_lb at that iteration.
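The change relative to Algorithm 1 is small. The sketch below, in the same style as the Algorithm 1 sketch of Section 2.1 (reusing forward, activation_pattern, and region_affine from it; all names remain our own), replaces the FIFO queue with a heap keyed on constraint distance:

import heapq
import itertools
import numpy as np

def certified_lower_bound(params, x, eps):
    cls = int(np.argmax(forward(params, x)))
    heap, visited, eps_lb = [], set(), 0.0
    tie = itertools.count()                           # tie-breaker for equal priorities

    def dist(w, b):                                   # projection distance d(x, C)
        nw = np.linalg.norm(w)
        return abs(w @ x + b) / nw if nw > 0 else np.inf

    def expand(pattern):
        visited.add(pattern)
        M, c, constrs = region_affine(params, pattern)
        for j in range(M.shape[0]):
            if j != cls:
                d = dist(M[cls] - M[j], c[cls] - c[j])
                if d <= eps:
                    heapq.heappush(heap, (d, next(tie), "boundary", None))
        for w, b, i, u in constrs:
            d = dist(w, b)
            if d <= eps:
                heapq.heappush(heap, (d, next(tie), "act", (pattern, i, u)))

    expand(activation_pattern(params, x))
    while heap:
        d, _, kind, info = heapq.heappop(heap)        # nearest constraint first
        eps_lb = max(eps_lb, d)
        if kind == "boundary":
            return eps_lb                             # certified: robust up to eps_lb
        pattern, i, u = info
        flipped = [list(layer) for layer in pattern]
        flipped[i][u] = 1 - flipped[i][u]
        nxt = tuple(tuple(layer) for layer in flipped)
        if nxt not in visited:
            expand(nxt)
    return eps                                        # no boundary within eps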

The proof of this variant is similar to the proof of Algorithm 1. It relies on the following loop invariant.

Loop Invariant 1.

(1) At each iteration, all activation regions at distance less than ϵ_lb from x have been visited, (2) ϵ_lb never exceeds ϵ, and (3) there is no adversarial point at distance less than ϵ_lb from x.

Proof.

Loop Invariant 1 trivially holds on entry to the loop, since ϵ_lb is initialized to 0. Proving that the invariant is maintained is more interesting. Suppose that ϵ_lb increases to some value d*. It must then be shown that there is no unvisited activation region at distance less than d* from x. We proceed again by contradiction: assume there were such a region, R(Â), containing a point, x̂, such that ‖x̂ − x‖ < d*. Again, let us consider the segment from x to x̂, and the path, P, it induces; every region on P contains a point at distance less than d* from x. Let us consider R(A_i), the first region of P that was not previously visited. If no such region exists, then R(Â) itself was already visited, contradicting our assumption. Otherwise, R(A_{i−1}) was visited, and the activation constraint, C, between R(A_{i−1}) and R(A_i) satisfies d(x, C) < d*. Therefore, C (which leads to R(A_i)) was already added to the queue with priority less than d*, and by virtue of the priority queue, it would have been dequeued, and R(A_i) visited, before the current iteration, which yields a contradiction.

If ϵ_lb does not increase, the invariant still trivially holds. The latter case can happen because our computation of the distance to a constraint is an underapproximation of the true distance to the feasible portion of the constraint, as illustrated in Figure 2. ∎

Figure 2: Illustration of a constraint, C, whose projection-based distance, d(x, C), is smaller than the true distance from x to the feasible portion of C in the shaded activation region.

3 Efficient Implementation

In this section, we describe the key insights that allow us to implement Algorithm 1 efficiently for the ℓ2 norm, and describe various optimizations we employ to improve its practicality. In Sections 3.1 and 3.2 we elaborate on the insights and implementation details relevant to computing constraints and approximating their distance from x. Section 3.3 describes heuristics for reducing the number of Unknown results Algorithm 1 returns, and Section 3.4 presents optimizations that speed up our analysis.

3.1 Computing Activation Region Boundaries

Recall our setting in which we assume our model, F, is given by F(x) = argmax_j f_j(x), where f is the function computed by a neural network consisting of layers of linear transformations with ReLU activations between layers. For a particular activation pattern, A, we observe that we can replace each ReLU in layer i with an element-wise multiplication by a vector, a^i, where the position in a^i corresponding to neuron u takes value 1 if A(u) is true, and 0 otherwise. Thus, when x ∈ R(A),

f(x) = W^L (a^{L−1} ⊙ (W^{L−1} (⋯ a^1 ⊙ (W^1 x + b^1) ⋯) + b^{L−1})) + b^L

(where ⊙ denotes element-wise multiplication), which we observe is a linear function with the same slope for all x in R(A). By the same argument, each pre-activation, z_u(x), is a linear function, w·x + b, for all x in R(A).

Now consider the activation constraint, C(u, A), which is satisfied when the sign of z_u(x) = w·x + b agrees with A(u). Given our observation above, we see that this is a linear constraint with coefficients, w, and intercept, b. Note that the computation of these weights and intercepts does not depend on a particular point in R(A) — only on the activation pattern, A. Thus, we can compute the boundaries of an activation region, R(A), knowing only the corresponding activation pattern, A.

In practice, the coefficients, w, correspond to the gradient of z_u with respect to the network's inputs, evaluated at any point in R(A). However, frameworks that perform automatic differentiation typically require a concrete point at which to evaluate the gradient. Thus, we compute the gradient via backpropagation with the activation vectors, a^i, substituted for the ReLU activations. The intercepts, b, are computed via a forward computation using the activation vectors, with x set to 0. These operations can easily be implemented to run efficiently on a GPU.
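The following TensorFlow 2 sketch illustrates this trick; it is our reconstruction, not the authors' code, and it assumes float32 (W, b) pairs stored with shape (inputs × outputs). Replacing each ReLU with multiplication by a fixed 0/1 mask makes every pre-activation affine in x, so a Jacobian taken at x = 0 yields the coefficients w, while the same forward pass yields the intercepts b:

import tensorflow as tf

def output_affine(layers, masks, x_dim):
    """layers: list of float32 (W, b) pairs, W of shape (inputs, outputs);
    masks: one fixed 0/1 vector per hidden layer, encoding the pattern A."""
    x = tf.zeros((1, x_dim))                  # evaluate at 0 so intercepts = z(0)
    with tf.GradientTape() as tape:
        tape.watch(x)
        h = x
        for (W, b), m in zip(layers[:-1], masks):
            h = (tf.matmul(h, W) + b) * m     # masked "ReLU": multiply, don't clip
        W_out, b_out = layers[-1]
        z = tf.matmul(h, W_out) + b_out       # output pre-activations, affine in x
    w = tape.jacobian(z, x)                   # coefficients; shape (1, m, 1, x_dim)
    return tf.squeeze(w, axis=[0, 2]), tf.squeeze(z, axis=0)

The constraints for hidden neurons are obtained the same way, taking z to be the pre-activations of the layer of interest.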

Decision Boundaries

For a sigmoid network, there is at most a single decision boundary within any given activation region, which can be computed via gradient backpropagation as discussed above. In the case of a softmax model with multiple classes, we have an output, f_j, for each class, j. A decision boundary for a point, x, and class, j, states that f_{F(x)}(x′) = f_j(x′) for j ≠ F(x). Thus, for models with m classes, the decision boundary consists of m − 1 linear constraints, derived from the difference of the model's output at the selected class and the model's output at each of the other classes.
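Concretely, if (W_out, b_out) is the affine form of the logits on the region (as returned by the sketches above) and c = F(x) is the selected class, the m − 1 boundary constraints are just row differences. A short NumPy sketch with hypothetical names:

import numpy as np

def boundary_constraints(W_out, b_out, c):
    """One hyperplane f_c(x') = f_j(x') for each competing class j != c."""
    W_bnd = W_out[c] - np.delete(W_out, c, axis=0)    # shape (m - 1, x_dim)
    b_bnd = b_out[c] - np.delete(b_out, c)            # shape (m - 1,)
    return W_bnd, b_bnd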

3.2 Distances to Constraints

We observe that when working with the ℓ2 norm, we can efficiently compute projections using the dot product. Note that this computation can be trivially performed in parallel on many constraints at once by a GPU.

The distance from x to its projection onto a hyperplane constraint, C, gives a lower bound on the distance from x to the part of C that is feasible when the other constraints are taken into consideration. This allows us to soundly determine whether a boundary constraint is further than ϵ from x, implying local robustness, and is substantially faster than using constraint solving to discover the true distance. In our evaluation, we find that in high dimensions, the projected distance is an accurate enough lower bound to make Algorithm 1 practical.
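For the ℓ2 norm, the projection distances to a whole batch of constraints reduce to a single matrix-vector product. A NumPy sketch (our own helper, not the authors' code); the identical expression runs unchanged on a GPU with a tensor library:

import numpy as np

def l2_distances(W, b, x):
    """Distance from x to each hyperplane w_i . x + b_i = 0, for all rows at
    once: d_i = |w_i . x + b_i| / ||w_i||_2. This is a lower bound on the
    distance to the feasible part of each constraint."""
    return np.abs(W @ x + b) / np.linalg.norm(W, axis=1)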

3.3 Improving the Completeness of the Algorithm

The vanilla version of our algorithm presented in Section 2 stops, possibly returning Unknown, at the first decision boundary encountered. We now present two heuristics that reduce the number of Unknown results returned, at the price of increased complexity. These heuristics are disabled by default in our implementation; we leave it to the user to enable them when deemed necessary.

Exploring the Full Queue

If the analysis of a decision boundary is inconclusive, Algorithm 1, as presented so far, stops and returns Unknown. We implement a variant of Algorithm 1 in which we instead record the presence of an ambiguous decision boundary but continue exploring the other regions in the queue. If we entirely empty the queue after unsuccessfully analyzing a decision boundary, we nevertheless return Unknown. However, oftentimes a true adversarial example is found during the continued search, allowing us to return Not Robust conclusively.

The correctness of our algorithm still holds in this variant: an inconclusive analysis of a decision boundary leads either to returning Unknown, or to exhibiting a true adversarial example found at a later stage.

Falling Back on LP Solving

Existing complete tools (Katz et al., 2017, 2019; Jordan et al., 2019; Tjeng & Tedrake, 2017; Fischetti & Jo, 2018; Cheng et al., 2017) often rely heavily on solvers to analyze models. When we cannot rule out adversarial examples in a specific activation region, we can query an LP solver, since both the activation constraints delimiting a region and the decision boundaries inside this region are linear constraints. This allows us to efficiently discharge many verification conditions through fast geometric projections while providing a complete, but slower, variant of our algorithm.
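As an illustration of such a fallback, here is a SciPy sketch under our own simplifications (not necessarily the authors' exact query). The ℓ2 ball cannot be expressed in an LP, so the sketch intersects the activation region and the boundary with the enclosing ℓ∞ box of radius ϵ: if even this relaxation is infeasible, the boundary is soundly ruled out; computing the exact ℓ2 distance would instead require a QP solver.

import numpy as np
from scipy.optimize import linprog

def boundary_feasible(W_act, b_act, w_bnd, b_bnd, x0, eps):
    """Check feasibility of { W_act x + b_act >= 0 } (the region), of
    w_bnd . x + b_bnd = 0 (the boundary), and of |x - x0|_inf <= eps."""
    res = linprog(
        c=np.zeros(x0.size),                  # pure feasibility problem
        A_ub=-W_act, b_ub=b_act,              # W_act x + b_act >= 0
        A_eq=w_bnd[None, :], b_eq=[-b_bnd],
        bounds=list(zip(x0 - eps, x0 + eps)),
        method="highs",
    )
    return res.status == 0                    # 0: feasible, 2: infeasible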

3.4 Speeding up the Analysis

In Section 2, Algorithm 1 is presented at a high level without regard to performance. In this section, we give an overview of the key optimizations we include in our implementation of Algorithm 1 that are independent of the high-level behavior of the algorithm.

Analyzing Decision Boundaries First

The correctness of Algorithm 1 is independent of the order in which we iterate over the queue of constraints. Since our algorithm terminates as soon as it analyzes a decision boundary, one useful optimization is to ensure that such constraints are always handled first. In our implementation of Algorithm 1, we use a queue data structure consisting of two standard FIFO queues: one for decision boundaries and one for activation constraints. When dequeuing an element, we always try to dequeue a decision boundary first. This ensures that the most relevant constraints are handled first without harming the asymptotic complexity of our algorithm.
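A minimal Python sketch of this two-queue structure (our own naming):

from collections import deque

class TwoTierQueue:
    """FIFO queue that always serves decision boundaries before
    activation constraints, as described above."""
    def __init__(self):
        self.boundaries, self.activations = deque(), deque()
    def push(self, item, is_boundary):
        (self.boundaries if is_boundary else self.activations).append(item)
    def pop(self):
        q = self.boundaries or self.activations   # boundaries first, if any
        return q.popleft()
    def __len__(self):
        return len(self.boundaries) + len(self.activations)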

This optimization cannot be used when searching for a certified lower bound without breaking the correctness of our algorithm. It is thus only allowed when checking local robustness.

Batching

Most deep learning software is designed to easily enable efficient batch computations on a GPU. In particular, the gradient backpropagations for computing constraints (Section 3.1) and the dot-product projections for calculating distances (Section 3.2) lend themselves well to batching on the GPU. We leverage this in our implementation of Algorithm 1 by dequeuing multiple elements from the queue at once and calculating the constraints and constraint distances for the corresponding regions in parallel. We find that this optimization enables a speedup of up to 2x on large examples for a well-chosen batch size (10-100).
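For instance, the affine logit maps of many regions can be computed in one shot by stacking their activation masks. The following NumPy sketch (with our own hypothetical layout, where masks[i] holds the batch of 0/1 vectors for hidden layer i) batches the composition from Section 3.1 with einsum:

import numpy as np

def batched_region_affine(params, masks):
    """Affine maps (M, c) of the output pre-activations for a batch of
    B activation patterns at once; masks[i] has shape (B, n_i)."""
    B, d = masks[0].shape[0], params[0][0].shape[1]
    M = np.broadcast_to(np.eye(d), (B, d, d)).copy()   # (B, d_in, d_in)
    c = np.zeros((B, d))
    for i, (W, b) in enumerate(params):
        M = np.einsum("oj,bji->boi", W, M)             # batched W @ M
        c = c @ W.T + b
        if i < len(params) - 1:
            M *= masks[i][:, :, None]                  # apply each pattern's mask
            c *= masks[i]
    return M, c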

4 Evaluation

In this section, we evaluate the performance of our implementation of Algorithms 1 and 2 on several models adversarially trained (Madry et al., 2018) on MNIST and Fashion-MNIST. We also include a subset of MNIST (binary MNIST) in which the classification task is to distinguish between the digits “1” and “7,” as models trained on this dataset were used in the evaluation of GeoCert (Jordan et al., 2019). We selected 100 arbitrary instances from the test set of each dataset to perform our evaluation on.

Each model in our evaluation is named with the following convention: the name begins with an identifier for the dataset it was trained on, either “mnist_bin” (binary MNIST), “mnist”, or “fmnist” (Fashion-MNIST), followed by the sizes of the hidden layers. For example, the model, “fmnist40.40.40” was trained on Fashion-MNIST and has three hidden layers with 40 neurons each.

All experiments were run on a 4.2GHz Intel Core i7-7700K with 32 GB of RAM, and a Titan X GPU with 12 GB of RAM. Our implementation is written in Python using the Keras framework (Chollet et al., 2015) with the TensorFlow backend (Abadi et al., 2015).

4.1 Local Robustness Certification

                   Algorithm 1 (ℓ2)              GeoCert (ℓ2)           Marabou (ℓ∞)
Model              Time (s)    R   NR    U   TO  Time (s)    R   NR   TO  Time (s)    R   NR   TO
mnist_bin10.50.10     0.020  100    0    0    0     0.675   99    0    1     0.136  100    0    0
mnist_bin20.20        0.015   99    0    1    0     0.463   99    0    1     0.162   99    1    0
mnist20.20.20         0.029   87    6    6    1     4.839   72    2   26     1.875   95    5    0
mnist40.40.40         3.111   50    7   16   27     >60.0   13   12   75     4.344   96    4    0
fmnist20.20.20        0.035   89    4    7    0     5.916   84    8    8     3.669   93    7    0
fmnist40.40.40        1.528   65    3    9   23     >60.0    0    0  100     5.147   97    3    0
Table 1: Comparison of local robustness certification on robustly trained binary MNIST, MNIST, and Fashion-MNIST models. Execution times are the median over runs on 100 random test instances. For each of the 100 runs, results are either robust (R), not robust (NR), unknown (U), or a timeout (TO).
                 Vanilla                       Full queue
Model            Time (s)   R   NR    U   TO  Time (s)   R   NR    U   TO
mnist20.20.20      0.029   87    6    6    1    0.029   87   10    2    1
mnist40.40.40      3.111   50    7   16   27    6.166   50   14    6   30
fmnist20.20.20     0.035   89    4    7    0    0.036   89    9    2    0
fmnist40.40.40     1.528   65    3    9   23    2.910   65    7    1   27
Table 2: Comparison of Algorithm 1 and its variant with full exploration of the constraints queue (Section 3.3). The times reported are the median over runs on 100 random test instances. For each of the 100 runs, results are either robust (R), not robust (NR), unknown (U), or a timeout (TO).
Model     Algorithm 2, Mean Bound    FastLin, Mean Bound    Median Ratio
fmnist              0.124                   0.078               0.329
fmnist              0.134                   0.092               0.693
fmnist              0.083                   0.021               0.035
Table 3: Comparison of Algorithm 2 with FastLin. Results reported are over runs on 100 random test instances. The mean bound reported for Algorithm 2 is the lower bound returned after 60 seconds of computation. The ratio for a given instance is the bound obtained by FastLin divided by the bound obtained by Algorithm 2.

We first compare the efficiency of our implementation to that of other tools certifying local robustness. GeoCert (Jordan et al., 2019) is the tool most comparable to ours, as it is able to exactly check for robustness with the ℓ2 norm. We also compare our implementation to the Marabou verification tool (Katz et al., 2019), an improvement over Reluplex (Katz et al., 2017), which does not handle the ℓ2 norm, but instead supports the ℓ∞ norm. While these norms are not entirely commensurable, we perform the comparison by checking the largest ℓ∞ ball contained in the ℓ2 ball of size ϵ. For MNIST models, this ℓ∞ ball has radius ϵ/√784, as the input space is of dimension 784. Note that as a consequence, we expect Marabou to label a higher fraction of the points as robust, as the search space we give it is significantly smaller.

We evaluate each of these tools with the same bound, ϵ. Specifically, we compare the number of examples on which each tool terminates, the result of the analyses that terminate, and the median run time over the analyses of each of the selected 100 instances. We set a timeout of 20 seconds for the (smaller) binary MNIST models and of 60 seconds for the (larger) MNIST and Fashion-MNIST models. The results for these experiments are presented in Table 1.

We observe that our implementation always outperforms GeoCert by one to two orders of magnitude, without sacrificing precision — we rarely return Unknown when GeoCert terminates. While the comparison with Marabou's ℓ∞ verification is not apples-to-apples, it is worth noting that we are 1.4x-100x faster than Marabou despite the specified ℓ∞ ball requiring analysis of a much smaller input space.

Table 2 displays the results of experiments evaluating the impact of exploring the whole queue of constraints (Section 3.3) instead of stopping at the first decision boundary, as in the vanilla version of Algorithm 1. We compare the performance of both variants on the larger networks, for which the analysis of the vanilla implementation was occasionally inconclusive. Experimental results show that this heuristic decreases the number of Unknown results while having only a minor impact on execution speed. Moreover, when enabling this heuristic, we reduce the number of Unknown results to the point that we are able to recover the results obtained by GeoCert in almost every instance for which GeoCert terminates, while nevertheless performing our analysis one to two orders of magnitude faster.

4.2 Certified Lower Bounds

We now evaluate the variant of our algorithm computing certified lower bounds on the minimum adversarial distortion (Algorithm 2 in Section 2.3). To this end, we compare the performance of our approach to the FastLin (Weng et al., 2018) tool, which is designed to provide quick lower bounds on the local robustness of ReLU networks.

FastLin is intended to scale to very large networks for which exact checks like GeoCert and Marabou are computationally infeasible. On these large networks (e.g., with thousands of neurons), FastLin is substantially faster than our implementation; however, the bounds it provides may be quite loose.

We compare the certified lower bound reported by our implementation of Algorithm 2 after 60 seconds of computation to the lower bound reported by FastLin on 100 instances arbitrarily picked from Fashion-MNIST; the results are presented in Table 3. The mean lower bound is reported for both methods, and we observe that on the models tested, Algorithm 2 is able to find a better lower bound on average, though it requires considerably more computation time.

Since the optimal bound may vary between instances, we also report the median ratio of the lower bounds obtained by the two methods on each individual instance. Here we see that FastLin may indeed be quite loose, as on a typical instance it achieves as low as 4% and only as high as 69% of the bound obtained by Algorithm 2.

Finally, we note that, when Algorithm 2 terminates by finding a decision boundary, if the projection onto that boundary is a true adversarial example, the lower bound is tight. In our experiments, there were few such examples — three on one fmnist model and one on another — however, in these cases, the lower bound obtained by FastLin was very loose, achieving 4-15% of the optimal bound on the first model, and only 0.8% of the optimal bound on the second. This suggests that while FastLin has been demonstrated to scale to large networks, one must be careful in its application, as there may be cases in which the bound it provides is too conservative.

5 Related Work

As mentioned, our work can be grouped with approaches for verifying neural networks that aim to check local robustness exactly (Katz et al., 2019, 2017; Jordan et al., 2019); the main difference is that we aim to avoid expensive constraint solving, at the price of incompleteness.

GeoCert (Jordan et al., 2019) is the closest work to ours; it aims to compute the exact point-wise robustness of deep neural networks for convex norms. Unlike our approach, GeoCert computes the largest ball, centered at an input point, within which the network is robust. Our experimental comparison with GeoCert shows that our approach scales much better. This is not surprising, as GeoCert relies on projections onto facets (polytopes), which are computed by solving a quadratic program with linear constraints. In contrast, our approach uses projections onto affine subspaces, which have a simpler, closed-form solution.

Reluplex (Katz et al., 2017) and its successor, Marabou (Katz et al., 2019), use SMT solving techniques to check local robustness and more general safety properties that can be expressed as linear constraints. Our experiments show that our approach performs better even in a setting that is advantageous to Reluplex/Marabou.

ERAN (Singh et al., 2019a) and its predecessor, AI2 (Gehr et al., 2018), rely on abstract interpretation (Cousot & Cousot, 1977) to automatically verify local robustness of neural networks against both norm-based and geometric input perturbations. As such, they use sound, conservative over-approximations to perform their analysis, which leads to false positives, i.e., robust inputs incorrectly classified as not robust. In comparison, all of the inputs flagged as Not Robust by our approach have an adversarial example at distance less than ϵ.

FastLin (Weng et al., 2018) exploits the special structure of ReLU networks to efficiently compute lower bounds on minimal adversarial distortions. This approach was extended to general activation functions in a tool called CROWN (Zhang et al., 2018). Although FastLin has been shown to be very scalable, our experiments indicate that the bounds it computes may be quite imprecise.

Recently, another, quite different approach has been proposed for robustness certification. Randomized Smoothing (Cohen et al., 2019; Lecuyer et al., 2018) is a post-processing technique that provides a stochastic robustness guarantee at inference time. This approach differs from ours in that it (1) modifies the predictions of the original model (increasing the complexity of making predictions), and (2) provides a probabilistic robustness guarantee that is quantified via a confidence bound. As such, it provides an alternative set of costs and benefits compared to our approach. Its complexity also differs from ours, as it depends less on the architecture of the model than on the number of samples required to post-process the model's output. We find that in our experimental setup, achieving the same probabilistic guarantee as the experiments described in (Cohen et al., 2019) (which require a large number of samples), Randomized Smoothing takes approximately 4.5 seconds per instance to make a certified prediction on each of the models tested in Section 4.1. Thus, for these models, our implementation is on average faster or comparable in performance.

6 Conclusion and Future Work

In this paper, we have presented a new approach to verifying the local robustness of networks with piecewise-linear activation functions. Compared to previous work, our approach relies neither on constraint solving nor on over-approximation, but rather on geometric projections. While most existing tools focus on linear constraints, and therefore on the ℓ1 and ℓ∞ norms, we provide an efficient, highly parallel implementation to certify ℓ2-robustness. Our implementation outperforms existing exact tools by up to two orders of magnitude on MNIST and Fashion-MNIST, while empirically maintaining the same precision under a time constraint.

Several possible improvements to our implementation are left as future work. The algorithm presented in Section 2 is norm-independent, but its performance for the ℓ2 norm relies on efficiently computing projections using a GPU. Extending our approach to other norms would require similar techniques, which already exist for the ℓ1 norm, for instance (Condat, 2015; Duchi et al., 2008). As our implementation runs mostly on a GPU, we should also be able to more aggressively fall back on parallel executions of LP solvers, which run on CPUs, when our analysis is inconclusive. Finally, we plan to add support for convolutional neural networks.

Acknowledgments

The work described in this paper has been supported by the Software Engineering Institute under its FFRDC Contract No. FA8702-15-D-0002 with the U.S. Department of Defense, Bosch Corporation, an NVIDIA GPU grant, NSF Award CNS-1801391, a Google Faculty Fellowship, and the Alfred P. Sloan Foundation.

References

  • Abadi et al. (2015) Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and Zheng, X. TensorFlow: Large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/, 2015.
  • Balunovic et al. (2019) Balunovic, M., Baader, M., Singh, G., Gehr, T., and Vechev, M. Certifying geometric robustness of neural networks. In Advances in Neural Information Processing Systems (NIPS). 2019.
  • Blum et al. (2020) Blum, A., Hopcroft, J., and Kannan, R. Foundations of Data Science. Cambridge University Press, 2020.
  • Cheng et al. (2017) Cheng, C.-H., Nührenberg, G., and Ruess, H. Maximum resilience of artificial neural networks. In Automated Technology for Verification and Analysis (ATVA), 2017.
  • Chollet et al. (2015) Chollet, F. et al. Keras. https://keras.io, 2015.
  • Cohen et al. (2019) Cohen, J., Rosenfeld, E., and Kolter, Z. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning (ICML), 2019.
  • Condat (2015) Condat, L. Fast projection onto the simplex and the ℓ1 ball. Mathematical Programming, 158, 09 2015.
  • Cousot & Cousot (1977) Cousot, P. and Cousot, R. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Symposium on Principles of Programming Languages (POPL), 1977.
  • Duchi et al. (2008) Duchi, J., Shalev-Shwartz, S., Singer, Y., and Chandra, T. Efficient projections onto the ℓ1-ball for learning in high dimensions. International Conference on Machine Learning (ICML), 2008.
  • Dutta et al. (2018) Dutta, S., Jha, S., Sankaranarayanan, S., and Tiwari, A. Output range analysis for deep feedforward neural networks. In NASA Formal Methods Symposium (NFM), 2018.
  • Ehlers (2017) Ehlers, R. Formal verification of piece-wise linear feed-forward neural networks. In Conference on Logic in Computer Science (LICS), 2017.
  • Fischetti & Jo (2018) Fischetti, M. and Jo, J. Deep neural networks and mixed integer linear optimization. Constraints, 23(3):296–309, Jul 2018.
  • Gehr et al. (2018) Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., and Vechev, M. AI2: Safety and robustness certification of neural networks with abstract interpretation. In Symposium on Security and Privacy (S&P), 2018.
  • Jordan et al. (2019) Jordan, M., Lewis, J., and Dimakis, A. G. Provable certificates for adversarial examples: Fitting a ball in the union of polytopes. In Advances in Neural Information Processing Systems (NIPS), 2019.
  • Katz et al. (2017) Katz, G., Barrett, C. W., Dill, D. L., Julian, K., and Kochenderfer, M. J. Reluplex: An efficient SMT solver for verifying deep neural networks. In Computer Aided Verification (CAV), 2017.
  • Katz et al. (2019) Katz, G., Huang, D. A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljić, A., Dill, D. L., Kochenderfer, M. J., and Barrett, C. The marabou framework for verification and analysis of deep neural networks. In Computer Aided Verification (CAV), 2019.
  • Lecuyer et al. (2018) Lecuyer, M., Atlidakis, V., Geambasu, R., Hsu, D., and Jana, S. Certified robustness to adversarial examples with differential privacy. In Symposium on Security and Privacy (S&P), 2018.
  • Madry et al. (2018) Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations (ICLR), 2018.
  • Papernot et al. (2016) Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z. B., and Swami, A. The limitations of deep learning in adversarial settings. In European Symposium on Security and Privacy (EuroS&P), 2016.
  • Singh et al. (2019a) Singh, G., Ganvir, R., Püschel, M., and Vechev, M. Beyond the single neuron convex barrier for neural network certification. In Advances in Neural Information Processing Systems (NIPS). 2019a.
  • Singh et al. (2019b) Singh, G., Gehr, T., Püschel, M., and Vechev, M. Robustness certification with refinement. In International Conference on Learning Representations (ICLR), 2019b.
  • Szegedy et al. (2014) Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I. J., and Fergus, R. Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR), 2014.
  • Tjeng & Tedrake (2017) Tjeng, V. and Tedrake, R. Verifying neural networks with mixed integer programming. CoRR, abs/1711.07356, 2017.
  • Wang et al. (2018) Wang, S., Pei, K., Whitehouse, J., Yang, J., and Jana, S. Formal security analysis of neural networks using symbolic intervals. In USENIX Conference on Security Symposium, 2018.
  • Weng et al. (2018) Weng, L., Zhang, H., Chen, H., Song, Z., Hsieh, C.-J., Daniel, L., Boning, D., and Dhillon, I. Towards fast computation of certified robustness for ReLU networks. In International Conference on Machine Learning (ICML), 2018.
  • Wong & Kolter (2018) Wong, E. and Kolter, J. Z. Provable defenses against adversarial examples via the convex outer adversarial polytope. In International Conference on Machine Learning (ICML), 2018.
  • Zhang et al. (2018) Zhang, H., Weng, T.-W., Chen, P.-Y., Hsieh, C.-J., and Daniel, L. Efficient neural network robustness certification with general activation functions. In Advances in Neural Information Processing Systems (NIPS). 2018.