Efficient Exact Verification of Binarized Neural Networks

05/07/2020
by   Kai Jia, et al.

Concerned with the reliability of neural networks, researchers have developed verification techniques to prove their robustness. Most verifiers work with real-valued networks. Unfortunately, the exact (complete and sound) verifiers face scalability challenges and provide no correctness guarantees due to floating point errors. We argue that Binarized Neural Networks (BNNs) provide comparable robustness and allow exact and significantly more efficient verification. We present a new system, EEV, for efficient and exact verification of BNNs. EEV consists of two parts: (i) a novel SAT solver that speeds up BNN verification by natively handling the reified cardinality constraints arising in BNN encodings; and (ii) strategies to train solver-friendly robust BNNs by inducing balanced layer-wise sparsity and low cardinality bounds, and adaptively cancelling the gradients. We demonstrate the effectiveness of EEV by presenting the first exact verification results for $\ell_\infty$-bounded adversarial robustness of nontrivial convolutional BNNs on the MNIST and CIFAR10 datasets. Compared to exact verification of real-valued networks of the same architectures on the same tasks, EEV verifies BNNs hundreds to thousands of times faster, while delivering comparable verifiable accuracy in most cases.




1 Introduction

Deep learning has achieved impressive success in many application fields including image understanding, speech recognition, natural language processing, and game playing (Goodfellow et al., 2016). While the intrinsic complexity of deep neural networks (DNNs) enables them to learn difficult tasks, this complexity also hinders the understanding of their behavior. Moreover, the existence of adversarial examples (Szegedy et al., 2014) directly exposes the fragility of DNNs. Such fragility raises concerns for applying DNNs in safety-critical environments such as autonomous driving or aircraft control.

We present new techniques and a new system, EEV, for exact verification of binarized neural networks (BNNs). EEV incorporates a novel SAT solver codesigned with BNN training strategies. We deploy EEV to verify adversarial robustness against input perturbations bounded by the $\ell_\infty$ norm. Compared to the fastest previously existing exact verification methods for this task, including methods for either binarized or real-valued DNNs, our results show that, on our MNIST and CIFAR10 benchmarks, our techniques improve verification performance by factors ranging from ten to ten thousand, depending on the dataset and network architecture. This paper makes the following contributions:

                     Mean Solve Time (s)  PGD Accuracy  Verifiable Accuracy
MNIST    EEV                0.0009          95.35%        84.46%
         Xiao et al.        0.49            95.13%        94.33%
MNIST    EEV                0.0023          90.97%        36.41%
         Xiao et al.        2.78            92.05%        80.68%
CIFAR10  EEV                0.0019          39.47%        13.48%
         Xiao et al.        13.50           49.92%        45.93%
CIFAR10  EEV                0.0017          26.78%        10.79%
         Xiao et al.        22.33           26.78%        20.27%
  • EEV rows give exact verification results with our system; Xiao et al. rows give exact verification results for real-valued networks, with data taken from (Xiao et al., 2019). Both use the conv-small network architecture (binarized for EEV); the two row pairs per dataset correspond to a smaller and a larger perturbation bound. See Table 4 for more results. While PGD accuracy is comparable, verifiable accuracy is significantly lower for EEV, reflecting the lack of an effective robust training algorithm for BNNs. See Section 6.2 for more discussion.

Table 1: Preview of Results for MNIST and CIFAR10
  1. We incorporate native support for reified cardinality constraints into a SAT solver, improving the performance of BNN verification by more than a factor of one hundred compared to an unmodified SAT solver (Section 4.4).

  2. We identify that sparse weights induced by ternarization (Narodytska et al., 2020) cause unbalanced sparsity between the layers of convolutional networks. While ternarization achieves sufficient overall sparsity, our results show that it also induces high verification complexity. We propose a new strategy (BinMask), which produces more balanced sparsity. Our results show that BinMask improves the performance of our verification system by factors ranging from one hundred to ten thousand compared to its performance on ternarized networks (Section 5.1).

  3. We further reduce verification complexity by introducing a cardinality bound decay regularizer with a tunable tradeoff between accuracy and solving time, leading to an additional speedup of up to several thousand times (Section 5.2).

  4. We present the first exact verification of $\ell_\infty$-bounded adversarial robustness of convolutional BNNs on CIFAR10 (Table 1).

  5. We present experimental results comparing EEV against the best previously existing exact robustness verification systems (for either binarized or real-valued networks). These results show that, on the MNIST and CIFAR10 benchmarks, our system verifies the exact robustness of networks on given inputs ten to ten thousand times faster than these previous systems.

2 Background and Related Work

We formulate the problem of DNN exact verification (a.k.a. complete verification) as checking whether a DNN satisfies given properties, where the answer should be either guaranteed satisfaction or a counterexample that violates the properties. Researchers have developed a range of techniques for verifying various properties of DNNs, mostly for real-valued ReLU networks. They are largely based on SMT solvers (Scheibler et al., 2015; Huang et al., 2017; Katz et al., 2017; Ehlers, 2017) or Mixed Integer Linear Programming (MILP) (Lomuscio and Maganti, 2017; Cheng et al., 2017; Fischetti and Jo, 2018; Dutta et al., 2018; Tjeng et al., 2019; Yang and Rinard, 2019). Another line of research delivers guaranteed robustness via incomplete verification (a.k.a. certification) that may fail to prove or disprove the desired properties in certain cases (Wong and Kolter, 2017; Weng et al., 2018; Gehr et al., 2018; Zhang et al., 2018; Raghunathan et al., 2018; Dvijotham et al., 2018; Mirman et al., 2018; Singh et al., 2019). This research often explores the idea of over-approximation to improve scalability, where the verifier considers a relaxed form of the actual computation in a DNN. In this paper we focus on exact verification.

Binarized neural networks (BNNs) (Hubara et al., 2016) constrain activations and weights to be binary, resulting in significant speed gains and energy savings during inference (Hubara et al., 2016; Rastegari et al., 2016; Moss et al., 2017) with tolerable accuracy degradation (Darabi et al., 2018). Moreover, binarization facilitates analysis because the combinatorial nature of BNNs enables close interaction with logical reasoning, allowing a rich set of properties to be encoded in conjunctive normal form (CNF). Examples include queries on adversarial robustness, trojan attacks, fairness, network equivalence, and model counting (Narodytska et al., 2018; Baluta et al., 2019). The exact SAT encoding of BNNs is quite straightforward, compared to MILP methods, which usually need to estimate the bounds of hidden neurons during verification for a given input. Moreover, it has been shown that exact verification of real-valued neural networks suffers from numerical error, present in both the verifier and the inference implementation, that allows adversarial examples to be constructed for networks with verified robustness (Jia and Rinard, 2020), whereas a BNN satisfies its verified properties on any correct inference implementation. Analysis techniques for BNNs include efficient encoding (Shih et al., 2019) and exploiting decomposability between neurons or layers (Cheng et al., 2018; Khalil et al., 2019).

Adversarial attack and defense of DNNs is a developing field where most research focuses on real-valued networks (Carlini and Wagner, 2017; Athalye et al., 2018; Madry et al., 2018; Kannan et al., 2018; Tramer et al., 2020). BNNs can also be attacked by gradient-based adversaries (Galloway et al., 2018) or specialized solving algorithms (Khalil et al., 2019).

Until recently, exact verification of DNNs was too computationally expensive to scale beyond a few hundred neurons. Tjeng et al. (2019) present the first exact verification results for convolutional neural networks (CNNs) on MNIST by tightening the MILP formulation. A subsequent improvement induces stability of ReLU neurons during training (Xiao et al., 2019). Narodytska et al. (2020) verify a nontrivial binarized multilayer perceptron on MNIST.

3 Preliminaries

3.1 The Boolean Satisfiability Problem (SAT)

SAT is the problem of deciding whether there exists a satisfying variable assignment for a given Boolean expression (Biere et al., 2009). We consider Boolean expressions in conjunctive normal form (CNF) defined over a set of Boolean variables $\{v_1, \dots, v_n\}$. A CNF formula is a conjunction of a set of clauses $c_1 \wedge \dots \wedge c_m$, where each clause $c_i$ is a disjunction of literals $l_1 \vee \dots \vee l_k$, and a literal is either a variable or its negation: $l = v_j$ or $l = \neg v_j$.

Despite the well-known fact that 3-SAT is NP-complete (Cook, 1971), efficient heuristics have been developed that enable SAT solvers to scale to industrial problems (Balyo et al., 2017).

3.2 Binarized Neural Networks (BNNs)

Binarization of neural networks is a special case of network quantization, proposed as a method to reduce the computation burden and speed up inference and possibly also training (Rastegari et al., 2016; Zhou et al., 2016; Jacob et al., 2018). We follow the framework of (Hubara et al., 2016), but modify the activation values from $\{-1, +1\}$ to $\{0, 1\}$.

The basic building block of a BNN is a linear-BatchNorm-binarize operation that maps an input tensor $x \in \{0, 1\}^n$ to an output tensor $y \in \{0, 1\}^m$ with a weight parameter $W \in \{-1, +1\}^{m \times n}$:

  $y = \mathrm{binarize}(\mathrm{BatchNorm}(W x)), \qquad \mathrm{binarize}(z) = \mathbb{1}[z \ge 0]$   (1)

Note that the use of $\{0, 1\}$ rather than $\{-1, +1\}$ for activations does not impact network capacity, because the change is a linear transformation on the activations that can be cancelled by the following batch normalization. Besides simplifying the conversion to SAT formulas, using a $\{0, 1\}$ encoding also makes zero padding for convolutional layers trivial.

Although the binarize function has zero gradient almost everywhere, we can still train a BNN with gradient-based optimizers by adopting the straight-through estimator (Bengio et al., 2013), which treats the binarize function as an identity function during backpropagation.
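
As an illustration, here is a minimal PyTorch sketch of the $\{0, 1\}$ binarization with a straight-through estimator; the class name and the thresholding exactly at zero are our assumptions, not the authors' code:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Map reals to {0, 1}: 1 if x >= 0, else 0.
        return (x >= 0).to(x.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat binarize as identity.
        return grad_output

x = torch.randn(4, requires_grad=True)
y = BinarizeSTE.apply(x)
y.sum().backward()
print(y, x.grad)  # activations in {0, 1}; gradient passes through unchanged
```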

First layer:

The first layer of a BNN is usually applied to float or 8-bit fixed point inputs; since the input has many fewer channels than later layers, this is not a major performance issue. However, encoding floating-point or integer arithmetic in SAT incurs high complexity, so we add an extra quantization layer to process the input:

  $\hat{x} = s \cdot \mathrm{round}(x / s)$   (2)

where $x$ is the real-valued input, $\hat{x}$ is the quantized input to be fed into the BNN, and $s$ is the quantization step size, which can be set to $1/255$ for emulating 8-bit fixed point values, or to a value slightly greater than $2\epsilon$ for adversarial training with $\ell_\infty$ norm bounded by $\epsilon$ (see Section 6.1).
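
A corresponding sketch of the input quantization layer, again trained with a straight-through gradient; the rounding-based quantizer matches the reconstructed form of (2) and is our assumption:

```python
import torch

class QuantizeSTE(torch.autograd.Function):
    """Uniform quantization x -> s * round(x / s), with an identity
    (straight-through) backward pass so the layer remains trainable."""
    @staticmethod
    def forward(ctx, x, step):
        return torch.round(x / step) * step

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # no gradient for the step size

x = torch.rand(3)
print(QuantizeSTE.apply(x, 1.0 / 255))  # emulate 8-bit fixed point inputs
```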

Last layer:

We consider the layer before softmax as the last layer of the BNN, whose output can be interpreted as classification scores. We remove the binarize function in (1) to obtain real-valued scores. To enable direct conversion into SAT, we also restrict the running variance and scale parameter of the BatchNorm in the last layer to be scalars computed on the whole feature map rather than per-channel statistics.

4 Combinatorial Analysis of BNNs

4.1 Encoding BNNs with Reified Cardinality Constraints

We discuss techniques for encoding a trained BNN as a SAT formula, focusing on the details of encoding a single layer. During inference, Batch Normalization becomes a linear transformation (Ioffe and Szegedy, 2015): $\mathrm{BatchNorm}(z) = \gamma (z - \mu) / \sqrt{\sigma^2 + \epsilon} + \beta = k z + b$, with $k = \gamma / \sqrt{\sigma^2 + \epsilon}$ and $b = \beta - k \mu$.

With $(k, b)$ being fixed parameters, we can rewrite (1) as the following, where $x$ is the layer input and $y$ is the layer output, with $0$ interpreted as FALSE and $1$ interpreted as TRUE:

  $y_j = \left( k_j \, (W_j \cdot x) + b_j \ge 0 \right)$   (3)

To convert (3) into a Boolean expression, we consider the simple case of a dot product $w \cdot x$ with $w \in \{-1, +1\}^n$ and $x \in \{0, 1\}^n$, which easily extends to convolutional or fully connected layers. If $w_i = 1$, we have $w_i x_i = x_i$; and if $w_i = -1$, we rewrite $w_i x_i = (1 - x_i) - 1 = \neg x_i - 1$. Therefore $w \cdot x = \sum_{i=1}^{n} l_i - n_{\mathrm{neg}}$, where $l_i = x_i$ if $w_i = 1$, $l_i = \neg x_i$ if $w_i = -1$, and $n_{\mathrm{neg}}$ is the number of negative weights.

Now (3) can be rewritten as a reified cardinality constraint, where each $x_i$ acts as $x_i$ or $\neg x_i$ according to the sign of $k w_i$, and the bound $C$ can be rounded to an integer accordingly:

  $y \leftrightarrow \left( \sum_{i=1}^{n} l_i \le C \right)$   (4)

Cardinality constraints belong to a more general class called pseudo-Boolean constraints, which allow literals to be multiplied by integer coefficients. They are usually converted to CNF formulas by encoders such as sequential counters (Sinz, 2005; Hölldobler et al., 2012) or binary decision diagrams (Abío et al., 2011). Rather than running a standard SAT solver on the encoded formula, we extend the SAT solver to handle such constraints natively.
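
The following sketch (our illustration, not the EEV implementation) converts one output bit of a layer, with fused BatchNorm parameters $k$ and $b$, into a reified cardinality constraint of the form used in (4):

```python
import math

def reify_neuron(weights, k, b):
    """Encode one output bit y <-> (k * (w . x) + b >= 0) as the reified
    cardinality constraint y <-> (#true literals <= C) of eq. (4).
    `weights` has entries in {-1, +1}; literal +i means x_i, -i means
    NOT x_i (variables are 1-indexed).  Assumes k != 0."""
    n = len(weights)
    # w . x = (#true literals) - n_neg, with l_i = x_i if w_i = +1 and
    # l_i = NOT x_i if w_i = -1, where n_neg = #{i : w_i = -1}.
    lits = [i if w == 1 else -i for i, w in enumerate(weights, start=1)]
    n_neg = sum(1 for w in weights if w == -1)
    t = n_neg - b / k                  # real threshold on the literal count
    if k > 0:                          # y <-> (#true lits >= ceil(t)); negate
        lits = [-l for l in lits]      # every literal to obtain the <= form
        bound = n - math.ceil(t)
    else:                              # k < 0 flips the inequality directly
        bound = math.floor(t)
    return lits, bound

# y <-> (x_1 - x_2 >= 0) becomes y <-> (NOT x_1 + x_2 <= 1)
print(reify_neuron([1, -1], k=1.0, b=0.0))   # ([-1, 2], 1)
```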

4.2 Encoding for Adversarial Attacks

Input Perturbation Encoding:

We discuss how to constrain the search space within an $\ell_\infty$ bound $\epsilon$ around a given input $x_0$. We focus on a single scalar in the input tensor for simplicity, but the encoding applies to the whole tensor under the $\ell_\infty$ norm. Recall that we quantize an input with step size $s$ in (2), allowing us to rewrite the first layer as $k\,(W \hat{x}) + b = (k s)\,(W q) + b$, where $\hat{x} = s q$ for an integer-valued $q$. This formulation suggests that we can fold $s$ into the Batch Normalization coefficient $k$ in (4) during inference so that the input is an integer. For adversarial attacks on $x_0$ with $\ell_\infty$ bound $\epsilon$, we encode the attack space as $q = q_{\mathrm{lo}} + \sum_{i=1}^{m} b_i$, where $q_{\mathrm{lo}}$ and $q_{\mathrm{hi}}$ are the bounds of the allowed input values, $m = q_{\mathrm{hi}} - q_{\mathrm{lo}}$ is the size of the possible range, and the $b_i$ are Boolean variables whose sum corresponds to the value of the adversarial input. We further restrict the search space by enforcing the thermometer encoding (Buckman et al., 2018) on $b$, via additional clauses $b_{i+1} \Rightarrow b_i$ for $1 \le i < m$.
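
A small sketch of the thermometer clauses; the function name and the clause representation (lists of signed integers) are our conventions:

```python
def thermometer_clauses(bit_vars):
    """Clauses enforcing b_{i+1} -> b_i (as NOT b_{i+1} OR b_i), so that
    assignments of the b_i form a thermometer pattern 1...10...0 and the
    number of true bits uniquely determines the adversarial pixel value
    q_lo + (#true bits).  bit_vars are SAT variable indices."""
    return [[-bit_vars[i + 1], bit_vars[i]] for i in range(len(bit_vars) - 1)]

print(thermometer_clauses([5, 6, 7]))   # [[-6, 5], [-7, 6]]
```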

Untargeted Attack Encoding:

Assume there are $c$ output classes and $t$ is the target class, such that the adversary tries to cause the network to output a classification other than $t$. Let $x$ and $y$ denote the input and output of the last layer respectively. Similar to the analysis in Section 4.1, we can rewrite each score difference $y_i - y_t$ as a weighted sum of input literals; recall that we have required the BatchNorm scale of the last layer to be a scalar. To ensure that the network makes a wrong prediction, we add a clause $\bigvee_{i \ne t} d_i$, where $d_i$ is a decision variable indicating whether the confidence of class $i$ exceeds that of class $t$ by a given margin. Note that $d_i$ is also a reified cardinality constraint, except that the weight on some $x_j$ may be up to $2$ in magnitude, which can be handled by duplicating the literal. We use a fixed margin in all our experiments.

Given a BNN and an input image, a formula can be obtained by encoding the input constraints, the BNN itself, and the output constraints using the techniques outlined above. If a SAT solver finds a solution to the formula, an adversarial input can be recovered from the solution. Otherwise the network is proven to be robust for this input.
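
To make the end-to-end flow concrete, here is a hypothetical top-level driver, a sketch only: it assumes the three clause lists come from the encodings above, and it uses the solver interface of the python-sat package. MiniSatCS itself handles the reified cardinality constraints natively (Sections 4.3 and 4.4) instead of expanding them into plain CNF clauses as this sketch requires.

```python
from pysat.solvers import Glucose3  # pip install python-sat

def verify_robustness(input_clauses, bnn_clauses, output_clauses):
    """Return a satisfying assignment (decodable into an adversarial
    input) if the attack formula is satisfiable, or None if the network
    is provably robust for this input."""
    solver = Glucose3()
    for clause in input_clauses + bnn_clauses + output_clauses:
        solver.add_clause(clause)
    if solver.solve():
        return solver.get_model()   # thermometer bits -> adversarial image
    return None                     # UNSAT: robust within the bound
```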

4.3 Extending the CDCL Algorithm

Modern SAT solvers typically utilize the conflict-driven clause learning (CDCL) algorithm (Marques-Silva et al., 2009), which tries to reduce the search space by learning new clauses from conflicts. There are three key procedures in this algorithm:

  1. Branching: Pick an undecided variable and assign a value to it. The order of branching is usually decided by heuristics like VSIDS (Moskewicz et al., 2001).

  2. Propagation: Given the current branching and propagation decisions, try to infer the values of undecided variables. Such inference is based on the crucial concept of a unit clause, a clause with only one unassigned literal: if there is a clause $l_1 \vee \dots \vee l_k$ in the clause database and $l_1, \dots, l_{k-1}$ are all known to be false, then $l_k$ must be true for the whole clause to be satisfied.

  3. Clause Learning: When a conflict is encountered, a new clause is constructed and inserted into the clause database by summarizing the reasons that led to the conflict. The learning is performed on the implication graph, whose nodes correspond to assignments of variables. A node $v$ has incoming edges from the nodes whose assignments imply $v$ (i.e., there is a clause containing exactly the literal of $v$ together with literals falsified by those assignments). Branching variables have no incoming edges in the graph. Starting from a special node representing the conflict, the graph is traversed in reverse order to enumerate the branching variables that led to the conflict. The disjunction of the negations of those variables is added to the set of learned clauses.

The propagation and clause learning processes can be generalized to handle clauses not in disjunctive form, as long as each clause permits inferring values of undecided variables. This idea has been explored in the literature to extend SAT solvers to domain-specific problems (Soos et al., 2009; Liffiton and Maglalang, 2012; Ganesh et al., 2012).

Given a reified cardinality constraint $y \leftrightarrow (\sum_i l_i \le C)$ over $n$ literals, there are two types of propagation (see the sketch after this list):

  • Operand-inferring: If $y$ is known and enough of the $l_i$ are known, then the remaining $l_i$ can be inferred. For example, if $y$ is known to be true and there are already $C$ literals in $l$ known to be true, then the other literals must all be false.

  • Target-inferring: If enough of the $l_i$ are known, then $y$ can be inferred. For example, if the number of false literals in $l$ reaches $n - C$, then $y$ can be inferred to be true.
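
A minimal sketch of these two propagation rules, assuming the counter-based bookkeeping described in Section 4.4 (the solver maintains the counts $t$ and $f$ incrementally rather than recomputing them):

```python
def propagate(n, C, t, f, y):
    """Propagation for the reified cardinality constraint
    y <-> (#true literals among l_1..l_n <= C), given the counters
    t = #literals known true, f = #literals known false, and
    y in {True, False, None} (None = still unassigned).
    Returns (inferred value of y, value forced on all unknown literals)."""
    if y is None:                      # target-inferring
        if t > C:
            return False, None         # bound already exceeded
        if f >= n - C:
            return True, None          # bound can no longer be exceeded
    if y is True and t == C:           # operand-inferring
        return None, False             # one more true literal would break <=
    if y is False and f == n - C - 1:
        return None, True              # all remaining literals must be true
    return None, None                  # nothing can be inferred yet

print(propagate(n=5, C=2, t=2, f=0, y=True))   # (None, False)
print(propagate(n=5, C=2, t=0, f=3, y=None))   # (True, None)
```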

4.4 MiniSatCS: An Efficient Implementation

Figure 1: Performance comparison of SAT solvers. The running times are collected by applying each solver to find adversarial inputs for the MNIST-MLP network described in Section 6. Our MNIST-MLP network achieves 97.4% test accuracy, compared to the 95.2% accuracy reported by (Narodytska et al., 2020). We evaluate MiniSatCS, MiniSat, and Z3 on a fixed random subset containing 40 test examples, and also present MiniSatCS-full for the performance evaluated on the complete MNIST test set.

We present MiniSatCS, a novel SAT solver with native support for reified cardinality constraints, based on MiniSat 2.2 (Eén and Sörensson, 2003). An important design element in modern SAT solvers is the use of watched literals (Moskewicz et al., 2001), which allows rapid detection of unit clauses. For reified cardinality constraints, we keep similar watchers for each variable and maintain counters for the current number of known true and false literals in each constraint, so the situations that allow propagation can be detected without scanning the whole constraint every time a variable changes. We use random polarity and turn off phase saving (Pipatsrisawat and Darwiche, 2007) in the solver, since this is faster for BNN verification.

The SAT formula encoding a BNN is constant across input values, in contrast with MILP-based methods that need to estimate the bounds of hidden neurons for each input. We have therefore designed a model cache mechanism in MiniSatCS that reuses the set of formulas corresponding to the BNN across test cases, reducing model build time by a factor of ten for the large networks.

We compare the performance of MiniSatCS on the MNIST-MLP network against two other solvers: the unmodified MiniSat 2.2 using a sequential counter encoding for reified cardinality constraints, and the SMT solver Z3 (De Moura and Bjørner, 2008), which has native pseudo-Boolean support. Figure 1 shows that our system is significantly more efficient than previous ones on BNN verification, and hundreds of times faster than the prior state-of-the-art result on the same task (Narodytska et al., 2020).

5 Training Solver-friendly BNNs

5.1 BinMask: Balanced Weight Sparsifying

                               MNIST              CIFAR10
Mean Solve Time (s)   Ternary  964.8776           381.3879
                      BinMask  0.0004             0.3082
Max Solve Time (s)    Ternary  3600.017           3600.028
                      BinMask  0.004              2.870
Test Accuracy         Ternary  97.56%             52.76%
                      BinMask  97.53%             50.09%
Total Sparsity        Ternary  80%                86%
                      BinMask  83%                80%
Layer-wise Sparsity   Ternary  16% 38% 83% 30%    14% 45% 88% 30%
                      BinMask  93% 89% 83% 92%    95% 91% 79% 93%
Verifiable Accuracy   Ternary  0%                 0%
                      BinMask  90%                0%
  • We applied the MiniSatCS solver to search for adversarial inputs for an undefended small-conv network under a time limit of 3600 seconds; input quantization follows Section 6. Solve time is evaluated on a fixed random subset containing 40 examples from the test set. For the Ternary network, the threshold and the regularization coefficients are chosen separately for MNIST and CIFAR10 to achieve total sparsity similar to BinMask. More details on experimental settings are described in Section 6.

Table 2: Comparing BinMask and Ternary Weights

It has been observed that sparse weights facilitate verification of neural networks (Tjeng et al., 2019; Xiao et al., 2019; Narodytska et al., 2020). A common sparsifying method for BNNs (Narodytska et al., 2020) is to use ternary weights, i.e., setting $w = 0$ when $|w|$ falls below a threshold $\delta$. However this technique suffers from two drawbacks. First, the threshold $\delta$ and the penalty coefficient for regularization are coupled parameters that need tuning. Second, for convolutional networks, the sparsity of convolutional layers is usually lower than that of fully connected layers, which has also been observed when pruning real-valued networks (Han et al., 2015). Such unbalanced sparsity complicates verification because convolutional layers bear most of the computational burden, so their low sparsity limits the verification speedup. While it is possible to prune each layer at a fixed rate and retrain the network iteratively (Frankle and Carbin, 2019), such methods are especially costly when we consider adversarial training.

We hypothesize that this imbalance in BNNs is caused by a uniform setting of $\delta$ together with the coupled optimization of weight sparsity and weight values. The zero in the ternary weights creates a gap between $-\delta$ and $\delta$, requiring a weight to travel through the zero zone even when a sign change suffices. Since the convolutional and fully connected layers have different dynamics during training but $\delta$ is not tuned layer-wise, their sparsities diverge as a result. We therefore propose to decouple weight value and weight sparsity by introducing a binary mask applied to the weights. More formally, for each weight $w$ we introduce a new mask weight $m$ that is optimized independently of $w$ and define

  $w_{\mathrm{eff}} = \mathrm{sign}(w) \cdot \mathbb{1}[m \ge 0]$   (5)
We call this method BinMask and summarize an empirical comparison with ternary weights in Table 2. Although the two methods achieve similar total sparsity and similar accuracy, the sparsity of individual layers is more balanced with BinMask, and consequently its verification is thousands of times faster.
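
A sketch of BinMask in PyTorch; the exact thresholding of the mask weight is an assumption reconstructed from the description above, and the straight-through estimator lets gradients flow into both $w$ and $m$:

```python
import torch

class SignSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # straight-through

def binmask_weight(w, m):
    """Effective BinMask weight: sign(w) gated by a {0, 1} mask derived
    from an independently optimized mask weight m (eq. 5, as we
    reconstruct it).  Masked entries behave as pruned connections."""
    w_bin = SignSTE.apply(w)            # weight value in {-1, +1}
    mask = (SignSTE.apply(m) + 1) / 2   # mask in {0, 1}
    return w_bin * mask
```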

5.2 Cardinality Bound Decay

                                   MNIST                          CIFAR10
Mean Solve Time (s)      2200.503    1332.398   0.318     3.343       3.642      0.048
Max Solve Time (s)       3600.014    3600.014   7.595     93.188      54.192     0.127
Test Accuracy            99.01%      98.73%     97.05%    53.51%      48.17%     42.35%
Provable Accuracy
  (Timeout%)             5% (60%)    5% (30%)   25% (0%)  0% (0%)     0% (0%)    2% (0%)
Mean / Max
  Cardinality Bound      235.4/561.6 8.1/61.9   3.4/11.6  334.7/771.4 14.5/66.7  5.0/33.9
First Layer /
  Total Sparsity         84%/72%     84%/83%    88%/88%   94%/68%     94%/86%    95%/81%
  • These results are obtained from large-conv networks trained adversarially with $\epsilon = 0.3$ for MNIST and $\epsilon = 8/255$ for CIFAR10; the three columns per dataset correspond to increasing CBD strength $\lambda$. As the cardinality bounds decrease, the solve times decrease significantly, with decreasing but still comparable accuracy. We only evaluate these networks on a fixed random subset containing 40 images from the test set due to limited computing resources, with a timeout of 3600 seconds. Details of adversarial training are described in Section 6.

Table 3: Effect of Cardinality Bound Decay

While BinMask alone sparsifies the small network enough for efficient verification, it is not sufficient for larger networks. To further reduce verification complexity, we revisit the reified cardinality constraint $y \leftrightarrow (\sum_i l_i \le C)$ and note the following facts:

  1. If the constraint is encoded into CNF using sequential counters (Sinz, 2005), which introduce auxiliary variables encoding the partial sums of the $l_i$, then $O(nC)$ variables and clauses are needed for the encoding. Thus a smaller $C$ produces a simpler encoding.

  2. MiniSatCS can infer $y$ to be false once the number of true literals in $l$ exceeds $C$, and a smaller $C$ increases the likelihood of this inference.

  3. If the literals $l_i$ are drawn from independent Bernoulli distributions with probability $p$, then the entropy of $y$ is a symmetrical concave function of the bound, maximized when $C$ equals the median of $\sum_i l_i$. Therefore, the further $C$ deviates from the median, the more predictable $y$ becomes.

We are thus motivated to regularize the bounds in the reified cardinality constraints to reduce verification complexity. We propose a Cardinality Bound Decay (CBD) loss to achieve this goal, adding a penalty of strength $\lambda$ on the bound derived from the bias term in (4). We also introduce a parameter $C_0$ so that bounds below $C_0$ are not penalized; we use a fixed $C_0$ in all of our experiments. Meaningful settings of $C_0$ are non-negative, because if a bound drops below zero, then $y$ becomes constantly true or false and the bound need not be penalized anyway. The CBD loss term is formally defined as:

  $\mathcal{L}_{\mathrm{CBD}} = \lambda \sum_{i} \max\left( \min(C_i,\ n_i - C_i) - C_0,\ 0 \right)$   (6)

It is worth noting that since $\sum_i l_i \le C$ is equivalent to $\sum_i \neg l_i \ge n - C$, we consider the value of $\min(C, n - C)$ rather than $C$ in this loss. Table 3 summarizes our empirical evaluation of the CBD loss. The proposed method effectively reduces the bounds in cardinality constraints and speeds up verification significantly, and the parameter $\lambda$ can be tuned to control the tradeoff between accuracy and verification speed. Notably, although CBD also induces weight sparsity, it is not a substitute for weight sparsification, as can be observed from the last two experiments on CIFAR10.
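
A sketch of the CBD penalty as reconstructed in (6); the tensor layout (one entry per reified cardinality constraint across the network) is our assumption for illustration:

```python
import torch

def cbd_loss(bounds, n_lits, lam, c0):
    """Cardinality Bound Decay penalty: push the effective bound
    min(C, n - C) of each constraint toward the floor c0; bounds at or
    below c0 incur no penalty.  `bounds` and `n_lits` are 1-D tensors
    with one entry per constraint."""
    effective = torch.minimum(bounds, n_lits - bounds)
    return lam * torch.clamp(effective - c0, min=0).sum()

print(cbd_loss(torch.tensor([3., 40.]), torch.tensor([64., 64.]),
               lam=1e-3, c0=5.0))   # only the second constraint is penalized
```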

6 Experiments

6.1 Experimental Environment and Methods

We conduct our experiments on a workstation equipped with two GPUs (NVIDIA Titan RTX and NVIDIA GeForce RTX 2070 SUPER), 128 GiB of RAM, and an AMD Ryzen Threadripper 2970WX 24-core processor. We use the PyTorch framework (Paszke et al., 2019) to train all the networks. We evaluate our methods using adversarial attack as an example task on two datasets: MNIST (LeCun et al., 1998) and CIFAR10 (Krizhevsky et al., 2009). Unless stated otherwise, we limit the execution time of the MiniSatCS solver to 120 seconds per input image, as in (Xiao et al., 2019).

Network Architecture:

We adopt three network architectures from the literature for the evaluation of EEV:

  1. MNIST-MLP: a binarized multilayer perceptron with the architecture from (Narodytska et al., 2020). It is trained with input quantization and sparsified by BinMask.

  2. Small-conv: a network with two convolutional layers of 16 and 32 channels, followed by two fully connected layers with 100 and 10 output units. The convolutional layers have 4×4 filters and stride 2 with a padding of 1. The architecture is the same as conv-small in (Xiao et al., 2019), except that we binarize the network.

  3. Large-conv: a network extending small-conv, where each convolutional layer is preceded by an additional 3×3 convolution with stride 1 and a padding of 1. The convolutional layers have 32, 32, 64, and 64 channels, and there are three fully connected layers with 512, 512, and 10 output units. The architecture is the same as conv-large in (Xiao et al., 2019), except that we binarize the network.

Training Method:

We train the networks using the Adam optimizer (Kingma and Ba, 2014). Due to fluctuations of test accuracy between epochs, we select from the last five epochs the model having the highest accuracy on the first training minibatches. The learning rate is decayed by a factor of two over the final epochs. We use projected gradient descent (PGD) to generate adversarial examples for robust training as in (Madry et al., 2018), where $\epsilon$ is increased linearly from zero to the desired value during the early epochs and the number of PGD iteration steps also grows linearly during the early epochs. All weights are initialized from a zero-mean Gaussian distribution, and the mask weights in (5) are made positive at initialization by taking their absolute value. We apply weight decay to the binarized weights in all experiments. The input quantization steps for MNIST and CIFAR10 are set slightly greater than twice the largest perturbation bound we consider for each dataset. The CBD loss is applied to large-conv networks only, with $\lambda$ chosen separately for MNIST and CIFAR10. We do not use any data augmentation for training. Due to limited computing resources and the significant differences between the settings we consider, the data in this paper are reported from one evaluation run per setting.
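
For concreteness, here is a sketch of the $\ell_\infty$ PGD adversary (Madry et al., 2018) used for robust training; the random start, the step-size heuristic, and the clipping to $[0, 1]$ are our assumptions rather than the paper's exact settings:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, steps):
    """Generate L-inf bounded adversarial examples for a batch (x, y)."""
    delta = torch.empty_like(x).uniform_(-eps, eps)  # random start
    alpha = 2.5 * eps / steps                        # step-size heuristic
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta.detach() + alpha * grad.sign()).clamp(-eps, eps)
        delta = (x + delta).clamp(0, 1) - x          # keep image in [0, 1]
    return (x + delta).detach()
```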

6.2 Evaluating Adversarial Robustness

We evaluate the performance of EEV on the MNIST and CIFAR10 benchmarks. We train MNIST-MLP on MNIST only for comparison against (Narodytska et al., 2020), the previously fastest exactly verified BNN on MNIST; Figure 1 presents the results. For the other two network architectures, we train an undefended network on natural images and two robust networks against the PGD adversary with different bounds on both MNIST and CIFAR10. We evaluate the robustness of each network against two adversaries, a 100-step PGD and our exact verifier, with three settings of the bound $\epsilon$. We also compare our results with the state-of-the-art exact verifier for real-valued networks (Xiao et al., 2019). Table 4 presents detailed results of verifier performance and test accuracy, showing that our verifier exhibits solving times 16.13 to 12815.62 times faster than (Xiao et al., 2019).

We highlight an interesting observation. An undefended CIFAR10 network has only 0.06% PGD accuracy at the largest bound we consider, and adversarial training improves this number to 26.78%, comparable with real-valued networks. However, when we evaluate the true adversarial robustness by applying our exact verifier, the undefended network totally fails (0.00% adversarially robust) while the seemingly robust network achieves only 10.79% adversarial accuracy. This suggests that first-order adversaries like PGD may be insufficient to explore the adversarial space of BNNs for robust training. We remark that the gap between PGD accuracy and verifiable accuracy is unlikely to be caused by obfuscated gradients (Athalye et al., 2018): the success rate of the PGD attack increases with the perturbation bound, and PGD training does improve PGD accuracy significantly, suggesting that gradient information is still useful for attacks. Moreover, we use the straight-through estimator to compute gradients of the activation binarization and input quantization functions, the same method used to train the networks, so gradients are unlikely to be shattered in this way.

6.3 Extensibility Case Study

Dataset Method Mean Solve Time (s) Solver Timeout Mean Build+Solve Time (s)
(three sub-columns per metric, one per verification bound $\epsilon$; rows are grouped by training setting)
MNIST conv-small 0.0023 0.2428 0.2464 0% 0.07% 0.07% 0.0179 0.2599 0.2795
MNIST conv-small 0.0009 0.0107 0.1594 0% 0% 0.05% 0.0169 0.0397 0.1756
Xiao et al. S 0.49 - - 0.05% - - 5.47 - -
MNIST conv-small 0.0003 0.0008 0.0023 0% 0% 0% 0.0311 0.0301 0.0170
Xiao et al. S - - 2.78 - - 1% - - 7.12
conv-large 0.0052 0.1786 2.3221 0% 0.04% 1% 0.1145 0.2910 2.4458
Xiao et al. L - - 37.45 - - 24% - - 203.84
CIFAR10 conv-small 0.0038 0.0054 0.0075 0% 0% 0% 0.0468 0.0555 0.0610
CIFAR10 conv-small 0.0019 0.0035 0.0033 0% 0% 0% 0.0433 0.0530 0.0540
Xiao et al. S 13.50 - - 2% - - 66.08 - -
CIFAR10 conv-small 0.0012 0.0018 0.0017 0% 0% 0% 0.0406 0.0484 0.0476
Xiao et al. S - - 22.33 - - 2% - - 60.67
conv-large 0.0899 0.0326 0.0566 0.04% 0% 0% 0.3397 0.2615 0.3521
Xiao et al. L - - 20.14 - - 5% - - 421.86
Dataset Method Test Accuracy PGD Adversarial Accuracy Verifiable Adversarial Accuracy
(three sub-columns per adversarial metric, one per verification bound $\epsilon$; rows are grouped by training setting)
MNIST conv-small 97.06% 92.98% 84.81% 67.57% 75.29% 26.17% 2.55%
MNIST conv-small 97.16% 95.35% 92.82% 86.57% 84.46% 48.71% 11.69%
Xiao et al. S 98.68% 95.13% - - 94.33% - -
MNIST conv-small 95.53% 94.61% 93.37% 90.97% 87.08% 68.28% 36.41%
Xiao et al. S 97.33% - - 92.05% - - 80.68%
conv-large 97.05% 96.22% 95.07% 92.40% 88.24% 62.38% 21.52%
Xiao et al. L 97.54% - - 93.25% - - 59.60%
CIFAR10 conv-small 51.67% 15.13% 0.83% 0.06% 0.36% 0.02% 0.00%
CIFAR10 conv-small 46.57% 39.47% 27.79% 19.00% 13.48% 1.63% 0.26%
Xiao et al. S 61.12% 49.92% - - 45.93% - -
CIFAR10 conv-small 33.18% 31.63% 29.03% 26.78% 24.75% 15.91% 10.79%
Xiao et al. S 40.45% - - 26.78% - - 20.27%
conv-large 42.35% 40.44% 37.70% 34.30% 3.86% 0.37% 0.15%
Xiao et al. L 42.81% - - 28.69% - - 19.80%
  • We present results of our system on BNNs and compare against real-valued networks of the same architectures in (Xiao et al., 2019). “Xiao et al. S” and “Xiao et al. L” correspond to the real-valued networks conv-small and conv-large respectively, with data taken from Xiao et al. We conduct a complete evaluation of the large architecture, while Xiao et al. evaluate only the first 1000 images due to long build time. Each network is verified under three perturbation bounds per dataset; the largest bound is 0.3 for MNIST.

Table 4: Verification Time and Test Accuracy on MNIST and CIFAR10 Benchmarks
Model Test Accuracy Mean Solve Time (s) Attack Success Rate
small-conv 95.53% 0.002 63.59%
large-conv 97.05% 2.322 78.48%
ensemble 94.26% 2.330 47.44%
Table 5: Model Ensemble with Reject Option on MNIST

We evaluate the extensibility of our system by considering an ensemble of models that rejects the input if the models do not fully agree on the classification. We are interested in how easily this ensemble can be attacked, which requires the adversary to cause all of the components to output the same wrong classification. The goal can be easily formulated in CNF: let $c$ be the number of classes, $t$ be the correct class, and $d_{i,j}$ denote whether the score of class $i$ is higher than that of class $t$ in model $j$, as defined in Section 4.2. Let $h_{i,j}$ denote whether class $i$ has the highest score in model $j$ and $g_i$ denote whether all models agree on class $i$. Then the attack goal is simply $\bigvee_{i \ne t} g_i$. Such an encoding would not be so straightforward in other formulations such as MILP. Table 5 presents the results for an ensemble of the small-conv and large-conv networks on MNIST, adversarially trained and tested with bound $\epsilon = 0.3$. It shows that our system can easily handle more complex queries; a sketch of the goal encoding follows.
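
In the sketch below (our conventions: fresh-variable allocation via new_var, clauses as lists of signed integers), the variables $h_{i,j}$ are assumed to be already encoded from the score comparisons; only the implication $g_i \Rightarrow h_{i,j}$ is needed, since the solver may freely set $g_i$ true whenever the models do agree:

```python
def ensemble_goal_clauses(highest, target, new_var):
    """Attack goal for an ensemble: some class i != target is predicted
    by every model.  highest[j][i] is the SAT variable h_{i,j} ("model j
    predicts class i"); new_var() allocates a fresh variable g_i."""
    clauses, agree_vars = [], []
    for i in range(len(highest[0])):        # iterate over classes
        if i == target:
            continue
        g = new_var()                       # g_i: all models agree on class i
        agree_vars.append(g)
        for model_vars in highest:          # g_i -> h_{i,j} for every model j
            clauses.append([-g, model_vars[i]])
    clauses.append(agree_vars)              # goal: OR of g_i over i != target
    return clauses

counter = iter(range(100, 10**6))
print(ensemble_goal_clauses([[1, 2, 3], [4, 5, 6]], target=0,
                            new_var=lambda: next(counter)))
```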

6.4 Validating Training Methods

We conduct comprehensive experiments to validate that our proposed training methods significantly reduce verification complexity without sacrificing much test accuracy.

Training $\epsilon$ Network Architecture Training Method Solver Test Accuracy Mean Solve Time (s) Median Solve Time (s) Timeout Verifiable Accuracy Overall Sparsity
0 conv-small Ternary MiniSatCS 97.28% 212.084 2.587 3% 0% 79%
BinMask MiniSatCS 97.06% 0.001 0.001 0% 75% 86%
MiniSat 2.2 97.06% 0.944 0.536 0% 75% 86%
z3 97.06% 0.079 0.078 0% 75% 86%
conv-large Ternary MiniSatCS 98.89% 2732.030 3600.004 72% 0% 82%
Ternary+CBD MiniSatCS 97.50% 1190.475 166.054 28% 0% 60%
BinMask MiniSatCS 99.01% 1259.135 169.778 28% 52% 81%
BinMask +CBD MiniSatCS 97.84% 7.195 0.221 0% 38% 87%
MiniSat 2.2 97.84% 979.900 125.950 18% 32% 87%
z3 97.84% 1155.247 39.132 25% 22% 87%
0.3 conv-small Ternary MiniSatCS 96.18% 704.245 0.376 18% 12% 73%
BinMask MiniSatCS 95.53% 0.002 0.002 0% 32% 91%
MiniSat 2.2 95.53% 0.519 0.377 0% 32% 91%
z3 95.53% 0.062 0.061 0% 32% 91%
conv-large Ternary MiniSatCS 98.35% 2811.341 3600.003 78% 0% 87%
BinMask MiniSatCS 99.01% 2200.503 3600.002 60% 5% 72%
BinMask +CBD MiniSatCS 97.05% 0.318 0.021 0% 25% 88%
MiniSat 2.2 97.05% 159.477 6.248 3% 25% 88%
z3 97.05% 219.558 1.482 0% 25% 88%
  • All the methods are evaluated on a fixed subset containing 40 randomly sampled examples from the MNIST test set. Time limit is 3600 seconds.

Table 6: Comparison of Methods on MNIST Subset
Training $\epsilon$ Network Architecture Training Method Solver Test Accuracy Mean Solve Time (s) Median Solve Time (s) Timeout Verifiable Accuracy Overall Sparsity
0 conv-small Ternary MiniSatCS 53.72% 64.366 0.009 0% 0% 83%
MiniSat 2.2 53.72% 352.748 93.883 3% 0% 83%
z3 53.72% 1612.839 870.972 25% 0% 83%
BinMask MiniSatCS 51.67% 0.003 0.003 0% 2% 81%
MiniSat 2.2 51.67% 6.262 5.601 0% 2% 81%
z3 51.67% 0.186 0.174 0% 2% 81%
conv-large Ternary MiniSatCS 66.15% 774.398 2.506 18% 0% 89%
BinMask MiniSatCS 65.85% 235.996 0.171 3% 0% 91%
BinMask +CBD MiniSatCS 65.15% 19.585 0.095 0% 0% 93%
MiniSat 2.2 65.15% 865.934 378.570 5% 0% 93%
z3 65.15% 3600.195 3600.190 100% 0% 93%
conv-small Ternary MiniSatCS 35.88% 0.305 0.005 0% 0% 89%
BinMask MiniSatCS 33.18% 0.002 0.002 0% 8% 88%
MiniSat 2.2 33.18% 0.718 0.710 0% 8% 88%
z3 33.18% 0.071 0.070 0% 8% 88%
conv-large Ternary MiniSatCS 39.27% 112.048 0.359 3% 0% 82%
BinMask MiniSatCS 53.51% 3.343 0.111 0% 0% 68%
BinMask +CBD MiniSatCS 42.35% 0.048 0.044 0% 2% 81%
MiniSat 2.2 42.35% 13.175 9.460 0% 2% 81%
z3 42.35% 152.297 26.479 0% 2% 81%
  • All the methods are evaluated on a fixed subset containing 40 randomly sampled examples from the CIFAR10 test set. Time limit is 3600 seconds.

Table 7: Comparison of Methods on CIFAR10 Subset
Dataset Training $\epsilon$ Network Architecture Training Method Test Accuracy Mean Solve Time (s) Median Solve Time (s) Timeout Verifiable Accuracy Overall Sparsity
MNIST 0 conv-small BinMask 97.06% 0.002 0.001 0.00% 75.29% 86%
conv-large BinMask+CBD 97.84% 9.661 0.216 5.09% 39.27% 87%
MNIST 0.3 conv-small BinMask 95.53% 0.002 0.002 0.00% 36.41% 91%
conv-large BinMask+CBD 97.05% 2.322 0.028 1.11% 21.52% 88%
CIFAR10 0 conv-small BinMask 51.67% 0.004 0.003 0.00% 0.36% 81%
conv-large BinMask+CBD 65.15% 14.336 1.088 6.64% 0.00% 93%
CIFAR10 conv-small BinMask 33.18% 0.002 0.002 0.00% 10.79% 88%
conv-large BinMask+CBD 42.35% 0.057 0.043 0.00% 0.15% 81%
  • All the methods are evaluated on the complete test sets. Time limit is 120 seconds.

Table 8: MiniSatCS Results on Full Dataset

For each dataset, we train the conv-small and conv-large networks under two training settings: undefended (i.e., $\epsilon = 0$) and PGD-based adversarial training with a large perturbation bound (0.3 for MNIST and $8/255$ for CIFAR10). The undefended network can be regarded as a reference test accuracy for the architecture under a given sparsity. The results show that on all the datasets and adversarial training perturbation bounds we consider, our proposed solver MiniSatCS is consistently faster than MiniSat 2.2 and Z3, reaching speedups of 5.48 to 500.76 times over the faster of the other two. Our proposed training methods, BinMask and Cardinality Bound Decay (CBD), work together to significantly reduce verification time at the cost of a small degradation of test accuracy. Specifically, compared to ternary weights, BinMask with CBD delivers verification speedups of 39.54 to 406490.81 times, with accuracy degradation less than 2% for MNIST and less than 3% for CIFAR10. A positive side effect of BinMask is that it improves robustness, possibly due to the improved sparsity of convolutional layers. Note that CBD works better with BinMask: combining ternary weights with CBD results in worse accuracy and longer verification time for the conv-large architecture on MNIST.

Because of the relatively long verification times for some of the cases with other solvers or ternary weights, we evaluate all the methods on a fixed subset containing 40 randomly sampled examples from the complete test set and summarize the results in Table 6 and Table 7 for MNIST and CIFAR10 respectively. We evaluate MiniSat 2.2 and Z3 on the easiest-to-verify models, which are trained with BinMask or BinMask with CBD, but we also run the two solvers on a small ternary weight model for CIFAR10 to show that BinMask also benefits other solvers and that MiniSatCS is also more efficient in the ternary weight case. In fact, CBD also helps the other solvers: when we try to verify the seemingly easiest-to-verify conv-large network trained with only BinMask (i.e., the one adversarially trained on CIFAR10), MiniSat 2.2 fails due to an out-of-memory error, and Z3 exceeds the one-hour time limit. We also present the corresponding results for MiniSatCS evaluated on the complete MNIST and CIFAR10 test sets in Table 8. For undefended networks (i.e., trained with $\epsilon = 0$), we verify robustness with the smallest perturbation bound we consider for each dataset. For adversarially trained networks we verify robustness with the same perturbation bound used for training (i.e., 0.3 for MNIST and $8/255$ for CIFAR10).

7 Conclusion

In this work we demonstrate that it is possible to significantly scale up the exact verification of binarized neural networks (BNNs) by equipping an off-the-shelf SAT solver with domain-specific propagation rules and simultaneously training solver-friendly BNNs. Although we focus on verifying adversarial robustness, our method could be generalized to verify other properties of BNNs. Our experimental results demonstrate the significant performance increases that our techniques deliver.

References

  • I. Abío, R. Nieuwenhuis, A. Oliveras, and E. Rodríguez-Carbonell (2011) BDDs for pseudo-boolean constraints–revisited. In International Conference on Theory and Applications of Satisfiability Testing, pp. 61–75. Cited by: §4.1.
  • A. Athalye, N. Carlini, and D. Wagner (2018) Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. In Proceedings of the 35th International Conference on Machine Learning, J. Dy and A. Krause (Eds.), Proceedings of Machine Learning Research, Vol. 80, Stockholmsmässan, Stockholm Sweden, pp. 274–283. Cited by: §2, §6.2.
  • T. Baluta, S. Shen, S. Shinde, K. S. Meel, and P. Saxena (2019) Quantitative verification of neural networks and its security applications. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 1249–1264. Cited by: §2.
  • T. Balyo, M. J. Heule, and M. Jarvisalo (2017) SAT competition 2016: recent developments. In Thirty-First AAAI Conference on Artificial Intelligence, Cited by: §3.1.
  • Y. Bengio, N. Léonard, and A. Courville (2013) Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432. Cited by: §3.2.
  • A. Biere, M. Heule, and H. van Maaren (2009) Handbook of satisfiability. Vol. 185, IOS press. Cited by: §3.1.
  • J. Buckman, A. Roy, C. Raffel, and I. Goodfellow (2018) Thermometer encoding: one hot way to resist adversarial examples. In International Conference on Learning Representations, Cited by: §4.2.
  • N. Carlini and D. Wagner (2017) Towards evaluating the robustness of neural networks. In 2017 ieee symposium on security and privacy (sp), pp. 39–57. Cited by: §2.
  • C. Cheng, G. Nührenberg, C. Huang, and H. Ruess (2018) Verification of binarized neural networks via inter-neuron factoring. In Working Conference on Verified Software: Theories, Tools, and Experiments, pp. 279–290. Cited by: §2.
  • C. Cheng, G. Nührenberg, and H. Ruess (2017) Maximum resilience of artificial neural networks. In International Symposium on Automated Technology for Verification and Analysis, pp. 251–268. Cited by: §2.
  • S. A. Cook (1971) The complexity of theorem-proving procedures. In Proceedings of the third annual ACM symposium on Theory of computing, pp. 151–158. Cited by: §3.1.
  • S. Darabi, M. Belbahri, M. Courbariaux, and V. P. Nia (2018) BNN+: improved binary network training. arXiv preprint arXiv:1812.11800. Cited by: §2.
  • L. De Moura and N. Bjørner (2008) Z3: an efficient smt solver. In International conference on Tools and Algorithms for the Construction and Analysis of Systems, pp. 337–340. Cited by: §4.4.
  • S. Dutta, S. Jha, S. Sankaranarayanan, and A. Tiwari (2018) Output range analysis for deep feedforward neural networks. In NASA Formal Methods Symposium, pp. 121–138. Cited by: §2.
  • K. Dvijotham, S. Gowal, R. Stanforth, R. Arandjelovic, B. O’Donoghue, J. Uesato, and P. Kohli (2018) Training verified learners with learned verifiers. arXiv preprint arXiv:1805.10265. Cited by: §2.
  • N. Eén and N. Sörensson (2003) An extensible SAT-solver. In International conference on theory and applications of satisfiability testing, pp. 502–518. Cited by: §4.4.
  • R. Ehlers (2017) Formal verification of piece-wise linear feed-forward neural networks. In International Symposium on Automated Technology for Verification and Analysis, pp. 269–286. Cited by: §2.
  • M. Fischetti and J. Jo (2018) Deep neural networks and mixed integer linear optimization. Constraints 23 (3), pp. 296–309. Cited by: §2.
  • J. Frankle and M. Carbin (2019) The lottery ticket hypothesis: finding sparse, trainable neural networks. In International Conference on Learning Representations, Cited by: item 2.
  • A. Galloway, G. W. Taylor, and M. Moussa (2018) Attacking binarized neural networks. In International Conference on Learning Representations, External Links: Link Cited by: §2.
  • V. Ganesh, C. W. O’donnell, M. Soos, S. Devadas, M. C. Rinard, and A. Solar-Lezama (2012) Lynx: a programmatic SAT solver for the RNA-folding problem. In International Conference on Theory and Applications of Satisfiability Testing, pp. 143–156. Cited by: §4.3.
  • T. Gehr, M. Mirman, D. Drachsler-Cohen, P. Tsankov, S. Chaudhuri, and M. Vechev (2018) Ai2: safety and robustness certification of neural networks with abstract interpretation. In 2018 IEEE Symposium on Security and Privacy (SP), pp. 3–18. Cited by: §2.
  • I. Goodfellow, Y. Bengio, and A. Courville (2016) Deep learning. MIT Press. Note: http://www.deeplearningbook.org Cited by: §1.
  • S. Han, J. Pool, J. Tran, and W. Dally (2015) Learning both weights and connections for efficient neural network. In Advances in neural information processing systems, pp. 1135–1143. Cited by: item 2.
  • S. Hölldobler, N. Manthey, and P. Steinke (2012) A compact encoding of pseudo-boolean constraints into SAT. In Annual Conference on Artificial Intelligence, pp. 107–118. Cited by: §4.1.
  • X. Huang, M. Kwiatkowska, S. Wang, and M. Wu (2017) Safety verification of deep neural networks. In International Conference on Computer Aided Verification, pp. 3–29. Cited by: §2.
  • I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio (2016) Binarized neural networks. In Advances in Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.), pp. 4107–4115. Cited by: §2, §3.2.
  • S. Ioffe and C. Szegedy (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167. Cited by: §4.1.
  • B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko (2018) Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2704–2713. Cited by: §3.2.
  • K. Jia and M. Rinard (2020) Exploiting verified neural networks via floating point numerical error. arXiv preprint arXiv:2003.03021. Cited by: §2.
  • H. Kannan, A. Kurakin, and I. Goodfellow (2018) Adversarial logit pairing. arXiv preprint arXiv:1803.06373. Cited by: §2.
  • G. Katz, C. Barrett, D. L. Dill, K. Julian, and M. J. Kochenderfer (2017) Reluplex: an efficient smt solver for verifying deep neural networks. In International Conference on Computer Aided Verification, pp. 97–117. Cited by: §2.
  • E. B. Khalil, A. Gupta, and B. Dilkina (2019) Combinatorial attacks on binarized neural networks. In International Conference on Learning Representations, Cited by: §2, §2.
  • D. P. Kingma and J. Ba (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §6.1.
  • A. Krizhevsky, G. Hinton, et al. (2009) Learning multiple layers of features from tiny images. Cited by: §6.1.
  • Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner (1998) Gradient-based learning applied to document recognition. Proceedings of the IEEE 86 (11), pp. 2278–2324. Cited by: §6.1.
  • M. H. Liffiton and J. C. Maglalang (2012) A cardinality solver: more expressive constraints for free. In International Conference on Theory and Applications of Satisfiability Testing, pp. 485–486. Cited by: §4.3.
  • A. Lomuscio and L. Maganti (2017) An approach to reachability analysis for feed-forward relu neural networks. arXiv preprint arXiv:1706.07351. Cited by: §2.
  • A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu (2018) Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, Cited by: §2, §6.1.
  • J. Marques-Silva, I. Lynce, and S. Malik (2009) Conflict-driven clause learning SAT solvers. In Handbook of satisfiability, pp. 131–153. Cited by: §4.3.
  • M. Mirman, T. Gehr, and M. Vechev (2018) Differentiable abstract interpretation for provably robust neural networks. In Proceedings of the 35th International Conference on Machine Learning, J. Dy and A. Krause (Eds.), Proceedings of Machine Learning Research, Vol. 80, Stockholmsmässan, Stockholm Sweden, pp. 3578–3586. Cited by: §2.
  • M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik (2001) Chaff: engineering an efficient SAT solver. In Proceedings of the 38th annual Design Automation Conference, pp. 530–535. Cited by: item 1, §4.4.
  • D. J. Moss, E. Nurvitadhi, J. Sim, A. Mishra, D. Marr, S. Subhaschandra, and P. H. Leong (2017) High performance binary neural networks on the Xeon+FPGA™ platform. In 2017 27th International Conference on Field Programmable Logic and Applications (FPL), pp. 1–4. Cited by: §2.
  • N. Narodytska, S. Kasiviswanathan, L. Ryzhyk, M. Sagiv, and T. Walsh (2018) Verifying properties of binarized deep neural networks. In Thirty-Second AAAI Conference on Artificial Intelligence, Cited by: §2.
  • N. Narodytska, H. Zhang, A. Gupta, and T. Walsh (2020) In search for a SAT-friendly binarized neural network architecture. In International Conference on Learning Representations, Cited by: item 2, §2, Figure 1, §4.4, §5.1, item 1, §6.2.
  • A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala (2019) PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Eds.), pp. 8024–8035. Cited by: §6.1.
  • K. Pipatsrisawat and A. Darwiche (2007) A lightweight component caching scheme for satisfiability solvers. In International conference on theory and applications of satisfiability testing, pp. 294–299. Cited by: §4.4.
  • A. Raghunathan, J. Steinhardt, and P. S. Liang (2018) Semidefinite relaxations for certifying robustness to adversarial examples. In Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.), pp. 10877–10887. Cited by: §2.
  • M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi (2016) Xnor-net: imagenet classification using binary convolutional neural networks. In European conference on computer vision, pp. 525–542. Cited by: §2, §3.2.
  • K. Scheibler, L. Winterer, R. Wimmer, and B. Becker (2015) Towards verification of artificial neural networks. In MBMV, pp. 30–40. Cited by: §2.
  • A. Shih, A. Darwiche, and A. Choi (2019) Verifying binarized neural networks by angluin-style learning. In International Conference on Theory and Applications of Satisfiability Testing, pp. 354–370. Cited by: §2.
  • G. Singh, T. Gehr, M. Püschel, and M. Vechev (2019) An abstract domain for certifying neural networks. Proc. ACM Program. Lang. 3 (POPL). External Links: Document Cited by: §2.
  • C. Sinz (2005) Towards an optimal cnf encoding of boolean cardinality constraints. In International conference on principles and practice of constraint programming, pp. 827–831. Cited by: §4.1, item 1.
  • M. Soos, K. Nohl, and C. Castelluccia (2009) Extending SAT solvers to cryptographic problems. In Theory and Applications of Satisfiability Testing - SAT 2009, 12th International Conference, SAT 2009, Swansea, UK, June 30 - July 3, 2009. Proceedings, pp. 244–257. External Links: Document Cited by: §4.3.
  • C. Szegedy, W. Zaremba, I. Sutskever, J. B. Estrach, D. Erhan, I. Goodfellow, and R. Fergus (2014) Intriguing properties of neural networks. In 2nd International Conference on Learning Representations, ICLR 2014, Cited by: §1.
  • V. Tjeng, K. Y. Xiao, and R. Tedrake (2019) Evaluating robustness of neural networks with mixed integer programming. In International Conference on Learning Representations, Cited by: §2, §2, §5.1.
  • F. Tramer, N. Carlini, W. Brendel, and A. Madry (2020) On adaptive attacks to adversarial example defenses. arXiv preprint arXiv:2002.08347. Cited by: §2.
  • L. Weng, H. Zhang, H. Chen, Z. Song, C. Hsieh, L. Daniel, D. Boning, and I. Dhillon (2018) Towards fast computation of certified robustness for ReLU networks. In International Conference on Machine Learning, pp. 5276–5285. Cited by: §2.
  • E. Wong and J. Z. Kolter (2017) Provable defenses against adversarial examples via the convex outer adversarial polytope. arXiv preprint arXiv:1711.00851. Cited by: §2.
  • K. Y. Xiao, V. Tjeng, N. M. (. Shafiullah, and A. Madry (2019) Training for faster adversarial robustness verification via inducing reLU stability. In International Conference on Learning Representations, Cited by: 1st item, Table 1, §2, §5.1, item 2, item 3, 1st item, §6.1, §6.2, Table 4.
  • Y. Yang and M. Rinard (2019) Correctness verification of neural networks. arXiv preprint arXiv:1906.01030. Cited by: §2.
  • H. Zhang, T. Weng, P. Chen, C. Hsieh, and L. Daniel (2018) Efficient neural network robustness certification with general activation functions. In Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.), pp. 4939–4948. Cited by: §2.
  • S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen, and Y. Zou (2016) Dorefa-net: training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160. Cited by: §3.2.