Towards Stable and Efficient Training of Verifiably Robust Neural Networks

06/14/2019 · by Huan Zhang, et al. · University of Michigan, MIT, University of Illinois at Urbana-Champaign

Training neural networks with verifiable robustness guarantees is challenging. Several existing successful approaches utilize relatively tight linear relaxation based bounds of neural network outputs, but they can slow down training by a factor of hundreds and over-regularize the network. Meanwhile, interval bound propagation (IBP) based training is efficient and significantly outperforms linear relaxation based methods on some tasks, yet it suffers from stability issues since the bounds are much looser. In this paper, we first interpret IBP training as training an augmented network which computes non-linear bounds, thus explaining its good performance. We then propose a new certified adversarial training method, CROWN-IBP, by combining the fast IBP bounds in the forward pass and a tight linear relaxation based bound, CROWN, in the backward pass. The proposed method is computationally efficient and consistently outperforms IBP baselines on training verifiably robust neural networks. We conduct large scale experiments using 53 models on MNIST, Fashion-MNIST and CIFAR datasets. On MNIST with ϵ=0.3 and ϵ=0.4 (ℓ∞ norm distortion) we achieve 7.46% and 12.96% verified error on the test set, respectively, outperforming previous certified defense methods.


1 Introduction

The success of deep neural networks has motivated their deployment in safety-critical environments, such as autonomous driving and facial recognition systems. Applications in these areas make understanding the robustness and security of deep neural networks urgently needed, especially their resilience under malicious, carefully crafted inputs. Unfortunately, the performance of deep learning models is often so brittle that even imperceptibly modified inputs, also known as adversarial examples, are able to completely break the model (Goodfellow et al., 2015; Szegedy et al., 2013). The robustness of deep learning models under adversarial examples is well-studied from both attack (crafting powerful adversarial examples) and defense (making the model more robust) perspectives (Athalye et al., 2018; Carlini & Wagner, 2017a, b; Goodfellow et al., 2015; Madry et al., 2018; Papernot et al., 2016; Xiao et al., 2019a, 2018b, 2018c; Eykholt et al., 2018). Recently, it has been shown that defending deep learning models against adversarial examples is a very difficult task, especially under strong and adaptive attacks. Early defenses such as distillation (Papernot et al., 2016) have been broken by stronger attacks like C&W (Carlini & Wagner, 2017b). Many defense methods have been proposed recently (Guo et al., 2018; Song et al., 2017; Buckman et al., 2018; Ma et al., 2018; Samangouei et al., 2018; Xiao et al., 2018a), but their robustness improvement cannot be certified: no provable guarantees can be given to verify their robustness. In fact, most of these uncertified defenses are actually vulnerable under stronger attacks (Athalye et al., 2018; He et al., 2017).

There are, however, some methods in the literature seeking to give provable guarantees on robustness, such as distributional robust optimization (Sinha et al., 2018), linear relaxations (Wong & Kolter, 2018; Mirman et al., 2018; Wang et al., 2018a; Dvijotham et al., 2018b; Weng et al., 2018; Zhang et al., 2018), interval bound propagation (Gowal et al., 2018), ReLU stability regularization (Xiao et al., 2019b), and semidefinite relaxations (Raghunathan et al., 2018a). Linear relaxation of neural networks, first proposed by Wong & Kolter (2018), is one of the most popular categories among these certified defenses. These methods use the dual of linear programming, or several similar approaches, to provide a linear relaxation of the network (referred to as a "convex adversarial polytope"), and the resulting bounds are tractable for robust optimization. However, they are both computationally and memory intensive, and can increase model training time by a factor of hundreds. On the other hand, interval bound propagation (IBP) is a simple and efficient method that can also be used for training verifiable neural networks (Gowal et al., 2018), and has achieved state-of-the-art verified error on many datasets despite its simplicity. However, since the IBP bounds are very loose during the initial phase of training, the training procedure can be unstable and sensitive to hyperparameters.

In this paper, we first investigate the limitations of existing linear relaxation based certified robust training methods and find that they over-penalize the induced norms of weight matrices, due to their intrinsically linear nature. On the other hand, we interpret IBP as an augmented neural network, which learns to optimize a non-linear bound; this blessing of non-linearity gives IBP trained networks more expressive power. This explains the weakness of linear relaxation based methods and the success of IBP training on some tasks.

To improve the stability of IBP training, we propose a new certified robust training method, CROWN-IBP, which marries the expressive power and efficiency of IBP with a tight linear relaxation based verification bound, CROWN. CROWN-IBP bound propagation involves a forward pass and a backward pass, with computational cost significantly cheaper than purely linear relaxation based methods. In our experiments, we show that CROWN-IBP significantly improves training stability and further reduces verified errors by a noticeable margin compared to the existing IBP based approach (Gowal et al., 2018). On MNIST, we reach 7.46% and 12.96% verified error under ℓ∞ distortions with ε = 0.3 and ε = 0.4, respectively, outperforming carefully tuned models in (Gowal et al., 2018) and significantly outperforming linear relaxation based methods (at ε = 0.3, Wong et al. (2018) report over 40% verified error).

2 Related Work and Background

2.1 Robustness Verification and Relaxations of Neural Networks

Neural network robustness verification algorithms seek upper and lower bounds of an output neuron for all possible inputs within a set S, typically a norm-bounded perturbation around a nominal input. Most importantly, the margins between the output of the ground-truth class and any other class determine model robustness. However, it has been shown that finding the exact output range is a non-convex problem and NP-complete (Katz et al., 2017; Weng et al., 2018). Therefore, recent works resort to giving relatively tight but computationally tractable bounds of the output range with necessary relaxations of the original problem. Many of these robustness verification approaches are based on linear relaxations of non-linear units in neural networks, including CROWN (Zhang et al., 2018), DeepPoly (Singh et al., 2019), Fast-Lin (Weng et al., 2018), DeepZ (Singh et al., 2018) and Neurify (Wang et al., 2018b). We refer the readers to (Salman et al., 2019) for a comprehensive survey on this topic. After linear relaxation, these methods essentially bound the output of a neural network f_j(x) by linear upper/lower hyperplanes:

A Δx + b_L ≤ f_j(x_0 + Δx) ≤ A Δx + b_U   (1)

where A is the product of the network weight matrices and the diagonal matrices reflecting the ReLU relaxations; b_L and b_U are two bias terms unrelated to Δx. Additionally, Dvijotham et al. (2018c, a); Qin et al. (2019) solve the Lagrangian dual of the verification problem; Raghunathan et al. (2018a, b) propose semidefinite relaxations which are tighter compared to linear relaxation based methods, but computationally expensive. Bounds on a neural network's local Lipschitz constant can also be used for verification (Hein & Andriushchenko, 2017; Zhang et al., 2019). Besides these deterministic verification approaches, random smoothing can be used to increase the robustness of any model in a probabilistic manner (Cohen et al., 2019; Lecuyer et al., 2018; Li et al., 2018; Liu et al., 2018).
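To make (1) concrete: once A and the two bias terms are available, element-wise output bounds over an ℓ∞ ball follow from Hölder's inequality. Below is a minimal sketch; the function and variable names are ours, not from any of the cited tools.

```python
import torch

def concretize(A, b_l, b_u, eps):
    """Given A @ dx + b_l <= f(x0 + dx) <= A @ dx + b_u for ||dx||_inf <= eps
    (the form of Eq. (1), with x0-dependent terms absorbed into the biases),
    return element-wise lower/upper output bounds. By Holder's inequality,
    the extremes of A @ dx are -/+ eps times the row-wise l1 norm of A."""
    rad = eps * A.abs().sum(dim=1)
    return b_l - rad, b_u + rad
```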

2.2 Robust Optimization and Verifiable Adversarial Defense

To improve the robustness of neural networks against adversarial perturbations, a natural idea is to generate adversarial examples by attacking the network itself and then use them to augment the training set (Kurakin et al., 2017). More recently, Madry et al. (2018) showed that adversarial training can be formulated as solving a min-max robust optimization problem, as in (2). Given a model f(x; θ) with parameters θ, a loss function L, and the training data distribution X, the training algorithm aims to minimize the robust loss, defined as the maximum loss within a neighborhood S(x, ε) of each data point x, leading to the following robust optimization problem:

min_θ E_{(x,y) ∼ X} [ max_{x′ ∈ S(x, ε)} L(f(x′); y; θ) ]   (2)

To solve this minimax problem, Madry et al. (2018) proposed to use the projected gradient descent (PGD) attack to approximately solve the inner max, and then use the loss on the perturbed example to update the model. Networks trained by this procedure achieve state-of-the-art test accuracy under strong attacks (Athalye et al., 2018; Wang et al., 2018a; Zheng et al., 2018).
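For concreteness, a minimal PyTorch sketch of the inner maximization follows; the function name and hyperparameter defaults are illustrative assumptions, and clipping to the valid input range is omitted for brevity.

```python
import torch

def pgd_inner_max(model, x, y, eps, alpha=0.01, steps=40):
    """Approximately solve max_{||delta||_inf <= eps} L(model(x + delta), y)
    by projected gradient ascent (Madry et al., 2018): take signed gradient
    steps on delta and project back onto the eps-ball after each step."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = torch.nn.functional.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).detach()
```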

Despite being robust under strong attacks, models obtained by PGD-based adversarial training do not have verified error guarantees. Due to the nonconvexity of neural networks, a PGD attack can only compute a lower bound of the robust loss (the inner maximization problem). Minimizing a lower bound of the inner max cannot guarantee that (2) is minimized. In other words, even if a PGD attack cannot find a successful perturbation, that does not mean no such perturbation exists. This becomes problematic in safety-critical applications, since those models need to be provably safe.

Verifiable adversarial training methods, on the other hand, aim to obtain a network with good robustness that can be verified. This can be done by combining adversarial training and robustness verification: instead of using PGD to find a lower bound of the inner max, certified adversarial training uses a verification method to find an upper bound of the inner max, and then updates the parameters based on this upper bound of the robust loss. Minimizing an upper bound of the inner max is guaranteed to minimize the robust loss. Two certified robust training methods are closely related to our work; we describe them in detail below.

2.2.1 Linear Relaxation Based Verifiable Adversarial Training

One of the most popular verifiable adversarial training methods was proposed by Wong & Kolter (2018), using linear relaxations of neural networks to give an upper bound of the inner max. Other similar approaches include Mirman et al. (2018); Wang et al. (2018a); Dvijotham et al. (2018b). Since the bound propagation process of a convex adversarial polytope is expensive, several methods were proposed to improve its efficiency, like Cauchy projection (Wong et al., 2018) and dynamic mixed training (Wang et al., 2018a). However, even with these speed-ups, the training process is still slow. Also, this method may significantly reduce the model's standard accuracy (accuracy on the natural, unmodified test set). As will be discussed in our experiments in Section 4, this method tends to over-regularize the network during training. Intuitively, the tightness of the linear relaxation depends on the norms of the rows of each weight matrix; since the network is trained to make this bound tight, an implicit regularization is added to the induced norms of the weight matrices.

2.2.2 Interval Bound Propagation (IBP)

Interval Bound Propagation (IBP) uses a very simple rule to compute the pre-activation outer bounds for each layer of the neural network. Unlike linear relaxation based methods, IBP does not relax ReLU neurons, does not consider the correlations between the weights of different layers, and treats each layer individually. Gowal et al. (2018) presented a verifiably robust training method that uses IBP to compute output bounds; their motivation was to speed up the training of verifiably robust models.

However, IBP can be unstable in practice, since the bounds can be very loose, especially during the initial phase of training, posing a challenging problem to the optimizer. To help with instability, Gowal et al. (2018) use a mixture of regular and robust cross-entropy loss as the model's training loss, controlled by a parameter κ; it can be tricky to balance the two losses. Due to the difficulty of parameter tuning, the results in (Gowal et al., 2018) are believed to be hard to reproduce (see https://github.com/deepmind/interval-bound-propagation/issues/1). Achieving the best CIFAR results reported in (Gowal et al., 2018) requires training for 3200 epochs with a batch size of 1600 on 32 TPUs.

3 Methodology

3.1 Background

We first introduce the notation used throughout the paper and give background on verification and robust optimization.

Notation.

We define an L-layer neural network recursively as:

f(x) = z^(L),  z^(l) = W^(l) h^(l−1) + b^(l),  h^(l) = σ(z^(l)),  h^(0) = x,  for l ∈ {1, …, L−1},

where W^(l) ∈ R^{n_l × n_{l−1}}, n_0 represents the input dimension, n_L is the number of classes, and σ is an element-wise activation function. We use z^(l) to represent pre-activation neuron values and h^(l) to represent post-activation neuron values. Consider an input example x_0 with ground-truth label y; we consider a set S(x_0, ε) = {x : ‖x − x_0‖∞ ≤ ε} and we desire a robust network to have the property argmax_j f_j(x) = y for all x ∈ S. We define element-wise upper and lower bounds for z^(l) and h^(l) as z̲^(l) ≤ z^(l) ≤ z̄^(l) and h̲^(l) ≤ h^(l) ≤ h̄^(l).

Verification Specifications.

Neural network verification literature typically defines a specification vector c ∈ R^{n_L} that gives a linear combination of the neural network output: c⊤f(x). In robustness verification, typically we set c_y = 1 where y is the ground-truth class label, c_i = −1 where i is the attack target label, and all other elements of c are 0. This represents the margin between class y and class i: f_y(x) − f_i(x). For an n_L-class classifier and a given label y, we define a specification matrix C ∈ R^{n_L × n_L} as:

C_{i,j} = 1 if j = y and i ≠ y;  C_{i,j} = −1 if i = j and i ≠ y;  C_{i,j} = 0 otherwise.   (3)

Importantly, element i of the vector m = C f(x) gives us the margin between class y and class i. We define the lower bound of C f(x) for all x ∈ S(x_0, ε) as m̲(x_0, ε), which is a very important quantity. Wong & Kolter (2018) showed that for the cross-entropy loss we have:

max_{x ∈ S(x_0, ε)} L(f(x); y) ≤ L(−m̲(x_0, ε); y).   (4)

Eq. (4) gives us the opportunity to solve the robust optimization problem (2) by minimizing this tractable upper bound of the inner max. Minimizing the upper bound guarantees that the robust loss is also minimized.
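The specification matrix (3) and the loss bound (4) translate directly into code; below is a minimal sketch using our own helper names (not from the paper's released implementation).

```python
import torch

def specification_matrix(num_classes, y):
    """Eq. (3): row i of C (for i != y) encodes the margin f_y(x) - f_i(x);
    row y is all zeros."""
    C = -torch.eye(num_classes)
    C[:, y] += 1.0   # column y gets +1 in every row
    C[y, :] = 0.0    # the ground-truth row is zeroed out
    return C

def robust_ce_upper_bound(m_lb, labels):
    """Eq. (4): cross-entropy of the negated margin lower bound upper-bounds
    the worst-case cross-entropy loss. m_lb has shape (batch, num_classes)."""
    return torch.nn.functional.cross_entropy(-m_lb, labels)
```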

Figure 1: Interval Bound Propagation viewed as training an augmented neural network (IBP-NN). The inputs of IBP-NN are two images, x_0 + ε and x_0 − ε. The output of IBP-NN is a vector of lower bounds of margins (denoted m̲) between the ground-truth class and all classes (including the ground-truth class itself) for all x ∈ S(x_0, ε). This vector m̲ is negated and sent into a regular softmax function to get the model prediction. The top-1 prediction of the softmax is correct if and only if all margins between the ground-truth class and the other classes (except the ground-truth class itself) are positive, i.e., the model is verifiably robust. Thus, an IBP-NN with low standard error guarantees low verified error on the original network.

3.2 Analysis of IBP and Linear Relaxation based Verifiable Training Methods

Interval Bound Propagation (IBP)

Interval Bound Propagation (IBP) uses a very simple bound propagation rule. For the input layer we set h̄^(0) = x_0 + ε and h̲^(0) = x_0 − ε (element-wise). For an affine layer we have:

z̄^(k) = W^(k) (h̄^(k−1) + h̲^(k−1))/2 + |W^(k)| (h̄^(k−1) − h̲^(k−1))/2 + b^(k)   (5)

z̲^(k) = W^(k) (h̄^(k−1) + h̲^(k−1))/2 − |W^(k)| (h̄^(k−1) − h̲^(k−1))/2 + b^(k)   (6)

where |W^(k)| takes element-wise absolute value. And for element-wise monotonically increasing activation functions σ,

h̄^(k) = σ(z̄^(k)),  h̲^(k) = σ(z̲^(k)).   (7)
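Rules (5)-(7) amount to propagating interval centers and radii through the network; a minimal single-example PyTorch sketch (helper names are our own) is given below.

```python
import torch

def ibp_affine(h_l, h_u, W, b):
    """Eqs. (5)-(6): propagate the interval [h_l, h_u] through z = W h + b.
    mid/rad are the interval center and radius; |W| is element-wise abs."""
    mid, rad = (h_u + h_l) / 2, (h_u - h_l) / 2
    z_mid, z_rad = W @ mid + b, W.abs() @ rad
    return z_mid - z_rad, z_mid + z_rad

def ibp_act(z_l, z_u, sigma=torch.relu):
    """Eq. (7): an element-wise monotonically increasing activation maps
    interval endpoints to interval endpoints."""
    return sigma(z_l), sigma(z_u)
```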

Given a fixed NN, IBP gives a very loose estimation of the output range of f(x). However, during training, since the weights of the NN can be updated, we can equivalently view IBP as training an augmented neural network, referred to as IBP-NN (Figure 1). Unlike a usual network which takes an input x_0 with label y, IBP-NN takes two points x_0 − ε and x_0 + ε as input (where x_0 − ε ≤ x ≤ x_0 + ε, element-wise). The bound propagation process can be equivalently seen as forward propagation in a specially structured neural network, as shown in Figure 1. After the last specification layer C (typically merged into W^(L)), we can obtain m̲(x_0, ε). Then, −m̲(x_0, ε) is sent to a softmax layer for prediction. Importantly, since [m̲(x_0, ε)]_y = 0 (the y-th row of C is always 0), the top-1 prediction of the augmented IBP network is y if and only if all other elements of m̲(x_0, ε) are positive, i.e., the original network will predict correctly for all x ∈ S(x_0, ε).

Dataset  ε (ℓ∞ norm)  IBP verified error  Convex-adv verified error
MNIST 0.1 5.83% 8.90%
0.2 7.37% 45.37%
0.3 10.68% 97.77%
0.4 16.76% 99.98%
Fashion-MNIST 0.1 23.49% 44.64%
CIFAR-10 2/255 58.75% 62.94%
8/255 73.34% 91.44%
Table 1: We train robust neural networks on 3 datasets using Pure-IBP. These models have low IBP verified errors but when they are evaluated with a typically much tighter bound (convex adversarial polytope by Wong et al. (2018)) the verified errors increase significantly, and sometimes the model becomes unverifiable. IBP is not always looser than linear relaxation based methods; it can sometimes be powerful. We use a small model in this experiment (see Appendix B). See Table 2 for full results on verified errors of our proposed method.

When we train the augmented IBP network with an ordinary cross-entropy loss and desire it to predict correctly on an input x_0, we are implicitly doing robust optimization (Eq. (2)). We attribute the success of IBP in (Gowal et al., 2018) to the power of non-linearity: instead of using linear relaxations of neural networks (Wong & Kolter, 2018) to obtain m̲, we train a non-linear network that learns to give us a good quality m̲. Additionally, we found that a network trained using IBP may not be verifiable by linear relaxation based verification methods, including CROWN (Zhang et al., 2018), convex adversarial polytope (Wong & Kolter, 2018) and Fast-Lin (Weng et al., 2018). A purely IBP trained network has low IBP verified error, but its verified error using convex adversarial polytope (Wong & Kolter, 2018) or Fast-Lin (Weng et al., 2018) can be much higher; sometimes the network becomes unverifiable using these typically tighter bounds (see Table 1). This also indicates that IBP is a non-linear mechanism different from linear relaxation based methods.

Figure 2: Verified error and the 2nd CNN layer's induced norm for models trained using (Wong et al., 2018) and CROWN-IBP. ε is increased from 0 to 0.3 in 60 epochs.

However, IBP is a very loose bound during the initial phase of training, which makes training unstable; purely using IBP frequently leads to divergence. Gowal et al. (2018) proposed to use an ε schedule where ε is gradually increased during training, and a mixture of robust cross-entropy loss with regular cross-entropy loss as the objective to stabilize training:

min_θ E_{(x,y)} [ κ L(f(x); y; θ) + (1 − κ) L(−m̲_IBP(x, ε); y; θ) ]   (8)

where κ starts at 1 and slowly decreases to 0.5.

Issues with linear relaxation based training.

Since IBP hugely outperforms linear relaxation based methods in the recent work (Gowal et al., 2018) on some datasets, we want to understand what goes wrong with linear relaxation based methods. We found that the models produced by linear relaxation methods such as (Wong & Kolter, 2018) and (Wong et al., 2018) are over-regularized, especially at a larger ε. In Figure 2 we train a small 4-layer MNIST model and increase ε from 0 to 0.3 in 60 epochs. We plot the induced norm of the 2nd CNN layer during the training process on models trained using our method and (Wong et al., 2018). We find that when ε becomes larger (roughly at ε = 0.15, epoch 30), the norm of the weight matrix trained using (Wong et al., 2018) starts to decrease, indicating that the model is forced to learn small-norm weights and thus its representation power is severely reduced. Meanwhile, the verified error also starts to ramp up. Our proposed IBP based training method, CROWN-IBP, does not have this issue; the norm of the weight matrices keeps increasing during the training process, and the verified error does not significantly increase when ε reaches 0.3.

Another issue with current linear relaxation based training or verification methods, including convex adversarial polytope and CROWN (Zhang et al., 2018), is their high computational and memory cost, and poor scalability. For the small network in Figure 2, convex adversarial polytope (with 50 random Cauchy projections) is 8 times slower and takes 4 times more memory than CROWN-IBP (without using random projections). Convex adversarial polytope scales even worse for larger networks; see Appendix E for a comparison.

3.3 The proposed algorithm: CROWN-IBP

We have reviewed IBP and linear relaxation based methods above. IBP has great representation power due to its non-linearity, but can be tricky to tune due to its very imprecise bounds at the beginning of training; on the other hand, linear relaxation based methods give tighter lower bounds which stabilize training, but they over-regularize the network and prevent us from achieving good accuracy.

We propose CROWN-IBP, a new training method for certified defense, where we optimize the following problem (θ represents the network parameters):

min_θ E_{(x,y)} [ L( −( (1 − β) m̲_IBP(x, ε) + β m̲_CROWN-IBP(x, ε) ); y; θ ) ]   (9)

where our lower bound of the margin is a combination of two bounds with different natures: IBP, and a CROWN-style bound; L is the cross-entropy loss. CROWN is a tight linear relaxation based lower bound which is more general and often tighter than convex adversarial polytope. Importantly, CROWN-IBP avoids the high computational cost of ordinary CROWN (and many other linear relaxation based methods, like convex adversarial polytope) by applying CROWN-style bound propagation to the final specifications only; intermediate layer bounds z̲^(k) and z̄^(k) are obtained by IBP. We start with β = 1 and use the tight bounds to stabilize initial training. Then we ramp 1 − β up from 0 to 1 while we increase ε from 0 to ε_target, until we reach the desired ε_target and β = 0. The network is trained using pure IBP at that point.

Benefits of CROWN-IBP.

First, we compute tight linear relaxation based bounds during the early phase of training, thus greatly improving the stability and reproducibility of IBP. Second, we do not have the over-regularization problem, as the CROWN-style bound is gradually phased out during training. Third, unlike the approach used in (Gowal et al., 2018) that mixes regular cross-entropy loss with robust cross-entropy loss (Eq. (8)) to stabilize IBP, we use a mixture of two lower bounds, which is still a valid lower bound of the margin C f(x); thus, we stay strictly within the robust optimization framework (Eq. (2)) and also obtain better empirical results. Fourth, because we apply the CROWN-style bound propagation only to the last layer, the computational cost is greatly reduced compared to other methods that purely rely on linear relaxation based bounds.

CROWN-IBP consists of IBP bound propagation in a forward pass and CROWN-style bound propagation in a backward pass. We discuss the details of CROWN-IBP below.

Forward Bound Propagation in CROWN-IBP.

In CROWN-IBP, we first obtain z̄^(k) and z̲^(k) for all layers k by applying (5), (6) and (7). Then we obtain m̲_IBP(x, ε) (assuming C is merged into W^(L)). Obtaining these bounds is similar to forward propagation in IBP-NN (Figure 1). The time complexity of IBP is comparable to two forward propagation passes of the original network.

Linear Relaxation of ReLU neurons

Given z̲^(k) and z̄^(k) computed in the previous step, we first check if some neurons are always active (z̲^(k)_r ≥ 0) or always inactive (z̄^(k)_r ≤ 0), since these neurons are effectively linear and no relaxations are needed. For the remaining unstable neurons, Zhang et al. (2018); Wong et al. (2018) give a linear relaxation for the special case of the element-wise ReLU activation function:

α_r z^(k)_r ≤ σ(z^(k)_r) ≤ (z̄^(k)_r / (z̄^(k)_r − z̲^(k)_r)) (z^(k)_r − z̲^(k)_r),  for z̲^(k)_r < 0 < z̄^(k)_r,   (10)

where 0 ≤ α_r ≤ 1; Zhang et al. (2018) propose to adaptively select α_r = 1 when z̄^(k)_r ≥ |z̲^(k)_r| and 0 otherwise, which minimizes the relaxation error. In other words, for an input vector z^(k), we effectively replace the ReLU layer with a linear layer, giving upper or lower bounds of the output:

D̲^(k) z^(k) ≤ σ(z^(k)) ≤ D̄^(k) z^(k) + c̄^(k)   (11)

where D̲^(k) and D̄^(k) are two diagonal matrices representing the "weights" of the relaxed ReLU layer, and c̄^(k) is a bias term. In the following we focus on conceptually presenting the algorithm, while details of each term can be found in the Appendix.

Backward Bound Propagation in CROWN-IBP.

Unlike IBP, CROWN-style bounds start computation from the last layer, so we refer to it as backward bound propagation (not to be confused with the back-propagation algorithm used to obtain gradients). Suppose we want to obtain the lower bound m̲(x, ε) of C f(x) (we assume the specification matrix C has been merged into W^(L)). The input to layer W^(L) is σ(z^(L−1)), which has already been replaced by Eq. (11). CROWN-style bounds choose the lower bound of σ(z^(L−1)_r) (LHS of (11)) when the corresponding element of W^(L) is positive, and choose the upper bound otherwise. We then merge W^(L) and the linearized ReLU layer together and define:

A^(L−1) = W^(L) D^(L−1).   (12)

Now we have a lower bound A^(L−1) z^(L−1) + b̲^(L−1) ≤ C f(x), where b̲^(L−1) collects all terms not related to z^(L−1). Note that the diagonal matrix D^(L−1) implicitly depends on the signs of the entries of W^(L). Then, we merge A^(L−1) with the next linear layer, which is straightforward by plugging in z^(L−1) = W^(L−1) σ(z^(L−2)) + b^(L−1):

A^(L−1) W^(L−1) σ(z^(L−2)) + A^(L−1) b^(L−1) + b̲^(L−1) ≤ C f(x).

Then we continue to unfold the next ReLU layer σ(z^(L−2)) using its linear relaxations, and compute a new matrix A^(L−2) = A^(L−1) W^(L−1) D^(L−2), in a similar manner as in (12). Along with this bound propagation process, we need to compute a series of matrices A^(L−1), …, A^(0), where A^(k) = A^(k+1) W^(k+1) D^(k) for k ∈ {1, …, L−2} and A^(0) = A^(1) W^(1). At this point, we have merged all layers of the network into one linear layer: A^(0) x + b̲ ≤ C f(x), where b̲ collects all terms not related to x. A lower bound for C f(x) with x ∈ S(x_0, ε) = {x : ‖x − x_0‖∞ ≤ ε} can then be easily given as:

m̲(x_0, ε) = −ε ‖A^(0)‖_{1,row} + A^(0) x_0 + b̲   (13)

where ‖A^(0)‖_{1,row} takes the ℓ1 norm of each row of A^(0). For ReLU networks, convex adversarial polytope (Wong & Kolter, 2018) uses a very similar bound propagation procedure. CROWN-style bounds allow an adaptive selection of α_r in (10), thus often give slightly better bounds. We give details on each term in Appendix F.
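Putting the forward IBP pass and the backward CROWN-style pass together, a compact sketch for a fully connected ReLU network is shown below. This is our own illustrative implementation of the procedure above (single example, no batching), not the authors' released code.

```python
import torch

def crown_ibp_margin_lb(weights, biases, x0, eps, C):
    """Sketch of CROWN-IBP for f(x) = W_L relu(...relu(W_1 x + b_1)...) + b_L.
    weights/biases are lists of tensors; C is the specification matrix of
    Eq. (3). Returns m(x0, eps), a lower bound on C f(x) over the eps-ball."""
    # Forward pass (IBP): pre-activation bounds for every hidden layer.
    l, u = x0 - eps, x0 + eps
    pre = []
    for W, b in zip(weights[:-1], biases[:-1]):
        mid, rad = (u + l) / 2, (u - l) / 2                   # Eqs. (5)-(6)
        zl = W @ mid - W.abs() @ rad + b
        zu = W @ mid + W.abs() @ rad + b
        pre.append((zl, zu))
        l, u = zl.clamp(min=0), zu.clamp(min=0)               # Eq. (7), ReLU

    # Backward pass (CROWN-style): specification merged into the last layer.
    A, b_acc = C @ weights[-1], C @ biases[-1]
    for W, b, (zl, zu) in zip(weights[-2::-1], biases[-2::-1], pre[::-1]):
        unstable = (zl < 0) & (zu > 0)
        denom = (zu - zl).clamp(min=1e-12)
        # Upper relaxation: relu(z) <= d_up * z + c_up (Eqs. (16)-(17)).
        d_up = torch.where(unstable, zu / denom, (zl >= 0).float())
        c_up = torch.where(unstable, -zl * zu / denom, torch.zeros_like(zl))
        # Lower relaxation: relu(z) >= alpha * z, adaptive alpha (Eq. (15)).
        d_low = torch.where(unstable, (zu >= -zl).float(), (zl >= 0).float())
        pos, neg = A.clamp(min=0), A.clamp(max=0)
        b_acc = b_acc + neg @ c_up          # bias from the upper relaxation
        A = pos * d_low + neg * d_up        # Eq. (12): merge diagonal D
        b_acc = b_acc + A @ b               # absorb this layer's bias
        A = A @ W                           # merge with the affine layer
    # Concretize over the l_inf ball around x0 (Eq. (13)).
    return A @ x0 - eps * A.abs().sum(dim=1) + b_acc
```

Since every step is differentiable, gradients flow through both passes, so objective (9) can be minimized directly with a standard optimizer.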

Computational Cost of CROWN-IBP.

In ordinary CROWN (Zhang et al., 2018) and convex adversarial polytope (Wong & Kolter, 2018), the bound propagation process is much more expensive than in CROWN-IBP, since they need to use (13) to compute the bounds z̲^(m) and z̄^(m) of every intermediate layer (m ∈ {1, …, L−1}), by considering each z^(m) as the final layer of the network. In this case, for each layer m we need a different set of A matrices, denoted A^(m,k) for k ∈ {0, …, m−1}. This causes three computational issues:

  1. Unlike the last layer output C f(x), which has only n_L elements, an intermediate layer z^(m) typically has a much larger output dimension n_m ≫ n_L, thus all A^(m,k) have large dimensions n_m × n_k.

  2. Computation of all A^(m,k) matrices is expensive. Suppose the network has n neurons in all intermediate and input layers and n_L neurons in the output layer (assuming n ≫ n_L), the time complexity of ordinary CROWN or convex adversarial polytope is O(L²n³). An ordinary forward propagation only takes O(Ln²) time per example, thus ordinary CROWN does not scale up to large networks for training, due to its quadratic dependency on L and an extra factor of Ln overhead.

  3. When both A^(m,k) and W^(k) represent convolutional layers with small kernel tensors, there are no efficient GPU operations to form the matrix A^(m,k) W^(k) from the two kernels. Existing implementations either unfold at least one of the convolutional kernels into fully connected weights, or use sparse matrices to represent A^(m,k) and W^(k); either way, they suffer from poor hardware efficiency on GPUs.

In CROWN-IBP, we do not have the first and second issues since we use IBP to obtain intermediate layer bounds, which is only slower than forward propagation by a constant factor. The time complexity of the backward bound propagation in CROWN-IBP is O(L n_L n²), only n_L times slower than forward propagation and significantly more scalable than ordinary CROWN (which is Ln times slower than forward propagation, where typically n ≫ n_L). The third issue is also not a concern, since we start from the last specification layer, which is a small fully connected layer. Suppose we need to compute A^(k) = A^(k+1) W^(k+1) and W^(k+1) is a convolutional layer with kernel K; we can efficiently compute A^(k) on GPUs using the transposed convolution operator with the same kernel K. Conceptually, the backward pass of CROWN-IBP propagates a small specification matrix backwards, replacing affine layers with their transposed operators, and activation function layers with a diagonal matrix product. This allows efficient implementation and better scalability.
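As an illustration of the transposed-convolution trick, a batch of A matrices with a folded specification dimension can be pushed through a torch.nn.Conv2d in one call. This is a sketch; the helper name is ours, and for stride > 1 an output_padding adjustment may be needed to recover the exact input shape.

```python
import torch.nn.functional as F

def backward_through_conv(A, conv):
    """Propagate bound matrices A of shape (batch, spec, C_out, H, W)
    backwards through a torch.nn.Conv2d by one transposed convolution
    with the same kernel, as described above."""
    b, s = A.shape[:2]
    A = A.reshape(b * s, *A.shape[2:])      # fold the spec dim into batch
    A = F.conv_transpose2d(A, conv.weight, stride=conv.stride,
                           padding=conv.padding)
    return A.reshape(b, s, *A.shape[1:])
```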

4 Experiments

4.1 Setup and Models

To discourage hand-tuning on a small set of models, we use 20 different model architectures for a total of 53 models on the MNIST, Fashion-MNIST and CIFAR-10 datasets, ranging from small to wide CNN models. The models are a mixture of networks with different depths, widths and convolution kernel sizes. Details are presented in Appendix B. We consider ℓ∞ robustness and report both the best and worst verified errors achieved over all models, as well as the median verified error. Verified errors are evaluated using IBP on the test set. The median error implies that at least half of the models trained are at least as good as this error. We list hyperparameters in Appendix A. Since we focus on improving the stability and performance of IBP models, in our experiments we compare three different variants of IBP (our implementations of all IBP variants used in this paper are available at https://github.com/huanzhang12/CROWN-IBP):

  • CROWN-IBP: our proposed method, using CROWN-style linear relaxations in a backward manner to improve IBP training stability, and a mixture of CROWN and IBP lower bounds in robust cross-entropy (CE) loss, as in Eq. (9). No regular CE loss is used.

  • Pure-IBP: using the lower bounds purely from IBP in the robust CE loss. No regular CE loss is used. Equivalent to Eq. (9) with 1 − β fixed to 1 (i.e., β = 0).

  • Natural-IBP-κ: proposed by (Gowal et al., 2018), using the lower bounds provided by IBP in the robust CE loss (multiplied by 1 − κ), plus κ times a regular CE loss term, as in Eq. (8). κ is initialized as 1 and gradually reduced during training. In our experiments we choose two final κ values, 0.5 and 0. Note that Gowal et al. (2018) use 0.5 as the final value.

Each row lists verified test error (best / median / worst) | standard test error (best / median / worst), in %.

MNIST, ε = 0.1 [1], 10 small models:
  Pure-IBP: 4.79 / 5.74 / 7.32 | 1.48 / 1.59 / 2.5
  Natural-IBP (final κ = 0.5): 4.87 / 5.72 / 7.24 | 1.51 / 1.34 / 2.46
  Natural-IBP (final κ = 0): 5.24 / 5.95 / 7.36 | 1.41 / 1.88 / 1.87
  CROWN-IBP: 4.21 [1] / 5.18 / 6.8 | 1.41 / 1.83 / 2.58
MNIST, ε = 0.1, 8 large models:
  Pure-IBP: 5.9 / 6.25 / 7.82 | 1.14 / 1.12 / 1.23
  Natural-IBP (final κ = 0.5): 5.77 / 6.3 / 7.5 | 1.21 / 1.13 / 1.34
  Natural-IBP (final κ = 0): 6.05 / 6.4 / 7.7 | 1.19 / 1.33 / 1.24
  CROWN-IBP: 5.22 [1] / 5.63 / 6.34 | 1.19 / 1.05 / 1.03
  Literature (verified / standard): Gowal et al. (2018) [2]; Xiao et al. (2019b); Wong et al. (2018) 3.67 / 1.08; Dvijotham et al. (2018b) 4.44 / 1.2; Mirman et al. (2018) 3.4 / 1.0

MNIST, ε = 0.2 [1], 10 small models:
  Pure-IBP: 6.9 / 8.24 / 12.67 | 1.93 / 2.76 / 4.14
  Natural-IBP (final κ = 0.5): 6.84 / 8.16 / 12.92 | 2.01 / 2.56 / 3.93
  Natural-IBP (final κ = 0): 7.31 / 8.71 / 13.54 | 1.62 / 2.36 / 3.22
  CROWN-IBP: 6.11 [1] / 7.29 / 11.97 | 1.93 / 2.3 / 3.86
MNIST, ε = 0.2, 8 large models:
  Pure-IBP: 7.56 / 8.6 / 9.8 | 1.96 / 2.19 / 1.39
  Natural-IBP (final κ = 0.5): 8.26 / 8.72 / 9.84 | 1.45 / 1.73 / 1.31
  Natural-IBP (final κ = 0): 8.42 / 8.9 / 10.09 | 1.76 / 1.42 / 1.53
  CROWN-IBP: 6.06 [1] / 6.42 / 7.64 | 1.09 / 1.33 / 1.36
  Literature (verified / standard): Gowal et al. (2018) [2]; Xiao et al. (2019b)

MNIST, ε = 0.3, 10 small models:
  Pure-IBP: 10.54 / 12.02 / 20.47 | 2.78 / 3.31 / 6.07
  Natural-IBP (final κ = 0.5): 9.96 / 12.09 / 21.0 | 2.7 / 3.48 / 6.68
  Natural-IBP (final κ = 0): 10.37 / 12.78 / 21.99 | 2.11 / 3.44 / 5.19
  CROWN-IBP: 8.87 / 11.29 / 16.83 | 2.43 / 3.62 / 7.26
MNIST, ε = 0.3, 8 large models:
  Pure-IBP: 10.43 / 10.83 / 11.99 | 2.01 / 2.38 / 3.29
  Natural-IBP (final κ = 0.5): 10.74 / 11.73 / 12.16 | 2.17 / 2.46 / 1.6
  Natural-IBP (final κ = 0): 11.23 / 11.71 / 12.4 | 1.72 / 2.09 / 1.63
  CROWN-IBP: 7.46 / 8.47 / 8.57 | 1.48 / 1.52 / 1.99
  Literature (verified / standard): Gowal et al. (2018) 8.05 [2] / 1.66; Wong et al. (2018) 43.1 / 14.87; Xiao et al. (2019b) 19.32 / 2.67

MNIST, ε = 0.4, 10 small models:
  Pure-IBP: 16.72 / 18.89 / 37.42 | 4.2 / 5.4 / 9.63
  Natural-IBP (final κ = 0.5): 16.1 / 18.75 / 35.3 | 3.8 / 4.93 / 11.32
  Natural-IBP (final κ = 0): 16.54 / 19.14 / 35.42 | 3.4 / 3.65 / 7.54
  CROWN-IBP: 15.38 / 18.57 / 24.56 | 3.61 / 4.83 / 8.46
MNIST, ε = 0.4, 8 large models:
  Pure-IBP: 15.17 / 16.54 / 18.98 | 2.83 / 3.79 / 4.91
  Natural-IBP (final κ = 0.5): 15.63 / 16.06 / 17.11 | 2.93 / 3.4 / 3.75
  Natural-IBP (final κ = 0): 15.74 / 16.42 / 17.98 | 2.35 / 2.31 / 3.15
  CROWN-IBP: 12.96 / 13.43 / 14.25 | 2.76 / 2.85 / 3.36
  Literature (verified / standard): Gowal et al. (2018) 14.88 / 1.66

Fashion-MNIST, ε = 0.1, 10 small models:
  Pure-IBP: 22.91 / 23.92 / 25.70 | 16.14 / 16.23 / 18.66
  Natural-IBP (final κ = 0.5): 22.34 / 23.72 / 25.43 | 15.10 / 16.22 / 17.65
  Natural-IBP (final κ = 0): 23.44 / 24.62 / 26.21 | 13.20 / 14.58 / 15.02
  CROWN-IBP: 22.00 / 23.43 / 25.87 | 15.11 / 16.44 / 18.08
Fashion-MNIST, ε = 0.1, 8 large models:
  Pure-IBP: 22.37 / 23.01 / 23.73 | 15.57 / 15.43 / 15.25
  Natural-IBP (final κ = 0.5): 22.07 / 23.74 / 25.56 | 14.82 / 13.95 / 15.31
  Natural-IBP (final κ = 0): 23.87 / 25.47 / 27.19 | 12.72 / 13.62 / 13.07
  CROWN-IBP: 21.27 / 21.71 / 22.43 | 14.19 / 14.70 / 15.41
  Literature (verified / standard): Wong & Kolter (2018) 34.53 / 21.73

CIFAR-10, ε = 2/255, 9 small models:
  Pure-IBP: 54.69 / 57.84 / 60.58 | 40.59 / 45.51 / 51.38
  Natural-IBP (final κ = 0.5): 54.56 / 58.42 / 60.69 | 40.32 / 47.42 / 50.73
  Natural-IBP (final κ = 0): 56.89 / 60.66 / 63.58 | 34.28 / 39.28 / 48.03
  CROWN-IBP: 52.46 / 57.55 / 60.67 | 39.11 / 46.76 / 50.86
CIFAR-10, ε = 2/255, 8 large models:
  Pure-IBP: 55.47 / 56.41 / 58.54 | 41.59 / 44.33 / 46.54
  Natural-IBP (final κ = 0.5): 55.51 / 56.74 / 57.85 | 42.41 / 43.71 / 44.74
  Natural-IBP (final κ = 0): 57.05 / 59.7 / 60.25 | 34.77 / 35.8 / 38.95
  CROWN-IBP: 52.52 / 53.9 / 56.05 | 39.34 / 40.07 / 43.57
  Literature (verified / standard): Gowal et al. (2018) 49.98 / 29.84; Wong et al. (2018) 46.11 / 31.72; Xiao et al. (2019b) 54.07 / 38.88; Mirman et al. (2018) 47.8 / 38.0

CIFAR-10, ε = 8/255, 9 small models:
  Pure-IBP: 72.07 / 73.34 / 73.88 | 61.11 / 61.01 / 64.0
  Natural-IBP (final κ = 0.5): 72.42 / 72.57 / 73.49 | 62.26 / 60.98 / 63.5
  Natural-IBP (final κ = 0): 73.88 / 75.16 / 76.93 | 55.66 / 52.53 / 53.79
  CROWN-IBP: 71.28 / 72.15 / 73.66 | 59.07 / 59.18 / 63.17
CIFAR-10, ε = 8/255, 8 large models:
  Pure-IBP: 72.75 / 73.23 / 73.82 | 59.23 / 65.96 / 66.35
  Natural-IBP (final κ = 0.5): 72.18 / 72.83 / 74.38 | 62.54 / 59.6 / 61.99
  Natural-IBP (final κ = 0): 74.84 / 75.59 / 97.93 | 51.71 / 54.41 / 54.12
  CROWN-IBP: 70.79 / 71.17 / 72.29 | 57.82 / 58.68 / 59.73
  Literature (verified / standard): Gowal et al. (2018) 67.96 [3] / 50.51; Xiao et al. (2019b) 79.73 / 59.55; Wong et al. (2018) 78.22 / 71.33; Dvijotham et al. (2018b) 73.33 / 51.36

  [1] For small ε, IBP based methods tend to overfit on the training set, thus we observe suboptimal verified errors in this table, as we do not fine-tune hyperparameters for each individual setting. By adding explicit regularization, CROWN-IBP can achieve 3.60% verified error at ε = 0.1 and 5.48% at ε = 0.2 (see Section 4.4). Additionally, early stopping can also help achieve a better verified error; with early stopping, the best verified errors achieved by CROWN-IBP are 3.55% at ε = 0.1 and 4.98% at ε = 0.2. See Figure 3.

  [2] These models reported in Gowal et al. (2018) are trained at a larger ε than the ε used for evaluation; we always use the same ε for training and evaluation. Gowal et al. (2018) trained models with ε = 0.2 and evaluated them with ε = 0.1; they also trained models with ε = 0.4 and evaluated them with ε = 0.2 and ε = 0.3.

  [3] The practically reproducible verified error is about 71% - 72%, matching our reported numbers for Natural-IBP. See https://github.com/deepmind/interval-bound-propagation/issues/1#issuecomment-492552237

Table 2: The verified and standard (clean) test errors for models trained on MNIST, Fashion-MNIST and CIFAR-10 using different methods. We train 53 models using CROWN-IBP, Pure-IBP and Natural-IBP (with final κ = 0 or 0.5). For each scenario, we pick 3 representative models among all models: the models with the smallest, median, and largest verified errors. We also report the standard errors of these three selected models.

4.2 Comparisons to IBP based methods

In Table 2 we show the verified errors on the test sets for CROWN-IBP and the other two IBP baselines. We also include the best verified errors reported in the literature for comparison. Numbers reported in Gowal et al. (2018) use the same training method as Natural-IBP with final κ = 0.5, albeit with different hyperparameters and sometimes a different ε for training than for evaluation; we always use the same ε value for training and evaluation. CROWN-IBP's best, median, and worst test verified errors are consistently better than all other IBP-based baselines across all models and ε's. Especially, on MNIST with ε = 0.3 and ε = 0.4 we achieve 7.46% and 12.96% best verified error, respectively, outperforming all previous works and significantly better than convex relaxation based training methods (Wong et al., 2018); a similar level of advantage can also be observed on Fashion-MNIST (Wong & Kolter, 2018). For small ε on MNIST, we find that IBP based methods tend to overfit. For example, adding a regularization term can decrease the verified error at ε = 0.1 from 5.63% to 3.60% (see Section 4.4 for more details); this explains the performance gap in Table 2 at small ε between CROWN-IBP and convex adversarial polytope, since the latter method provides implicit regularization. On CIFAR-10 with ε = 8/255, CROWN-IBP is better than all other methods except (Gowal et al., 2018); however, Gowal et al. (2018) obtained their best result by using a large network trained for 3200 epochs with a fine-tuned κ schedule on 32 TPUs; practically, the reproducible verified error for Gowal et al. (2018) is around 71% - 72% (see the notes under the table). In contrast, our results can be obtained in reasonable time using a single RTX 2080 Ti GPU. We include training time comparisons in Appendix E.

4.3 Training stability

To evaluate training stability, we compare the verified errors obtained by training processes under different ε schedule lengths (10, 20, 30, 60 epochs). We compare the best, worst and median verified errors over all 18 models for MNIST. Our results are presented in Figure 3 (for 8 large models) and Figure 4 (for 10 small models) at different ε. The upper and lower ends of an error bar are the worst and best verified errors, respectively, and the lines go through the median values. We can see that both Natural-IBP and CROWN-IBP improve training stability when the schedule length is not sufficient (10, 20 epochs). When the schedule length is 30 or above, CROWN-IBP's verified errors are consistently better than any other method's. Pure-IBP cannot stably converge on all models when the schedule is short, especially for a larger ε. We conduct additional training stability experiments on the CIFAR-10 dataset and the observations are similar (see Appendix D).

Another interesting observation is that at a small ε, a shorter schedule improves results for large models (Figure 3). This is due to early stopping, which controls overfitting (see Section 4.4). Since we decrease the learning rate by half every 10 epochs after the ε schedule ends, a shorter schedule implies that the learning process stops earlier.

To further test the training stability of CROWN-IBP, we run each MNIST experiment (in Table 2) 5 times on the 10 small models. The mean and standard deviation of the verified and standard errors on the test set are presented in Appendix C. The standard deviations of the verified errors are very small, giving us further evidence of good stability.

Figure 3: Verified error vs. ε schedule length (10, 20, 30, 60) on 8 large MNIST models, with one panel per ε (best errors shown). The upper and lower ends of a vertical bar represent the worst and best verified errors, respectively. The dotted lines go through median values. For a small ε, using a shorter schedule length improves verified error due to early stopping, which prevents overfitting. All best verified errors are achieved by CROWN-IBP, regardless of schedule length.
Figure 4: Verified error vs. ε schedule length (10, 20, 30, 60) on 10 small MNIST models, with one panel per ε (best errors shown). The upper and lower ends of a vertical bar represent the worst and best verified errors, respectively. All best verified errors are achieved by CROWN-IBP, regardless of schedule length. The dotted lines go through median values.

4.4 Overfitting issue with small ε

We found that on MNIST, for a small ε, the verified errors obtained by IBP based methods are not as good as those of linear relaxation based methods (Wong et al., 2018; Mirman et al., 2018). Gowal et al. (2018) thus propose to train models using a larger ε and evaluate them under a smaller ε (for example, training with ε = 0.4 and evaluating with ε = 0.3). Instead, we investigated this issue further and found that many CROWN-IBP trained models achieve very small verified errors (close to 0 and sometimes exactly 0) on the training set (see Table 3). This indicates possible overfitting during training. As we discussed in Section 3.2, linear relaxation based methods implicitly regularize the weight matrices, so the network does not overfit when ε is small. Inspired by this finding, we want to see if adding an explicit regularization term in CROWN-IBP training helps when ε = 0.1 or ε = 0.2. The verified and standard errors on the training and test sets, with and without regularization, can be found in Table 3. We can see that with a small regularization added, we can reduce the verified error on the test set significantly. This makes CROWN-IBP results comparable to the numbers reported for convex adversarial polytope (Wong et al., 2018); at ε = 0.1, the best model using convex adversarial polytope training achieves 3.67% certified error, while CROWN-IBP achieves a best certified error of 3.60% on the models presented in Table 3. The overfitting is likely caused by IBP's strong representation power, which also explains why IBP based methods significantly outperform linear relaxation based methods at larger ε values. Using early stopping can also improve the verified error on the test set; see Section 4.3.

ε | Model Name (see Appendix B) | regularization | Training standard / verified error | Test standard / verified error
0.1 | P | λ = 0 | 0.01% / 0.01% | 1.05% / 5.63%
0.1 | P | λ > 0 | 0.32% / 0.98% | 1.30% / 3.60%
0.1 | O | λ = 0 | 0.02% / 0.05% | 0.82% / 6.02%
0.1 | O | λ > 0 | 0.38% / 1.34% | 1.43% / 4.02%
0.2 | P | λ = 0 | 0.35% / 1.40% | 1.09% / 6.06%
0.2 | P | λ > 0 | 1.02% / 3.73% | 1.48% / 5.48%
0.2 | O | λ = 0 | 0.31% / 1.54% | 1.22% / 6.64%
0.2 | O | λ > 0 | 1.09% / 4.08% | 1.69% / 5.72%
Table 3: Regularized (λ > 0) and unregularized (λ = 0) models' standard and verified errors on the training and test sets. At a small ε, CROWN-IBP may overfit, and adding regularization helps robust generalization; on the other hand, convex relaxation based methods (Wong et al., 2018) provide implicit regularization, which helps generalization under small ε but deteriorates model performance at larger ε.

5 Conclusions

In this paper, we propose a new certified defense method, CROWN-IBP, combining the fast interval bound propagation (IBP) bound in the forward pass and a tight linear relaxation based bound, CROWN, in the backward pass. Our method enjoys the non-linear representation power and high computational efficiency provided by IBP, while leveraging the tight CROWN bound to stabilize training, strictly within the robust optimization framework. Our experiments on a variety of model structures and three datasets show that CROWN-IBP consistently outperforms other IBP baselines and achieves state-of-the-art verified errors.


Appendix A Hyperparameters in Experiments

All our MNIST and CIFAR-10 models are trained on a single NVIDIA RTX 2080 Ti GPU. In all our experiments, if not mentioned otherwise, we use the following hyperparameters:

  • For MNIST, we train 100 epochs with a batch size of 256. We use the Adam optimizer. The first epoch is standard training for warming up. We gradually increase ε linearly per batch during training, with a schedule length of 60 epochs. We reduce the learning rate by 50% every 10 epochs after the ε schedule ends. No data augmentation technique is used and the whole 28 × 28 images are used (normalized to the [0, 1] range).

  • For CIFAR, we train 200 epochs with a batch size of 128. We use the Adam optimizer with a learning rate of 0.001. The first 10 epochs are standard training for warming up. We gradually increase ε linearly per batch during training, with a schedule length of 120 epochs. We reduce the learning rate by 50% every 10 epochs after the ε schedule ends. We use random horizontal flips and random crops as data augmentation. The three channels are normalized with mean (0.4914, 0.4822, 0.4465) and standard deviation (0.2023, 0.1914, 0.2010); these are per-channel statistics of the training set, as used in (Gowal et al., 2018).

For all experiments, we set β = 1 when the ε schedule starts. We decrease β linearly to 0 as ε finishes its increasing schedule and reaches ε_target. We did not tune the schedule for the parameter β; it always has the same schedule length as the ε schedule. All verified error numbers are evaluated on the test set using IBP, since the networks are trained using pure IBP after ε reaches its target. We found that CROWN (Zhang et al., 2018) or Fast-Lin (Weng et al., 2018) cannot give tight verification bounds on IBP trained models (some comparison results are given in Table 1).
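Concretely, the linear ε and β ramps described above can be written as a small helper. This is a sketch with our own names; counting in optimizer steps is an assumption, since the paper ramps ε per batch.

```python
def eps_beta_schedule(step, warmup_steps, ramp_steps, eps_target):
    """Linear ramps: eps grows from 0 to eps_target over ramp_steps after
    a warmup phase; beta = 1 when the ramp starts and decays linearly to 0
    as eps reaches eps_target (pure IBP afterwards)."""
    t = min(max(step - warmup_steps, 0) / float(ramp_steps), 1.0)
    return t * eps_target, 1.0 - t
```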

Appendix B Model Structure


Name | Model Structure
A (MNIST only) | Conv 4 +2, Conv 8 +2, FC 128
B | Conv 8 +2, Conv 16 +2, FC 256
C | Conv 4 +1, Conv 8 +1, Conv 8 +4, FC 64
D | Conv 8 +1, Conv 16 +1, Conv 16 +4, FC 128
E | Conv 4 +1, Conv 8 +1, Conv 8 +4, FC 64
F | Conv 8 +1, Conv 16 +1, Conv 16 +4, FC 128
G | Conv 4 +1, Conv 4 +2, Conv 8 +1, Conv 8 +2, FC 256, FC 256
H | Conv 8 +1, Conv 8 +2, Conv 16 +1, Conv 16 +2, FC 256, FC 256
I | Conv 4 +1, Conv 4 +2, Conv 8 +1, Conv 8 +2, FC 512, FC 512
J | Conv 8 +1, Conv 8 +2, Conv 16 +1, Conv 16 +2, FC 512, FC 512
K | Conv 16 +1, Conv 16 +2, Conv 32 +1, Conv 32 +2, FC 256, FC 256
L | Conv 16 +1, Conv 16 +2, Conv 32 +1, Conv 32 +2, FC 512, FC 512
M | Conv 32 +1, Conv 32 +2, Conv 64 +1, Conv 64 +2, FC 512, FC 512
N | Conv 64 +1, Conv 64 +2, Conv 128 +1, Conv 128 +2, FC 512, FC 512
O (MNIST only) | Conv 64 +1, Conv 128 +1, Conv 128 +4, FC 512
P (MNIST only) | Conv 32 +1, Conv 64 +1, Conv 64 +4, FC 512
Q | Conv 16 +1, Conv 32 +1, Conv 32 +4, FC 512
R | Conv 32 +1, Conv 64 +1, Conv 64 +4, FC 512
S (CIFAR only) | Conv 32 +2, Conv 64 +2, FC 128
T (CIFAR only) | Conv 64 +2, Conv 128 +2, FC 256

Table 4: Model structures used in all of our experiments. We use ReLU activations for all models. To save space, we omit the last fully connected layer, as its output dimension is always 10. In the table, "Conv k +s" denotes a 2D convolutional layer with k filters and a stride of s (kernel sizes vary between models).

Table 4 gives the 20 model structures used in our paper. MNIST and Fashion-MNIST use exactly the same model structures. Most CIFAR models share the same structures as the MNIST models (unless noted in the table), except that their input dimensions are different. Model A is too small for CIFAR, thus we remove it for the CIFAR experiments. Models A - J are the "small models" reported in Table 2. Models K - T are the "large models" reported in Table 2. For the results in Table 1, we use a small model (model structure B) for all three datasets.

Appendix C Reproducibility

error model A model B model C model D model E model F model G model H model I model J
0.1 std. err. (%)
verified err. (%)
0.2 std. err. (%)
verified err. (%)
0.3 std. err. (%)
verified err. (%)
0.4 std. err. (%)
verified err. (%)
Table 5: Mean and standard deviation of different CROWN-IBP models’ verified and standard error rates on MNIST test set. The architectures of the models are presented in Table 4. We run each model 5 times to compute the mean and standard deviation.

We run CROWN-IBP on the 10 small MNIST models, 5 times each, and report the mean and standard deviation of the standard and verified errors in Table 5. We observe that the results from multiple runs are very similar, with small standard deviations, so reproducibility is not an issue for CROWN-IBP.

Appendix D Training Stability Experiments on CIFAR

Similar to our experiments in Section 4.3, we compare the verified errors obtained by CROWN-IBP, Natural-IBP and Pure-IBP under different ε schedule lengths (30, 90, 120 epochs). We present the best, worst and median verified errors over all 17 models for CIFAR-10 in Figures 5 and 6, at ε = 2/255 and ε = 8/255. The upper and lower ends of an error bar are the worst and best verified errors, respectively, and the lines go through the median values. CROWN-IBP improves training stability and consistently outperforms the other methods. Pure-IBP cannot stably converge on all models when the schedule is short, and its verified errors tend to be higher. Natural-IBP is sensitive to the κ setting; for some final κ values, many models have high verified errors (as shown in Figure 5(b)).

Figure 5: Verified error vs. ε schedule length (30, 90, 120) on 9 small CIFAR models, with one panel per ε (best errors shown). The upper and lower ends of a vertical bar represent the worst and best verified errors, respectively. The dotted lines go through median values.
Figure 6: Verified error vs. ε schedule length (30, 90, 120) on 8 large CIFAR models, with one panel per ε (best errors shown). The upper and lower ends of an error bar are the worst and best verified errors, respectively. The dotted lines go through median values.

Appendix E Training Time

In Table 6 we present the training time of CROWN-IBP, Pure-IBP and convex adversarial polytope (Wong et al., 2018) on several representative models. All experiments are measured on a single RTX 2080 Ti GPU with 11 GB RAM. We observe that CROWN-IBP is practically 2 to 7 times slower than Pure-IBP (theoretically, CROWN-IBP is up to n_L times slower than Pure-IBP); convex adversarial polytope (Wong et al., 2018), as a representative linear relaxation based method, can be hundreds of times slower than Pure-IBP, especially on deeper networks. Note that we use 50 random Cauchy projections for (Wong et al., 2018). Using random projections alone is not sufficient to scale purely linear relaxation based methods to larger datasets, thus we advocate a combination of non-linear IBP bounds with linear relaxation based methods, as in CROWN-IBP, which offers good scalability, stability and representation power. We also note that random projection based acceleration can also be applied to the backward bound propagation (CROWN-style bound) in CROWN-IBP to further speed it up.

Data MNIST CIFAR
Model Name A C G L O B D H S M
Pure-IBP (s) 245 264 290 364 1032 734 908 1048 691 1407
CROWN-IBP (s) 423 851 748 1526 7005 1473 3351 2962 1989 6689
Convex adv (Wong et al., 2018) (s) 1708 9263 12649 35518 160794 2372 12688 18691 6961 51145
Table 6: Training time of Pure-IBP, CROWN-IBP and convex adversarial polytope on different models, in seconds. For Pure-IBP and CROWN-IBP, we use a batch size of 256 for MNIST and 128 for CIFAR. For convex adversarial polytope, we use 50 random Cauchy projections, and reduce the batch size if necessary to fit into GPU memory.

Appendix F Exact Forms of the CROWN-IBP Backward Bound

CROWN (Zhang et al., 2018) is a general framework that replaces non-linear functions in a neural network with linear upper and lower hyperplanes with respect to pre-activation variables, such that the entire neural network function can be bounded by a linear upper hyperplane and a linear lower hyperplane for all x ∈ S (S is typically a norm-bounded ball, or a box region):

A̲ x + b̲ ≤ f(x) ≤ Ā x + b̄.

CROWN achieves such linear bounds by replacing non-linear functions with linear bounds and utilizing the fact that linear combinations of linear bounds are still linear; thus these linear bounds can propagate through layers. Suppose we have a non-linear vector function σ applied to an input (pre-activation) vector z; CROWN requires the following bounds in a general form:

A̲_σ z + b̲_σ ≤ σ(z) ≤ Ā_σ z + b̄_σ.

In general, the specific bounds for different σ need to be given on a case-by-case basis, depending on the characteristics of σ and the pre-activation range z̲ ≤ z ≤ z̄. In neural networks, common σ include ReLU, tanh, sigmoid, maxpool, etc. Convex adversarial polytope (Wong et al., 2018) is also a linear relaxation based technique that is closely related to CROWN, but only for ReLU layers. For ReLU, such bounds are simple, where A̲_σ and Ā_σ are diagonal matrices:

D̲ z ≤ σ(z) ≤ D̄ z + c̄   (14)

where D̲ and D̄ are two diagonal matrices:

D̲_{k,k} = 1 if z̲_k ≥ 0;  α_k (with α_k ∈ {0, 1} chosen adaptively) if z̄_k > 0 > z̲_k;  0 if z̄_k ≤ 0.   (15)

D̄_{k,k} = 1 if z̲_k ≥ 0;  z̄_k / (z̄_k − z̲_k) if z̄_k > 0 > z̲_k;  0 if z̄_k ≤ 0.   (16)

c̄_k = 0 if z̲_k ≥ 0 or z̄_k ≤ 0;  −(z̄_k z̲_k) / (z̄_k − z̲_k) if z̄_k > 0 > z̲_k.   (17)

Note that CROWN-style bounds require knowing all pre-activation bounds z̲^(l) and z̄^(l). We assume these bounds are valid for all x ∈ S. In CROWN-IBP, these bounds are obtained by interval bound propagation (IBP). With the pre-activation bounds z̲^(l) and z̄^(l) given (for all l), we rewrite the CROWN lower bound for the special case of ReLU neurons:

Theorem F.1 (CROWN Lower Bound).

For an L-layer ReLU neural network function f : R^{n_0} → R^{n_L} and x ∈ S(x_0, ε) = {x : ‖x − x_0‖∞ ≤ ε}, we have m̲(x_0, ε) ≤ C f(x), where m̲(x_0, ε) = −ε ‖A^(0)‖_{1,row} + A^(0) x_0 + b̲, with A^(0) and b̲ computed by the backward bound propagation procedure of Section 3.3 (Eqs. (12)-(13)) using the diagonal relaxation matrices (15)-(17).