1 Introduction
An adversarial example is an input crafted to deceive deep neural networks [10]. It is a potentially critical safety issue in machine-learning-based autonomous systems such as self-driving cars. How can we make machine learning provably robust against adversarial examples? One simple defense against adversarial examples is to mask gradients [9]. However, gradient masking provides only a false sense of security [1, 2, 11]. Adversarial training [4, 5, 6], which injects adversarial examples with correct labels into the training set, is a promising approach, but it is only a countermeasure against samples that can actually be collected.
Recently, several certified robust learning approaches have been proposed [7, 13], but they lack scalability: in practical time, those methods can achieve only a small degree of robustness, even on low-dimensional datasets. Lipschitz-margin training (LMT) [12] is a scalable certified defense, but it too achieves only a small degree of robustness, due to over-regularization. How can we make certified defenses more efficient?
We present LC-LMT, a lightweight Lipschitz-margin training that solves the above problem. The contributions of the proposed method are: (a) efficiency: it achieves robustness at an early epoch, and (b) robustness: it has the potential to reach higher robustness than LMT. In the evaluation, we demonstrate the benefits of the proposed method. LC-LMT achieves the required robustness more than 30 epochs earlier than LMT on MNIST, and shows more than 90% accuracy on both legitimate and adversarial inputs.
2 Preliminary
Here, we introduce the notation, definitions, and existing work needed to understand our proposal.
2.1 ε-robustness
ε-robustness is a certifiable metric representing the robustness of a neural network against adversarial examples.
Definition 1
(ε-robustness) Let B₂(x, ε) denote the ℓ2 ball of radius ε around a point x. A neural network f is called ε-robust around x if f assigns the same class to all points in B₂(x, ε).
This paper considers ε-robustness in the ℓ2 norm (p = 2), as in [12].
2.2 Lipschitz Margin Training
Lipschitz-margin training (LMT) is a scalable way to ensure ε-robustness based on the Lipschitz constant [12]. Note that we assume the last layer of f is a softmax, and g represents the logits, i.e., the output of the subnetwork before the softmax.
The network g is Lipschitz continuous if there exists a real constant L ≥ 0 such that, for all x and y,

‖g(x) − g(y)‖₂ ≤ L ‖x − y‖₂.   (1)

The Lipschitz constant L bounds the sensitivity of g: if the input changes by at most ε, the output changes by at most L ε.
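The Lipschitz constant of a deep network is hard to compute exactly, but for a fully-connected ReLU network the product of the layers' spectral norms gives an upper bound, and LMT-style methods work with bounds of this kind. As an illustrative sketch (the weight matrices and function names here are ours, not the authors'):

```python
import numpy as np

def lipschitz_upper_bound(weights):
    """Upper-bound the L2 Lipschitz constant of a fully-connected
    ReLU network by the product of the layers' spectral norms.
    ReLU is 1-Lipschitz, so composing layers multiplies the bounds."""
    bound = 1.0
    for W in weights:
        bound *= np.linalg.norm(W, ord=2)  # largest singular value
    return bound

# Toy example: a scaling by 2 followed by a rotation (spectral norm 1).
W1 = 2.0 * np.eye(2)
W2 = np.array([[0.0, -1.0], [1.0, 0.0]])
print(lipschitz_upper_bound([W1, W2]))  # 2.0
```

This bound can be loose for deep networks, but it is cheap to compute and differentiable, which is what makes Lipschitz-based training scalable.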
LMT also introduced the useful notion of a prediction margin.
Definition 2
(Prediction margin) Let x be an input image and t the true label of x. The prediction margin of x in g is computed as:

M_g(x) = g_t(x) − max_{i≠t} g_i(x).   (2)
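Concretely, the prediction margin can be computed directly from the logit vector; a minimal sketch (variable names are ours):

```python
import numpy as np

def prediction_margin(logits, true_label):
    """Gap between the true-class logit and the largest other logit.
    A positive margin means the network currently predicts the true class."""
    others = np.delete(logits, true_label)
    return logits[true_label] - np.max(others)

logits = np.array([1.0, 4.0, 2.5])
print(prediction_margin(logits, true_label=1))  # 1.5
```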
LMT enlarges the prediction margin around each input to achieve ε-robustness. The margin sufficient for ε-robustness of x is derived in [12] as:

M_g(x) ≥ √2 ε L.   (3)

Therefore, we need a margin of at least √2 ε L between the true class and the others to make g ε-robust around each x. In the training phase, LMT inflates each logit g_i(x) except the true-class logit g_t(x). The LMT algorithm is described in Algorithm 1.
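The inflation step can be sketched as follows, assuming (as in our reading of [12]) that √2 ε L is added to every non-true logit before the softmax cross-entropy loss:

```python
import numpy as np

def lmt_inflate(logits, true_label, eps, lipschitz):
    """LMT-style inflation: add sqrt(2)*eps*L to every logit except
    the true class. The training loss can then only become small once
    the true-class margin exceeds sqrt(2)*eps*L, i.e. once (3) holds."""
    inflated = logits.copy()
    bump = np.sqrt(2.0) * eps * lipschitz
    mask = np.arange(len(logits)) != true_label
    inflated[mask] += bump
    return inflated

logits = np.array([3.0, 1.0, 0.5])
print(lmt_inflate(logits, true_label=0, eps=1.0, lipschitz=1.0))
```

Training then minimizes the usual cross-entropy on the inflated logits, so the margin requirement is enforced implicitly through the loss.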
The above inflation acts like a regularizer toward obtaining the margin required for ε-robustness, and LMT achieves robustness in a scalable way through this regularized training. However, we found that it suffers from over-regularization when either ε or L is large. Because of this issue, LMT trains slowly: it consumes many epochs to reach the required margin and may underfit. In fact, LMT struggles to achieve ε-robustness even on MNIST.
3 Proposed Method
We propose low-cost Lipschitz-margin training (LC-LMT) to solve the above problem. We introduce a limited inflation that makes the network robust more efficiently.
Let s be the class whose score g_s(x) is the highest among all classes i ≠ t. If the gap g_t(x) − g_s(x) is larger than √2 ε L, then the gap between g_t(x) and every other class also satisfies (3).
Based on this idea, we apply the inflation only to g_s(x). In the training phase, this inflation has the effect of raising g_t(x) enough to make the network ε-robust around x (Algorithm 2).
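Under the same assumptions as the LMT sketch above, LC-LMT's limited inflation can be sketched as:

```python
import numpy as np

def lclmt_inflate(logits, true_label, eps, lipschitz):
    """LC-LMT-style limited inflation: bump only the highest-scoring
    non-true class s. If the gap to s exceeds sqrt(2)*eps*L after
    training, the gap to every other class does as well."""
    inflated = logits.copy()
    masked = logits.copy()
    masked[true_label] = -np.inf        # exclude the true class
    runner_up = int(np.argmax(masked))  # class s
    inflated[runner_up] += np.sqrt(2.0) * eps * lipschitz
    return inflated

logits = np.array([3.0, 1.0, 0.5])
print(lclmt_inflate(logits, true_label=0, eps=1.0, lipschitz=1.0))
```

Compared with the LMT step, only one logit is perturbed per sample, which is what keeps the regularization pressure low.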
Our proposed method LC-LMT inflates only the score g_s(x), whereas LMT inflates every logit except g_t(x). Consequently, the rank of g_t(x) can change by at most 1 under LC-LMT, but by up to (number of classes − 1) under LMT. Since the softmax activation tends to push low-ranked values toward zero, the true-class score is more likely to be suppressed under LMT than under LC-LMT. In terms of regularization, the cost of LC-LMT is therefore thought to be smaller than that of LMT; the magnitude of this cost is likely related to the sum of the inflated values and the number of inflated classes. A theoretical analysis of this discussion is left for future work.
4 Evaluation
In this section, we demonstrate the effectiveness of the proposed method LC-LMT. The experiments were designed to answer the following questions:

- Efficiency: How early does our method achieve ε-robustness?
- Robustness: How accurately does our method classify adversarial inputs?
Experimental setting. We used MNIST and SVHN [8] for the evaluations. For MNIST, we used a neural network with 4 fully-connected layers, ReLU activations, and a softmax output layer; each hidden layer has 100 units. For SVHN, we used a WideResNet [14] with 16 layers and width factor 4, following [3]. The computation of the Lipschitz constant follows LMT. To generate adversarial examples, we employed the ℓ2 Carlini-Wagner (CW) attack [2] with 100 iterations.
4.1 Efficiency
We now evaluate the efficiency of the proposed method. Here, we employed MNIST.
First, we show how early our method enlarges the margin to satisfy (3). Figure (a) shows the average enlarged margin and the required bound for each method. LC-LMT has satisfied the required bound since the 5th epoch, whereas LMT satisfied it only around the 35th epoch. Thus, our proposed method LC-LMT can satisfy (3) at a very early epoch.
Next, we show how early our method reduces the loss. Figure (b) shows the loss value at each epoch. LC-LMT shows a dramatic loss drop at a very early epoch, similar to non-robust learning, whereas LMT cannot reduce the loss in the first 30 epochs. Since the only difference between the two methods is the number of inflated logits, the total size of the inflation may be the key to efficiency.
These two results suggest that our proposed method provides efficient robust training.
4.2 Robustness
Table 1: Training and test accuracy for each setting.

                  LMT (0.2)   LMT (1.0)   LC-LMT (0.2)   LC-LMT (1.0)   non-robust
Accuracy (Train)  0.500       0.335       0.933          0.911          0.984
Accuracy (Test)   0.785       0.700       0.936          0.912          0.975
Next, we measure robustness. Figures (a) and (b) plot classification accuracy against adversarial examples along the ℓ2 norm of the perturbations. We ran LMT and LC-LMT for 100 epochs.
Figure (a) shows accuracy against adversarial examples on MNIST. Against small perturbations whose ℓ2 norm is less than 1, LC-LMT shows accuracy as high as non-robust learning. LMT (0.2), i.e., LMT at ε = 0.2, also shows good accuracy, but LMT (0.5) cannot classify correctly at all. For larger perturbations, LC-LMT's accuracy improves in proportion to ε. However, LC-LMT (2.0) and LC-LMT (10.0) do not meet their robustness demands; those demands might be too large for classifiers built on MNIST. These results suggest that LC-LMT has the potential to make machine learning more robust than LMT, and that LC-LMT still works on MNIST at larger ε.
On SVHN, Figure (b) shows that LC-LMT achieves higher accuracy than LMT while the norm of the perturbations is below 0.7, though LMT is slightly better beyond 0.7. Table 1 shows training and test accuracy for each setting; it suggests that LC-LMT stays accurate on both training and test data, whereas LMT is accurate on neither. These results demonstrate that our proposed method LC-LMT is accurate on both legitimate and adversarial inputs.
5 Conclusion
We proposed LC-LMT, a low-cost Lipschitz-margin training. Our method has the following advantages: (a) efficiency: it achieves ε-robustness at an early epoch, and (b) robustness: it has the potential to reach higher robustness than LMT. Our evaluation showed that LC-LMT achieved the required robustness in very early epochs and demonstrated more than 90% accuracy on both legitimate and adversarial inputs. In future work, we will tackle a theoretical analysis of the proposed method and the relationships between LMT and other Lipschitz-based approaches.
References
 [1] A. Athalye, N. Carlini, and D. Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In International Conference on Machine Learning, 2018.
 [2] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy, 2017.
 [3] M. Cisse, P. Bojanowski, E. Grave, Y. Dauphin, and N. Usunier. Parseval networks: Improving robustness to adversarial examples. In International Conference on Machine Learning, pages 854–863, 2017.
 [4] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015.
 [5] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial machine learning at scale. In International Conference on Learning Representations, 2017.

 [6] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
 [7] M. Mirman, T. Gehr, and M. Vechev. Differentiable abstract interpretation for provably robust neural networks. In International Conference on Machine Learning, pages 3575–3583, 2018.
 [8] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng. Reading digits in natural images with unsupervised feature learning. In NIPS workshop on deep learning and unsupervised feature learning, volume 2011, page 5, 2011.
 [9] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In IEEE Symposium on Security and Privacy, 2016.
 [10] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations, 2014.
 [11] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel. Ensemble adversarial training: Attacks and defenses. In International Conference on Learning Representations, 2018.
 [12] Y. Tsuzuku, I. Sato, and M. Sugiyama. Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks. In Advances in Neural Information Processing Systems, 2018.
 [13] E. Wong and Z. Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In International Conference on Machine Learning, pages 5283–5292, 2018.
 [14] S. Zagoruyko and N. Komodakis. Wide residual networks. In British Machine Vision Conference, pages 87.1–87.12, 2016.