# Evaluating the Robustness of Nearest Neighbor Classifiers: A Primal-Dual Perspective

We study the problem of computing the minimum adversarial perturbation of the Nearest Neighbor (NN) classifiers. Previous attempts either conduct attacks on continuous approximations of NN models or search for the perturbation by some heuristic methods. In this paper, we propose the first algorithm that is able to compute the minimum adversarial perturbation. The main idea is to formulate the problem as a list of convex quadratic programming (QP) problems that can be efficiently solved by the proposed algorithms for 1-NN models. Furthermore, we show that dual solutions for these QP problems could give us a valid lower bound of the adversarial perturbation that can be used for formal robustness verification, giving us a nice view of attack/verification for NN models. For K-NN models with larger K, we show that the same formulation can help us efficiently compute the upper and lower bounds of the minimum adversarial perturbation, which can be used for attack and verification.


## 1 Introduction

Adversarial robustness of neural networks has been extensively studied in the past few years. Given a data point, adversarial attacks are developed to construct small imperceptible input perturbations to alter the predicted label [26, 13, 5, 2, 4]. On the other hand, robustness verification algorithms are also developed to compute a “safe region” around the point such that the prediction is provably unchanged within such region [31, 29, 34, 11]. An attack algorithm can be viewed as finding an upper bound of the “minimum adversarial perturbation” while a verification algorithm finds a lower bound of this value. In fact, robustness verification is often more important than attacks, since a verifiable behavior is required for mission-critical systems. For neural network models, due to non-convexity, both attack and verification cannot reach the minimum adversarial perturbation, and there is still a huge gap between the (computable) upper and lower bounds [24].

We study the problem of evaluating the robustness of the Nearest Neighbor (NN) classifiers. As a non-continuous step function, NN classifiers are very different from neural networks, and the neural network attack and verification methods cannot be directly applied to them. Previous attempts on attacking nearest neighbor models either use some simple heuristics [25] or apply gradient-based attacks to some continuous substitute models of NN [25, 22, 10]. Unfortunately, these attacks are far from optimal and do not have any theoretical guarantee. To the best of our knowledge, there is no existing approach on computing the minimum adversarial perturbation that can change an NN classifier’s output, and there is no existing verification method that can compute a meaningful lower bound of the safe region.

In this paper, we first study the 1-NN classifier and show that finding the minimum adversarial perturbation can be formulated as a set of convex quadratic programming (QP) problems, and a solution can be computed in polynomial time [16]. This is quite different from neural networks or tree-based models, where finding the minimum perturbation has been shown to be NP-hard [15, 14]. Furthermore, our formulation provides a very clean view of attack and verification for nearest neighbor classifiers. An attacker could solve any one of the QP problems, and any feasible solution will be a successful attack; a verifier could solve the dual of these problems, and any set of dual feasible solutions will lead to a guaranteed lower bound of the minimum adversarial perturbation. Moreover, the primal minimum and dual maximum will match at the value of the minimum adversarial perturbation. We show that the QP problems can be solved efficiently by greedy coordinate ascent, and based on this primal-dual perspective, we further provide several screening rules to speed up the quadratic solvers.

When extending to K-NN models with $K > 1$, our QP formulation will have the number of constraints growing exponentially with $K$. However, we can still approximately solve the primal problems, and that will give an attack algorithm outperforming previous works. Furthermore, we propose a way to set dual feasible solutions to provide a tight lower bound of the minimum adversarial perturbation without solving any problem. This leads to an efficient K-NN verification algorithm that works for any $K$.

We conduct experiments on real datasets and have the following interesting findings:

• For 1-NN models, our proposed algorithm can efficiently compute the minimum adversarial perturbation. Our algorithm is provably optimal, achieves a much smaller perturbation, and is more efficient than previous attack methods. Also, this is the first robustness verification method for NN models.

• For K-NN models with larger $K$, computing the exact minimum adversarial perturbation is still challenging, but our formulation provides an efficient attack algorithm, which outperforms previous attack methods. More importantly, our dual problems lead to an efficient verification algorithm that computes a lower bound of the adversarial perturbation with time complexity independent of $K$. Experiments show that the bounds are reasonably tight.

• Equipped with our algorithm, we accurately compute the robust error bound of the 1-NN model on MNIST and Fashion-MNIST. We find that a simple 1-NN model can achieve better robust error than CNN on these data.

## 2 Related work

#### Adversarial robustness of neural networks.

Adversarial robustness of neural networks has been studied extensively in the past few years. To evaluate the robustness of neural networks, attack algorithms are developed to find adversarial examples that are close to the original example [5, 13, 18, 7, 8]. However, due to the non-convexity of neural networks, these attacks cannot reach the minimum perturbation so they can only provide some upper bound of robustness and cannot provide any robustness guarantee. For safety-critical applications such as real-world control systems, it is essential to have robustness guarantees such that we know the prediction is provably unchanged within a certain distance. This motivates recent studies on neural network verification which aims to compute a lower bound of the minimum adversarial perturbation [30, 31, 29, 12, 27, 34, 35]. Also, many of these robustness verification bounds can be incorporated in the training procedure to obtain “verifiable” networks [31, 32, 19].

#### Adversarial robustness of nearest neighbor classifiers.

Adversarial robustness of nearest neighbor classifiers is less studied. Unfortunately, the algorithms mentioned above designed for neural networks cannot be directly applied to NN models since NN models are discrete step functions. [28] discussed the robustness of K-NN from the theoretical perspective and showed that the robustness of K-NN can approach that of the Bayesian optimal classifier. To compute an upper bound of the minimum adversarial perturbation (or equivalently, to attack), [22] proposed to employ a differentiable substitute for attacking 1-NN models; [25] proposed some heuristic methods and a gradient-based method to attack another kind of continuous substitute of K-NN. We will show in Section 3 that they cannot obtain the minimum adversarial perturbation and in experiments that they lead to loose upper bounds. On the other hand, to the best of our knowledge, there is no existing approach for computing the minimum adversarial perturbation or its lower bound, so our work is the first to verify NN models. Finally, there is some recent work using NN models for defense, including [21, 10]. However, these methods usually combine NN with neural network models, which is out of the scope of this paper.

## 3 Background and motivation

First, we set up notations for the Nearest Neighbor (NN) classifiers. Assume there is a finite set of class labels. We use $\{(x_i, y_i)\}_{i=1}^{n}$ to denote the database, where each $x_i$ is a $d$-dimensional vector and $y_i$ is the corresponding label. A K-NN classifier $f$ maps a test instance to a predicted label. Given a test instance $z$, the classifier will first identify the $K$ nearest neighbors of $z$ based on the Euclidean distance and then predict the final label by majority voting among the labels of these neighbors.
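The classifier described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `knn_predict` and the list-of-pairs database layout are assumptions of this sketch, and ties in the vote are broken by whichever label is encountered first.

```python
from collections import Counter
import math


def knn_predict(database, z, k=1):
    """Predict the label of z by majority vote among its k nearest neighbors.

    database: list of (x, y) pairs, where x is a sequence of floats and y a label.
    """
    # Sort training instances by Euclidean distance to z and keep the k closest.
    neighbors = sorted(database, key=lambda xy: math.dist(xy[0], z))[:k]
    votes = Counter(y for _, y in neighbors)
    # most_common(1) returns the label with the highest vote count.
    return votes.most_common(1)[0][0]
```

The prediction only changes when the set of nearest neighbors changes, which is why the classifier is a piecewise-constant step function of `z`.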

Next we define the notions of adversarial robustness, attack, and verification. Given a test sample $z$, we assume without loss of generality that it is correctly classified as class-1 by the NN model. An adversarial perturbation is defined as a $\delta \in \mathbb{R}^d$ such that $f(z+\delta) \neq 1$. An attack algorithm aims to find the minimum-norm adversarial perturbation, and its norm is

$$\epsilon^* = \min_{\delta} \|\delta\| \quad \text{s.t.} \quad f(z+\delta) \neq 1. \tag{1}$$

A verification algorithm aims to find a lower bound $r$ such that

$$f(z+\delta) = 1, \quad \forall\, \|\delta\| \le r.$$

Clearly, by definition the maximum lower bound will match the minimum perturbation norm $\epsilon^*$ if we have optimal attack and verification. We will mainly focus on the $\ell_2$ norm but will briefly talk about how to extend to the $\ell_1$ and $\ell_\infty$ norms later. Also, we will focus on 1-NN first and then generalize to $K > 1$ later.

#### Failure cases of previous attack methods.

At first glance, the minimum adversarial perturbation seems to be easy to compute for the 1-NN model. For instance, [25] mentioned that the minimum adversarial perturbation has to be on the straight line connecting $z$ and one of the training instances $x_j$ belonging to a different class ($y_j \neq 1$), so a simple linear time algorithm can solve this problem. Unfortunately, this claim is not true. In Figure 1, we show that the optimal perturbation may not be on the lines connecting two points, and furthermore, only checking the line segments can find an arbitrarily bad solution. Other previous approaches try to form a continuous approximation of NN classifiers [25, 22, 10], and clearly, they cannot find the optimal perturbation.

#### Connection to Voronoi diagrams and a solution for low-dimensional cases.

In fact, the decision boundary of a 1-NN model can be captured by the Voronoi diagram (see Figure 1(c)). In the Voronoi diagram, each training instance $x_i$ forms a cell, and the decision boundary of the cell is captured by the convex boundary formed by bisections between $x_i$ and its neighbors. One can thus obtain the minimum adversarial perturbation by computing the distances from $z$ to all the cells with $y_i \neq 1$. However, to compute the distance, we need to check all the faces (each captured by one bisection hyperplane) and angles (intersections of more than one bisection hyperplane) of the cell.

For 2-dimensional space ($d = 2$), it has been shown in [3] that each cell can only have finitely many faces and angles and there exists a polynomial time algorithm for computing a Voronoi diagram. In general, for $d$-dimensional problems with $n$ points, Voronoi diagram computation requires $O(n \log n + n^{\lceil d/2 \rceil})$ time, which works for low-dimensional problems. However, the time complexity grows exponentially with the dimension $d$, so in general, it is hard to use this algorithm unless $d$ is very small.

## 4 Primal-dual quadratic programming formulation

Usually, finding the minimum adversarial perturbation is hard. Computing minimum adversarial perturbations is NP-hard for both ReLU networks and tree ensembles [15, 14]. Also, as discussed in the previous section, we can connect it to Voronoi diagram computation, but the solver will require exponential time in dimensionality. So is it NP-hard to compute the minimum adversarial perturbation for 1-NN? Surprisingly, it is not, as we will demonstrate below.

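We consider the 1-NN model. For a given instance $z$, if we want to perturb it so that $z+\delta$ is closer to $x_j$ with $y_j \neq 1$ than to all class-1 instances, then the problem of finding the minimum perturbation can be formulated as:

$$\epsilon^{(j)} = \min_{\delta} \tfrac{1}{2}\delta^T\delta \quad \text{s.t.} \quad \|z+\delta-x_j\|^2 \le \|z+\delta-x_i\|^2, \quad \forall i,\ y_i = 1. \tag{2}$$

Each constraint can be rewritten as $\delta^T(x_j - x_i) + \big(\|z-x_i\|^2 - \|z-x_j\|^2\big)/2 \ge 0$. Therefore (2) becomes

$$\epsilon^{(j)} = \min_{\delta:\, A\delta+b \ge 0} \left\{\tfrac{1}{2}\delta^T\delta\right\} := P^{(j)}(\delta), \tag{3}$$

where $A \in \mathbb{R}^{n \times d}$ and $b \in \mathbb{R}^{n}$, with $n$ the number of training instances; for each row $i$ with $y_i = 1$, $a_i = x_j - x_i$ and $b_i = \big(\|z-x_i\|^2 - \|z-x_j\|^2\big)/2$, respectively (both are $0$ otherwise). By solving the quadratic programming (QP) problem (3) for each $j$ with $y_j \neq 1$, the final minimum adversarial perturbation norm is $\epsilon^* = \sqrt{2 \min_{j: y_j \neq 1} \epsilon^{(j)}}$. It has been shown that convex quadratic programming can be solved in polynomial time [16], so our formulation leads to a polynomial time algorithm for finding $\epsilon^*$.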
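As an illustration of how the linear constraints of one subproblem can be assembled, here is a minimal sketch. It assumes rows $a_i = x_j - x_i$ and offsets $b_i = (\|z-x_i\|^2 - \|z-x_j\|^2)/2$, which is what expanding the constraint of (2) gives, and it keeps only the rows for class-1 instances. The function names and the use of plain Python lists are choices of this sketch, not of the paper.

```python
def build_constraints(z, class1_points, x_j):
    """Return (A, b) so that A @ delta + b >= 0 encodes the constraints of (2).

    Only rows for class-1 instances x_i are built:
    a_i = x_j - x_i and b_i = (||z - x_i||^2 - ||z - x_j||^2) / 2.
    """
    def sqdist(u, v):
        return sum((s - t) ** 2 for s, t in zip(u, v))

    A = [[xj_t - xi_t for xj_t, xi_t in zip(x_j, x_i)] for x_i in class1_points]
    b = [(sqdist(z, x_i) - sqdist(z, x_j)) / 2.0 for x_i in class1_points]
    return A, b


def is_feasible(A, b, delta, tol=1e-12):
    """Check A @ delta + b >= 0 row by row (up to a small tolerance)."""
    return all(
        sum(a_t * d_t for a_t, d_t in zip(a, delta)) + b_i >= -tol
        for a, b_i in zip(A, b)
    )
```

Any `delta` for which `is_feasible` returns `True` moves `z` at least as close to `x_j` as to every class-1 instance, so it is a valid (not necessarily minimal) adversarial perturbation.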

### 4.2 Dual quadratic programming problems

We also introduce the dual form of each QP, which is more efficient to solve in practice and will lead to a verification perspective of evaluating adversarial robustness. The dual problem of (3) can be written as

$$\max_{\lambda \ge 0} \left\{-\tfrac{1}{2}\lambda^T A A^T \lambda - \lambda^T b\right\} := D^{(j)}(\lambda), \tag{4}$$

where $\lambda$ are the corresponding dual variables, one per constraint. The derivation is easy, but for completeness, we include it in Appendix A.1. The primal-dual relationship connects primal and dual variables:

$$\delta = A^T\lambda.$$
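For readers without access to the appendix, the standard Lagrangian argument can be sketched as follows. This is a textbook derivation written for this section and may differ in presentation from Appendix A.1. With multipliers $\lambda \ge 0$ for the constraints $A\delta + b \ge 0$:

$$
\begin{aligned}
L(\delta, \lambda) &= \tfrac{1}{2}\delta^T\delta - \lambda^T(A\delta + b), \\
\nabla_\delta L = 0 \;&\Rightarrow\; \delta = A^T\lambda, \\
\min_{\delta} L(\delta, \lambda) &= \tfrac{1}{2}\lambda^T A A^T \lambda - \lambda^T A A^T \lambda - \lambda^T b = -\tfrac{1}{2}\lambda^T A A^T \lambda - \lambda^T b .
\end{aligned}
$$

Maximizing the last expression over $\lambda \ge 0$ gives the dual objective, and the stationarity condition is exactly the primal-dual link between $\delta$ and $\lambda$.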

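Based on weak duality, we have $D^{(j)}(\lambda) \le P^{(j)}(\delta)$ for any dual feasible solution $\lambda$ and primal feasible solution $\delta$. Furthermore, based on Slater’s condition we can easily show that strong duality holds ($P^{(j)}(\delta^*) = D^{(j)}(\lambda^*)$) if $x_j \neq x_i$ for all $i$ with $y_i = 1$. (Footnote 1: One can observe that if the condition holds ($x_j \neq x_i$), then every $\delta$ that places $z+\delta$ in a small ball around $x_j$ is feasible, which satisfies Slater’s condition and implies strong duality.) Based on strong duality, we have

$$\tfrac{1}{2}(\epsilon^*)^2 = \min_{j: y_j \neq 1}\left\{P^{(j)}(\delta^*)\right\} = \min_{j: y_j \neq 1}\left\{\max_{\lambda \ge 0} D^{(j)}(\lambda)\right\} \ge \min_{j: y_j \neq 1}\left\{D^{(j)}(\lambda^{(j)})\right\} \quad \text{with feasible } \lambda^{(j)}, \tag{5}$$

so any set of dual feasible solutions $\{\lambda^{(j)}\}_{j: y_j \neq 1}$ leads to a lower bound of the minimum adversarial perturbation. In summary, we conclude the primal-dual relationship between 1-NN attack and verification: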

• A primal feasible solution of $P^{(j)}$ for any $j$ with $y_j \neq 1$ is a successful attack and gives us an upper bound of $\epsilon^*$. Therefore, one can solve QPs with only a subset of $\{x_j : y_j \neq 1\}$; usually an $x_j$ closer to $z$ will lead to a smaller adversarial perturbation, so in practice we can sort the $x_j$ by their distance to $z$, solve the subproblems one by one, and stop at any time. This will give a valid adversarial perturbation. After solving all the subproblems, the result will be $\epsilon^*$.

• A set of dual feasible solutions $\{\lambda^{(j)}\}_{j: y_j \neq 1}$ will give a lower bound of $\epsilon^*$ according to (5). So any heuristic method for setting up a set of dual feasible solutions will give us a lower bound which can be used for robustness verification. After solving all the dual problems exactly, we will get the tightest lower bound, which is also $\epsilon^*$.

#### 1-NN verification.

Here we give an example of how to quickly set up dual variables to give a lower bound of the minimum adversarial perturbation without solving any problem. For a dual problem $D^{(j)}(\lambda)$, consider letting only one variable $\lambda_i$ be nonzero while fixing all the remaining variables to zero; the optimal closed-form solution is

$$\lambda^{(j)}_i = \max\left(0, -\frac{b_i}{\|a_i\|^2}\right), \qquad D^{(j)}\big([0,\dots,0,\lambda^{(j)}_i,0,\dots,0]\big) = \frac{\max(-b_i, 0)^2}{2\|a_i\|^2}. \tag{6}$$

Note that $b_i$ and $\|a_i\|$ can both be computed easily, so a guaranteed lower bound of $\epsilon^*$ can be computed easily:

$$\underline{\epsilon} = \min_{j: y_j \neq 1}\left(\max_{i: y_i = 1} \frac{\max\big(\|z-x_j\|^2 - \|z-x_i\|^2,\ 0\big)}{2\|x_j - x_i\|}\right) \le \epsilon^*. \tag{7}$$

This value has an interesting geometrical meaning. See Appendix A.2 for more details. In general, we can also get improved lower bounds by solving more coordinates for each subproblem.
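The closed-form bound in (7) can be evaluated directly from pairwise distances. The sketch below is an illustration under the assumption that no other-class point coincides with a class-1 point (otherwise the denominator vanishes); the function name is hypothetical.

```python
import math


def verify_1nn_lower_bound(z, class1_points, other_points):
    """Evaluate bound (7): min over x_j of max over x_i of
    max(||z - x_j||^2 - ||z - x_i||^2, 0) / (2 * ||x_j - x_i||)."""
    def sqdist(u, v):
        return sum((s - t) ** 2 for s, t in zip(u, v))

    best = math.inf
    for x_j in other_points:
        d_j = sqdist(z, x_j)
        # Best single-coordinate dual value for subproblem j, as in (6).
        inner = max(
            max(d_j - sqdist(z, x_i), 0.0) / (2.0 * math.dist(x_j, x_i))
            for x_i in class1_points
        )
        best = min(best, inner)
    return best
```

Each inner term is the distance from `z` to the bisecting hyperplane between `x_i` and `x_j` (or zero if `z` is already on the `x_j` side), so the bound costs one pass over all pairs and requires no QP solve.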

### 4.3 Solving the QP problems efficiently

Now we discuss how to efficiently solve a series of QP problems in practice. Although we can do this in polynomial time, a naive algorithm is still too slow in practice. Note that there is one QP per other-class training instance, and each QP has one constraint (and hence one dual variable) per class-1 training instance, so the cost of solving everything naively grows at least with the product of these two counts. Even a commercial quadratic programming solver takes a noticeable amount of time to solve a single QP of this size. In the following, we show how to solve all of the QP problems quickly.

First, we find that a greedy coordinate ascent algorithm can be efficiently applied to solve the dual QP problem (4). This is mainly due to the sparsity of the solution: if $\lambda^*$ is dual optimal, then a nonzero $\lambda^*_i$ means the primal constraint is active ($a_i^T\delta^* + b_i = 0$), so the optimal $z+\delta^*$ will be on the bisection of $x_i$ and $x_j$. Therefore, if $\lambda^*$ has $q$ nonzero entries, then the optimal solution lies on the intersection of $q$ bisection hyperplanes, which means the number of nonzero entries is usually small. For instance, on the MNIST dataset when we test on 100 subproblems, the average number of nonzero dual variables is only a small fraction of the dual variables per subproblem. The sparsity of the solution motivates the use of the greedy coordinate ascent algorithm. Starting from $\lambda = 0$, we maintain the gradient vector $g$ with $g = \nabla D^{(j)}(\lambda) = -AA^T\lambda - b$, and every time we pick the variable with the largest projected gradient

$$i^* = \arg\max_i \big|\big(\max(\lambda + g, 0) - \lambda\big)_i\big|$$

and then update a single variable $\lambda_{i^*}$. This is similar to the SMO method proposed for training kernel SVM [23, 6], but since there is no equality constraint we only need to pick one variable at a time. Since there are only a few nonzero $\lambda_i$s, the algorithm usually converges much quicker than standard quadratic optimization solvers.
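The greedy coordinate ascent procedure described above can be sketched as follows. This is a simplified illustration, not the authors' solver: it recomputes the full gradient on every iteration instead of updating it incrementally, it assumes every row of `A` is nonzero, and the stopping tolerance and iteration cap are arbitrary choices of this sketch.

```python
def greedy_coordinate_ascent(A, b, max_iter=1000, tol=1e-10):
    """Maximize D(lam) = -0.5 * lam^T A A^T lam - lam^T b subject to lam >= 0.

    Returns (lam, delta), where delta = A^T lam is the matching primal point.
    """
    n, d = len(A), len(A[0])
    lam = [0.0] * n
    delta = [0.0] * d  # delta = A^T lam, kept in sync with lam
    sq_norms = [sum(t * t for t in a) for a in A]
    for _ in range(max_iter):
        # Gradient of the dual objective: g_i = -(a_i . delta) - b_i.
        g = [-sum(a_t * d_t for a_t, d_t in zip(a, delta)) - b_i
             for a, b_i in zip(A, b)]
        # Projected gradient with respect to the constraint lam >= 0.
        pg = [max(l + g_i, 0.0) - l for l, g_i in zip(lam, g)]
        i = max(range(n), key=lambda t: abs(pg[t]))
        if abs(pg[i]) <= tol:
            break
        # Exact maximization over coordinate i, clipped at zero.
        new_val = max(0.0, lam[i] + g[i] / sq_norms[i])
        step = new_val - lam[i]
        lam[i] = new_val
        delta = [d_t + step * a_t for d_t, a_t in zip(delta, A[i])]
    return lam, delta
```

Because `lam` starts at zero and only the selected coordinate changes per iteration, the iterate stays sparse whenever the optimum is sparse, which is the property the text relies on.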

Second, we propose a screening rule to remove variables in each dual QP problem (4). Only a few dual variables are nonzero at the optimum, and our screening rule reduces the number of variables before solving the problem. We introduce the following lemma:

###### Lemma 1.

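For a specific quadratic problem $P^{(j)}$, the optimal dual solution has $\lambda^*_i = 0$ if

$$-\big(\|z-x_i\|^2 - \|z-x_j\|^2\big)/2 + \|x_j - x_i\|\,\|\delta^*\| < 0, \tag{8}$$

where $\delta^*$ is the optimal solution.

The proof is in Appendix A.3. Note that checking (8) does not need to solve the QP problem. To conduct the screening rule in (8), we need to have an estimation of $\|\delta^*\|$ or its upper bound. A naive upper bound $\|\delta^*\| \le \|x_j - z\|$, which holds because $\delta = x_j - z$ is always primal feasible, can be used for running the screening rule.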
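The variable screening rule (8) can be applied before any solver is called. In the sketch below, `delta_norm_upper` stands for any valid upper bound on $\|\delta^*\|$; replacing $\|\delta^*\|$ by an upper bound only makes the test more conservative, so every removed variable is still guaranteed to be zero at the optimum. The function name and return convention are assumptions of this sketch.

```python
import math


def screen_dual_variables(z, class1_points, x_j, delta_norm_upper):
    """Return the indices i whose dual variable may be nonzero under rule (8).

    delta_norm_upper: any upper bound on ||delta*|| for subproblem j.
    """
    def sqdist(u, v):
        return sum((s - t) ** 2 for s, t in zip(u, v))

    d_j = sqdist(z, x_j)
    kept = []
    for i, x_i in enumerate(class1_points):
        b_i = (sqdist(z, x_i) - d_j) / 2.0
        # Rule (8): -b_i + ||x_j - x_i|| * ||delta*|| < 0 implies lambda_i* = 0.
        if -b_i + math.dist(x_j, x_i) * delta_norm_upper < 0:
            continue
        kept.append(i)
    return kept
```

Intuitively, class-1 points that are far from `z` compared with `x_j` can never become active constraints, so they are dropped from the QP.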

With the methods mentioned above, each dual QP can be solved efficiently. However, there is one QP for every other-class instance $x_j$, and solving all of them is still expensive. Since the final solution only depends on the minimum among the solutions of all QPs, can we remove most of the irrelevant QPs?

We can use the primal-dual relationship to remove most of the QP problems before solving them. Assume we have a primal feasible solution $\bar{\delta}$; then the minimum adversarial perturbation norm satisfies $\epsilon^* \le \|\bar{\delta}\|$, so every dual problem with $D^{(j)}(\lambda) \ge \tfrac{1}{2}\bar{\delta}^T\bar{\delta}$ for some feasible $\lambda$ can be removed. For a subproblem with respect to $x_j$, based on (6) we know the subproblem can be removed if

$$\bar{\delta}^T\bar{\delta} \le \frac{\max\big(\|z-x_j\|^2 - \|z-x_i\|^2,\ 0\big)^2}{4\|x_j - x_i\|^2} \tag{9}$$

for some $i$ with $y_i = 1$, thus we can use (9) to remove some unimportant subproblems. In practice, we sort the subproblems in ascending order of $\|z - x_j\|$ and iteratively run the screening rule after solving one more subproblem. As a result, most of the subproblems can be removed without solving them, and we achieve significant speedup. Our overall algorithm is illustrated in Algorithm 1.
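The subproblem screening test can be sketched as a single predicate. It assumes the single-coordinate dual value from (6) with $a_i = x_j - x_i$ and $b_i = (\|z-x_i\|^2 - \|z-x_j\|^2)/2$, so a subproblem is skipped once that value reaches half the squared norm of the best perturbation found so far. The name `can_skip_subproblem` is hypothetical, and the sketch checks every class-1 point, whereas a practical implementation might check only a few.

```python
def can_skip_subproblem(z, class1_points, x_j, best_sq_norm):
    """Return True if subproblem j cannot beat the current best perturbation.

    best_sq_norm: squared norm of the best primal feasible perturbation so far.
    The subproblem is skipped when some single-coordinate dual value from (6)
    already reaches 0.5 * best_sq_norm.
    """
    def sqdist(u, v):
        return sum((s - t) ** 2 for s, t in zip(u, v))

    d_j = sqdist(z, x_j)
    for x_i in class1_points:
        gap = max(d_j - sqdist(z, x_i), 0.0)
        # Twice the dual value in (6): gap^2 / (4 * ||x_j - x_i||^2).
        if best_sq_norm <= gap ** 2 / (4.0 * sqdist(x_j, x_i)):
            return True
    return False
```

Visiting candidates in ascending order of their distance to `z` tends to produce a small `best_sq_norm` early, which is what allows most later subproblems to be skipped.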

### 4.4 Extending to ℓ1 and ℓ∞ norms

Sometimes people are interested in finding the minimum $\ell_1$ or $\ell_\infty$ norm adversarial perturbation (replacing the norm in (1)). Those can be solved similarly using our framework but will require linear programming instead of quadratic programming. For example, the minimum $\ell_\infty$-norm adversarial perturbation can be formulated as

$$\epsilon^{(j)} = \min_{\delta,\, v}\ v \quad \text{s.t.} \quad A\delta + b \ge 0, \quad v \ge \delta_i \ge -v \quad \forall i = 1,\dots,d.$$

A similar formulation can be done for the $\ell_1$ case. This can also be solved efficiently by linear programming solvers and the primal-dual relationship also holds.
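For concreteness, one standard way to write the $\ell_1$ case as a linear program is sketched below. The per-coordinate auxiliary variables $v_i$ are an assumption of this sketch and do not appear in the text; each $v_i$ bounds $|\delta_i|$, so the objective equals $\|\delta\|_1$ at the optimum.

$$
\epsilon^{(j)} = \min_{\delta,\, v}\ \sum_{i=1}^{d} v_i \quad \text{s.t.} \quad A\delta + b \ge 0, \quad v_i \ge \delta_i \ge -v_i \quad \forall i = 1,\dots,d.
$$

Compared with the $\ell_\infty$ program, the only change is that a separate bound is used for each coordinate and the bounds are summed rather than shared.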

### 4.5 Extending beyond 1-NN

We can extend our approach to K-NN with $K > 1$ by adding more constraints. Taking the 3-NN model and binary classification as an example, we can list all the possible combinations of $(j_1, j_2, j_3)$ with $y_{j_1} = y_{j_2} = 2$ and $y_{j_3} = 1$, and then solve the following QP problem to force $z+\delta$ to be closer to $x_{j_1}$ and $x_{j_2}$ than to all the class-1 instances except $x_{j_3}$:

$$\epsilon^{(j_1, j_2, j_3)} = \min_{\delta} \tfrac{1}{2}\delta^T\delta \quad \text{s.t.} \quad \|z+\delta-x_j\|^2 \le \|z+\delta-x_i\|^2, \quad \forall i,\ i \neq j_3,\ y_i = 1, \quad j \in \{j_1, j_2\}.$$

There will be roughly twice as many constraints as in the 1-NN case, so each QP will be more expensive to solve. For general $K$, we can write a similar formulation with correspondingly more constraints, but since the QP still has a sparse solution, greedy coordinate ascent can still solve a subproblem efficiently. Using this we can still obtain an upper and a lower bound, corresponding to attack and verification. For an upper bound (attack), we just need to heuristically choose some tuples according to the distance to $z$ and solve some QPs (more details in Appendix A.4). For a lower bound, we can use a formulation similar to (7), as below:

#### Efficient verification for K>1.

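We can apply (5) efficiently even for large $K$. Taking the $K = 3$ case as an example, let $C_{ij} = \frac{\max(\|z-x_j\|^2 - \|z-x_i\|^2,\ 0)}{2\|x_j - x_i\|}$; then with some simple derivation we can get the verification bound for the $K = 3$ case

$$\min_{(j_1, j_2, j_3):\ y_{j_1} = 2,\ y_{j_2} = 2,\ y_{j_3} = 1}\left(\max_{i \neq j_3}\big(\max(C_{i,j_1}, C_{i,j_2})\big)\right) \ge \min_{(j_1, j_2):\ y_{j_1} = 2,\ y_{j_2} = 2} \max(D_{j_1}, D_{j_2})$$

where $D_j$ is the second largest value among $C_{ij}$ for all $i$ with $y_i = 1$. Therefore, we just need to choose the second smallest of $D_j$ among all $j$ with $y_j = 2$. Note that this can be generalized to the general $K$ case, where the verification lower bound becomes:

$$\underline{\epsilon} := \left\{\operatorname*{kthmin}_{j:\ y_j = 2}\left(\operatorname*{kthmax}_{i:\ y_i = 1} C_{ij}\right)\right\} \le \epsilon^*, \qquad k = (K+1)/2, \tag{10}$$

which can be computed efficiently with time complexity independent of $K$.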
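A direct evaluation of (10) for binary classification can be sketched as follows. The sketch assumes odd $K$, at least $k = (K+1)/2$ points in each class, and $C_{ij} = \max(\|z-x_j\|^2 - \|z-x_i\|^2, 0) / (2\|x_j - x_i\|)$, which is the quantity that makes (10) reduce to (7) when $K = 1$. It uses full sorts for clarity; a selection algorithm would avoid the extra logarithmic factor. The function name is hypothetical.

```python
import math


def verify_knn_lower_bound(z, class1_points, class2_points, K):
    """Evaluate bound (10): k-th smallest over x_j (class 2) of the
    k-th largest over x_i (class 1) of C_ij, with k = (K + 1) // 2."""
    k = (K + 1) // 2

    def sqdist(u, v):
        return sum((s - t) ** 2 for s, t in zip(u, v))

    def c_value(x_i, x_j):
        gap = max(sqdist(z, x_j) - sqdist(z, x_i), 0.0)
        return gap / (2.0 * math.dist(x_j, x_i))

    per_j = []
    for x_j in class2_points:
        row = sorted((c_value(x_i, x_j) for x_i in class1_points), reverse=True)
        per_j.append(row[k - 1])   # k-th largest over class-1 points
    return sorted(per_j)[k - 1]    # k-th smallest over class-2 points
```

With `K=1` the function returns the same value as the 1-NN bound (7), which is a convenient sanity check.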

## 5 Experiments

We show main results in Section 5.1, and analyze efficiency of our algorithm in Section 5.2. All experiments are run on a cloud server with one Intel E5-2650V4 CPU and one NVIDIA V100 GPU.

### 5.1 Comparison of adversarial perturbations

We show that our formulation leads to better attack and verification algorithms. Note that our QP framework leads to the following proposed algorithms for exact computation, verification, and attack:

• Exact: computes the exact minimum adversarial perturbation for 1-NN via Algorithm 1.

• Verifier: computes a lower bound for 1-NN via (7) and for K-NN via (10).

• QP- and QP-: compute upper bounds (attack) for 1-NN via Algorithm 1 but only iterate over the top- and top- QP problems respectively.

• QP-greedy: computes an upper bound for K-NN by heuristically choosing QP subproblems (Appendix A.4).

Note that there is no existing algorithm for computing the exact minimum adversarial perturbation and no existing verification method for K-NN that can compute a lower bound. Therefore we are only able to compare with the following attack methods:

• Naive- and Naive- [1, 25]: compute upper bounds for 1-NN by moving towards a nearby other-class instance (belonging to a class different from the one of the test instance). Naive- repeats the process for times and chooses the best perturbation. For , Naive-1 moves towards a nearby size other-class cluster.

• Mean [25]: computes an upper bound for K-NN by moving towards a class mean. The target class is chosen by the class mean distance to the test instance.

• Substitute [22]: computes an upper bound for 1-NN by attacking a smoothed variant of NN.

All attack methods are tuned to have 100% attack success rates, so that the reported perturbation is strictly an upper bound of the minimum adversarial perturbation.

#### Perturbations for 1-NN.

Experiments are performed on MNIST [17] and Fashion-MNIST [33]. As Table 1 shows, Algorithm 1 (Exact) can efficiently compute the minimum adversarial perturbation. Verifier can compute a reasonable lower bound without solving any QP problems exactly. QP- and QP- are efficient and effective attack methods by iterating over only a few QP problems.

#### Perturbations with larger K.

Experiments are performed on Binary-MNIST, where label and label are used. Verifier and QP-greedy compute tight lower bounds and upper bounds respectively as shown in Table 2. More results are in Appendix A.5.

#### Comparing robust error under ℓ∞-norm perturbation.

In this experiment, we compare the robustness of the 1-NN model with the CNN model on two datasets in Table 3. The CNN model has two convolutional layers and two fully connected layers with ReLU activations. We observe that neural nets have better test error, which is known in the literature. However, if we compare the error under the same amount of attack, 1-NN outperforms neural nets on these two datasets. Furthermore, since 1-NN is easy to verify using our approach, and it is NP-hard to compute the minimum adversarial perturbation for a neural network, 1-NN can have much better verifiable robust error than neural nets. Note that we use one of the state-of-the-art verification methods in [31] for computing the verifiable robust error of neural nets. We do not claim 1-NN is a better model than neural nets. For more complex datasets such as CIFAR or ImageNet, the 1-NN method will lead to bad clean error, so it is not comparable with neural nets. However, we think this experiment suggests that for some simple data, NN models could be a better choice in terms of the robust error.

### 5.2 Efficiency of our algorithm

We have already shown in Table 1 that Algorithm 1 is efficient. It has three components: sorting, screening rules for reducing the number of QPs, and the greedy coordinate ascent solver for each QP. We leave the experiments about sorting to Appendix A.6 and discuss the other two in detail.

#### Screening.

In the MNIST case, for every test instance, we have to solve one QP problem per other-class training instance without screening. With sorting and screening, only a small fraction of these QP problems are left to solve on average. Therefore screening improves efficiency significantly. The screening parameter (the number of $x_i$s chosen for each $x_j$ in (9)) is also important for efficiency. There is a trade-off between the number of screened subproblems and the screening overhead, and this parameter controls it. We plot the trade-off on MNIST in Figure 2. This shows that a very small value is enough, and we choose 8 for our experiments.

#### Greedy coordinate ascent.

Due to the sparsity of each QP problem, the greedy coordinate ascent solver is much more efficient than other standard QP solvers. To verify this, we compare greedy coordinate ascent with SCS [20], CVXOPT and ECOS [9] for solving these QP problems. We keep everything else the same (the same screening rule and sorting technique) and only change the QP solver. Since it is difficult for standard QP solvers to deal with high dimensional problems, only a subset of the MNIST training samples is used. The results are presented in Figure 3. Greedy coordinate ascent is faster than other solvers by more than 60 times for computing 1-NN robustness.

## 6 Conclusion

In this paper, we show that computing the minimum adversarial perturbation of 1-NN models can be formulated as a series of quadratic programming problems. This framework yields the first algorithm that can compute the minimum adversarial perturbation, and we propose an efficient solver such that the computation time is comparable with and often faster than previous attack algorithms. Furthermore, our framework also motivates the first algorithm for verifying K-NN robustness from the dual aspect.

## References

• [1] Laurent Amsaleg, James Bailey, Dominique Barbe, Sarah M. Erfani, Michael E. Houle, Vinh Nguyen, and Milos Radovanovic. The vulnerability of learning to adversarial perturbation increases with intrinsic dimensionality. In IEEE Workshop on Information Forensics and Security, pages 1–6, 2017.
• [2] Anish Athalye, Nicholas Carlini, and David Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In International Conference on Machine Learning, 2018.
• [3] Franz Aurenhammer and Rolf Klein. Voronoi diagrams. Handbook of computational geometry, 5(10):201–290, 1999.
• [4] Battista Biggio and Fabio Roli. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition, 84:317–331, 2018.
• [5] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy, pages 39–57, 2017.
• [6] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2(3):27, 2011.
• [7] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In ACM Workshop on Artificial Intelligence and Security, pages 15–26, 2017.
• [8] Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang, and Cho-Jui Hsieh. Query-efficient hard-label black-box attack: An optimization-based approach. In International Conference on Learning Representations, 2019.
• [9] Alexander Domahidi, Eric Chu, and Stephen Boyd. ECOS: An SOCP solver for embedded systems. In European Control Conference, pages 3071–3076, 2013.
• [10] Abhimanyu Dubey, Laurens van der Maaten, Zeki Yalniz, Yixuan Li, and Dhruv Mahajan. Defense against adversarial images using web-scale nearest-neighbor search. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
• [11] Krishnamurthy Dvijotham, Robert Stanforth, Sven Gowal, Timothy Mann, and Pushmeet Kohli. A dual approach to scalable verification of deep networks. In Annual Conference on Uncertainty in Artificial Intelligence, pages 162–171, 2018.
• [12] Timon Gehr, Matthew Mirman, Dana Drachsler-Cohen, Petar Tsankov, Swarat Chaudhuri, and Martin Vechev. Ai2: Safety and robustness certification of neural networks with abstract interpretation. In IEEE Symposium on Security and Privacy, pages 3–18, 2018.
• [13] Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015.
• [14] Alex Kantchelian, JD Tygar, and Anthony Joseph. Evasion and hardening of tree ensemble classifiers. In International Conference on Machine Learning, pages 2387–2396, 2016.
• [15] Guy Katz, Clark Barrett, David L Dill, Kyle Julian, and Mykel J Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. In International Conference on Computer Aided Verification, pages 97–117, 2017.
• [16] Mikhail K Kozlov, Sergei P Tarasov, and Leonid G Khachiyan. The polynomial solvability of convex quadratic programming. USSR Computational Mathematics and Mathematical Physics, 20(5):223–228, 1980.
• [17] Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner, et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.

• [18] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
• [19] Matthew Mirman, Timon Gehr, and Martin Vechev. Differentiable abstract interpretation for provably robust neural networks. In International Conference on Machine Learning, pages 3575–3583, 2018.
• [20] Brendan O’Donoghue, Eric Chu, Neal Parikh, and Stephen Boyd. Conic optimization via operator splitting and homogeneous self-dual embedding. Journal of Optimization Theory and Applications, 169(3):1042–1068, 2016.
• [21] Nicolas Papernot and Patrick McDaniel. Deep k-nearest neighbors: Towards confident, interpretable and robust deep learning. CoRR, abs/1803.04765, 2018.
• [22] Nicolas Papernot, Patrick D. McDaniel, and Ian J. Goodfellow. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. CoRR, abs/1605.07277, 2016.
• [23] John Platt. Sequential minimal optimization: A fast algorithm for training support vector machines. Technical report, 1998.
• [24] Hadi Salman, Greg Yang, Huan Zhang, Cho-Jui Hsieh, and Pengchuan Zhang. A convex relaxation barrier to tight robust verification of neural networks. CoRR, abs/1902.08722, 2019.
• [25] Chawin Sitawarin and David Wagner. On the robustness of deep k-nearest neighbors. CoRR, abs/1903.08333, 2019.
• [26] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations, 2013.
• [27] Shiqi Wang, Kexin Pei, Justin Whitehouse, Junfeng Yang, and Suman Jana. Efficient formal safety analysis of neural networks. In Advances in Neural Information Processing Systems, pages 6367–6377, 2018.
• [28] Yizhen Wang, Somesh Jha, and Kamalika Chaudhuri. Analyzing the robustness of nearest neighbors to adversarial examples. In International Conference on Machine Learning, pages 5120–5129, 2018.
• [29] Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane Boning, and Inderjit Dhillon. Towards fast computation of certified robustness for relu networks. In International Conference on Machine Learning, pages 5273–5282, 2018.
• [30] Tsui-Wei Weng, Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Dong Su, Yupeng Gao, Cho-Jui Hsieh, and Luca Daniel. Evaluating the robustness of neural networks: An extreme value theory approach. In International Conference on Learning Representations, 2018.
• [31] Eric Wong and J Zico Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In International Conference on Machine Learning, 2018.
• [32] Eric Wong, Frank Schmidt, Jan Hendrik Metzen, and J Zico Kolter. Scaling provable adversarial defenses. In Advances in Neural Information Processing Systems, pages 8400–8409, 2018.
• [33] Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. CoRR, abs/1708.07747, 2017.
• [34] Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. Efficient neural network robustness certification with general activation functions. In Advances in Neural Information Processing Systems, pages 4939–4948, 2018.
• [35] Huan Zhang, Pengchuan Zhang, and Cho-Jui Hsieh. RecurJac: An efficient recursive algorithm for bounding Jacobian matrix of neural networks and its applications. In AAAI Conference on Artificial Intelligence, 2019.

## Appendix A Appendixes

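### A.1 Derivation of the dual form

Consider the primal problem in (3). The Lagrangian can be written as

$$L(\delta,\lambda) = \frac{1}{2}\delta^T\delta - \lambda^T(A\delta + b).$$

The dual problem is then

$$\max_{\lambda \ge 0} \min_{\delta} L(\delta,\lambda).$$

Taking the derivative of the Lagrangian with respect to $\delta$ and setting it to zero, we get

$$\frac{\partial L}{\partial \delta} = \delta - A^T\lambda = 0,$$

which gives us the primal-dual relationship $\delta = A^T\lambda$. Substituting this back into the dual problem, we get

$$\max_{\lambda \ge 0} -\frac{1}{2}\lambda^T A A^T \lambda - \lambda^T b.$$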
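The derivation above can be checked numerically on a small instance. The following is a minimal sketch, not the solver proposed in the paper: it assumes the primal constraint has the form $A\delta + b \ge 0$ (the sign convention implied by the Lagrangian), uses plain projected gradient ascent on the dual, and the names `solve_dual_pg` and `dual_value` as well as the toy values of `A` and `b` are hypothetical.

```python
import numpy as np

def solve_dual_pg(A, b, iters=5000):
    """Projected gradient ascent on D(lam) = -0.5 lam^T A A^T lam - lam^T b over lam >= 0."""
    Q = A @ A.T
    # Step size 1/L, where L is the largest eigenvalue of Q (Lipschitz constant of grad D).
    L = max(np.linalg.eigvalsh(Q)[-1], 1e-12)
    lam = np.zeros(A.shape[0])
    for _ in range(iters):
        grad = -Q @ lam - b                     # gradient of the dual objective
        lam = np.maximum(lam + grad / L, 0.0)   # ascent step, then projection onto lam >= 0
    return lam

def dual_value(A, b, lam):
    """Evaluate the dual objective at lam."""
    return -0.5 * lam @ (A @ A.T) @ lam - lam @ b

# Toy instance (assumed values): minimize 0.5 * ||delta||^2 subject to A delta + b >= 0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([-1.0, 0.5, -0.5])

lam = solve_dual_pg(A, b)
delta = A.T @ lam                    # primal-dual relationship delta = A^T lam
primal_value = 0.5 * delta @ delta
gap = primal_value - dual_value(A, b, lam)
```

On this toy instance only the first constraint is active, so the recovered `delta` should be close to `(1, 0)` and the duality gap close to zero. Any dual-feasible `lam` gives a value of `dual_value` that lower-bounds the primal optimum, which is the property used for verification.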

### A.2 Geometric meaning of our verification bound

Here we discuss the geometric meaning of the following verification bound for 1-NN (derived in (7)):

$$\underline{\epsilon} = \min_{j: y_j \neq 1} \left( \max_{i: y_i = 1} \frac{\max\left(\|z - x_j\|^2 - \|z - x_i\|^2,\, 0\right)}{2\|x_j - x_i\|} \right) \le \epsilon^*. \tag{11}$$

The inner value is the distance from $z$ to the bisector between $x_i$ and $x_j$, which means that if we want to perturb $z$ to make it closer to $x_j$ than to $x_i$, the perturbation must be at least as large as the inner value. Then, if we want to perturb $z$ such that its nearest neighbor is $x_j$, we need to cross all of these bisectors, so we take the $\max$ operation over all the distances to bisectors. Finally, a lower bound of $\epsilon^*$ can be computed by taking the minimum over all the $x_j$.
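The bound in (11) can be evaluated directly with vectorized distance computations. The sketch below is illustrative only: the function name `one_nn_lower_bound` and the toy data are hypothetical, the test point `z` is assumed to have true label 1, and no training point of class 1 is assumed to coincide with a point of another class (otherwise the denominator is zero).

```python
import numpy as np

def one_nn_lower_bound(z, X, y):
    """Evaluate bound (11) for a test point z whose true label is assumed to be 1."""
    pos = X[y == 1]                          # instances x_i with y_i = 1
    neg = X[y != 1]                          # instances x_j with y_j != 1
    d_pos = np.sum((pos - z) ** 2, axis=1)   # ||z - x_i||^2
    d_neg = np.sum((neg - z) ** 2, axis=1)   # ||z - x_j||^2
    # gap[j, i] = max(||z - x_j||^2 - ||z - x_i||^2, 0)
    gap = np.maximum(d_neg[:, None] - d_pos[None, :], 0.0)
    # sep[j, i] = ||x_j - x_i||
    sep = np.linalg.norm(neg[:, None, :] - pos[None, :, :], axis=2)
    dist_to_bisector = gap / (2.0 * sep)
    # max over i (every bisector must be crossed), then min over j (cheapest target).
    return dist_to_bisector.max(axis=1).min()

# Toy data (assumed values): one class-1 point and two points of the other class.
z = np.array([0.0, 0.0])
X = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 5.0]])
y = np.array([1, -1, -1])
bound = one_nn_lower_bound(z, X, y)
```

Here the bisector between `(1, 0)` and `(3, 0)` is the line $x = 2$, at distance 2 from `z`, and the other candidate is farther away, so `bound` evaluates to 2. In this example the bound happens to be tight, since moving `z` to `(2, 0)` changes its nearest neighbor.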

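### A.3 Proof of Lemma 1

###### Proof.

By definition we have $\delta^* = A^T\lambda^*$, so

$$\begin{aligned}
\nabla D^{(j)}(\lambda^*) &= -AA^T\lambda^* - b = -A\delta^* - b, \\
\nabla D^{(j)}_i(\lambda^*) &= -a_i^T\delta^* - b_i = (x_i - x_j)^T\delta^* - \frac{\|z - x_i\|^2 - \|z - x_j\|^2}{2} \\
&\le \|x_i - x_j\|\,\|\delta^*\| - \frac{\|z - x_i\|^2 - \|z - x_j\|^2}{2}.
\end{aligned}$$

Therefore, when (8) holds, by the KKT conditions of the dual problem we know $\lambda^*_i = 0$.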
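The last inequality in the proof suggests a simple screening test, sketched below under stated assumptions. The exact form of condition (8) is not reproduced here; instead the sketch assumes that some upper bound `radius` on $\|\delta^*\|$ is available and marks instance $i$ as removable whenever the right-hand side of the inequality is strictly negative, in which case the $i$-th dual gradient is negative and the corresponding dual variable is zero. The function name `screen_positive_instances` and the toy data are hypothetical.

```python
import numpy as np

def screen_positive_instances(z, x_j, X_pos, radius):
    """Return a boolean mask; True means the dual variable of that row is provably zero.

    radius must be an upper bound on ||delta*|| (an assumption of this sketch).
    """
    sep = np.linalg.norm(X_pos - x_j, axis=1)       # ||x_i - x_j||
    # (||z - x_i||^2 - ||z - x_j||^2) / 2 for every class-1 instance x_i
    half_gap = 0.5 * (np.sum((X_pos - z) ** 2, axis=1) - np.sum((x_j - z) ** 2))
    grad_upper = sep * radius - half_gap            # upper bound on the i-th dual gradient
    return grad_upper < 0.0

# Toy data (assumed values).
z = np.array([0.0, 0.0])
x_j = np.array([3.0, 0.0])
X_pos = np.array([[1.0, 0.0], [-10.0, 0.0]])
# delta = x_j - z moves z onto x_j and is therefore feasible, so ||delta*|| <= ||x_j - z||.
radius = np.linalg.norm(x_j - z)
mask = screen_positive_instances(z, x_j, X_pos, radius)
```

In this example the far-away instance `(-10, 0)` is screened out while `(1, 0)` is kept, so the subproblem for this `x_j` would only need one constraint row. A tighter `radius` screens more instances.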

### A.4 Attack for the $K>1$ case

Note that the problem is equivalent to forming where the -th row of is and for all class- instances . We can first choose to be the class- instance closest to , then try different (sorting according to the distance to ). After solving each pair of , we can try to remove one row of and which corresponds to . Note that only removing with nonzero can change the result and there are only few nonzero s, so we could simply try all of them.

A greedy and more efficient version is illustrated in Algorithm 2.

### A.5 More experimental results for K-NN verification

Since the time complexity of our verification method for $K$-NN is independent of $K$, we can efficiently compute lower bounds of the minimum adversarial perturbation for a large $K$. Experimental results on Binary-MNIST are shown in Figure 4.

The verification method can be extended to the multi-class case. A simple way is to treat the true label of the test instance as positive (label $1$) and all other labels as negative (label $-1$). It can be easily verified that the bound in (10) is still a lower bound. Experimental results on MNIST are shown in Figure 5.

### A.6 Effect of sorting in our Algorithm 1

We study whether sorting improves efficiency. All training data of MNIST are used as training instances. correctly classified test instances are sampled randomly. All components of Algorithm 1 are employed, and , i.e., positive instances are used for screening. We report the mean number of subproblems and the mean runtime of the test instances. No extra parallel mechanism is employed for test instances. As Table 4 shows, sorting reduces the number of subproblems and improves efficiency.