On the Geometry of Adversarial Examples

11/01/2018 · Marc Khoury et al. · UC Berkeley

Adversarial examples are a pervasive phenomenon of machine learning models where seemingly imperceptible perturbations to the input lead to misclassifications for otherwise statistically accurate models. We propose a geometric framework, drawing on tools from the manifold reconstruction literature, to analyze the high-dimensional geometry of adversarial examples. In particular, we highlight the importance of codimension: for low-dimensional data manifolds embedded in high-dimensional space there are many directions off the manifold in which to construct adversarial examples. Adversarial examples are a natural consequence of learning a decision boundary that classifies the low-dimensional data manifold well, but classifies points near the manifold incorrectly. Using our geometric framework we prove (1) a tradeoff between robustness under different norms, (2) that adversarial training in balls around the data is sample inefficient, and (3) sufficient sampling conditions under which nearest neighbor classifiers and ball-based adversarial training are robust.


1 Introduction

Deep learning at scale has led to breakthroughs on important problems in computer vision (Krizhevsky et al. (2012)), natural language processing (Wu et al. (2016)), and robotics (Levine et al. (2015)). Shortly thereafter, the intriguing phenomenon of adversarial examples was observed: a seemingly ubiquitous property of machine learning models in which perturbations of the input that are imperceptible to humans reliably lead to confident, incorrect classifications (Szegedy et al. (2013), Goodfellow et al. (2014)). What has ensued is a standard story from the security literature: a game of cat and mouse where defenses are proposed only to be quickly defeated by stronger attacks (Athalye et al. (2018)). This has led researchers to develop methods which are provably robust under specific attack models (Madry et al. (2018), Wong and Kolter (2018), Sinha et al. (2018), Raghunathan et al. (2018)). As machine learning proliferates into society, including security-critical settings like health care (Esteva et al. (2017)) or autonomous vehicles (Codevilla et al. (2018)), it is crucial to develop methods that allow us to understand the vulnerability of our models and design appropriate countermeasures.

In this paper, we propose a geometric framework for analyzing the phenomenon of adversarial examples. We leverage the observation that datasets encountered in practice exhibit low-dimensional structure despite being embedded in very high-dimensional input spaces. This property is colloquially referred to as the "Manifold Hypothesis": the idea that low-dimensional structure of 'real' data leads to tractable learning. We model data as being sampled from class-specific low-dimensional manifolds embedded in a high-dimensional space. We consider a threat model in which an adversary may choose any point on the data manifold and perturb it by ε in order to fool a classifier. To be robust to such an adversary, a classifier must be correct everywhere in an ε-tube around the data manifold. Observe that, even though the data manifold is a low-dimensional object, this tube has the same dimension as the entire space in which the manifold is embedded. Our analysis argues that adversarial examples are a natural consequence of learning a decision boundary that classifies all points on a low-dimensional data manifold correctly, but classifies many points near the manifold incorrectly. The high codimension, the difference between the dimension of the data manifold and the dimension of the embedding space, is a key source of the pervasiveness of adversarial examples.

Our paper makes the following contributions. First, we develop a geometric framework, inspired by the manifold reconstruction literature, that formalizes the manifold hypothesis described above and our attack model. Second, we highlight the role codimension plays in vulnerability to adversarial examples. As the codimension increases, there are an increasing number of directions off the data manifold in which to construct adversarial perturbations. Prior work has attributed vulnerability to adversarial examples to input dimension (Gilmer et al. (2018)); this is the first work that investigates the role of codimension. Interestingly, we find that different classification algorithms differ in how sensitive they are to changes in codimension. Third, we apply this framework to prove the following results: (1) we show that the choice of norm used to restrict the adversary matters, in that there is a tradeoff between robustness under different norms: we present a classification problem where improving robustness under one norm necessarily costs robustness under another; (2) we show that a common approach, training against adversarial examples drawn from ε-balls around the training set, is insufficient to learn robust decision boundaries with realistic amounts of data; and (3) we show that nearest neighbor classifiers do not suffer from this insufficiency, owing to geometric properties of their decision boundary away from the data, and thus represent a potentially robust classification algorithm. Finally, we provide experimental evidence on synthetic datasets and MNIST that supports our theoretical results.

2 Related Work

This paper approaches the problem of adversarial examples using techniques and intuition from the manifold reconstruction literature. Both fields have a great deal of prior work, so we focus on only the most related papers here.

2.1 Adversarial Examples

Some previous work has considered the relationship between adversarial examples and high-dimensional geometry. Franceschi et al. (2018) explore the robustness of classifiers to random noise in terms of distance to the decision boundary, under the assumption that the decision boundary is locally flat. Gilmer et al. (2018) experimentally evaluated the setting of two concentric, under-sampled spheres embedded in a high-dimensional space, and concluded that adversarial examples occur on the data manifold. In contrast, we present a geometric framework for proving robustness guarantees for learning algorithms that makes no assumptions about the decision boundary. We carefully sample the data manifold in order to highlight the importance of codimension; adversarial examples exist even when the manifold is perfectly classified. Additionally, we explore the importance of the spacing between the constituent data manifolds, the sampling requirements of learning algorithms, and the relationship between model complexity and robustness.

Wang et al. (2018) explore the robustness of k-nearest neighbor classifiers to adversarial examples. In the setting where the Bayes optimal classifier is uncertain about the true label of each point, they show that k-nearest neighbors is not robust when k is a small constant. They also show that k-nearest neighbors is robust when k grows sufficiently quickly with the size of the training set. Using our geometric framework we show a complementary result: in the setting where each point is certain of its label, nearest neighbors is robust to adversarial examples.

The decision and medial axes defined in Section 3 are maximum margin decision boundaries. Hard margin SVMs define a linear separator with maximum margin, the maximum distance from the training data (Cortes and Vapnik (1995)). Kernel methods allow for maximum margin decision boundaries that are non-linear by using additional features to project the data into a higher-dimensional feature space (Shawe-Taylor and Cristianini (2004)). The decision and medial axes generalize the notion of maximum margin to account for the arbitrary curvature of the data manifolds. There have been attempts to incorporate maximum margins into deep learning (Sun et al. (2016), Liu et al. (2016), Liang et al. (2017), Elsayed et al. (2018)), often by designing loss functions that encourage large margins at either the output (Sun et al. (2016)) or at any layer (Elsayed et al. (2018)). In contrast, the decision axis is defined on the input space, and we use it as an analysis tool for proving robustness guarantees.

2.2 Manifold Reconstruction

Manifold reconstruction is the problem of discovering the structure of a k-dimensional manifold embedded in ℝⁿ, given only a set of points sampled from the manifold. A large vein of research in manifold reconstruction develops algorithms that are provably good: if the points sampled from the underlying manifold are sufficiently dense, these algorithms are guaranteed to produce a geometrically accurate representation of the unknown manifold with the correct topology. The output of these algorithms is often a simplicial complex, a set of simplices such as triangles, tetrahedra, and higher-dimensional variants, that approximates the unknown manifold. In particular, these algorithms output subsets of the Delaunay triangulation, which, along with its geometric dual the Voronoi diagram, has properties that aid in proving geometric and topological guarantees (Edelsbrunner and Shah (1997)).

The field first focused on curve reconstruction (Amenta et al. (1998); Dey and Kumar (1999)). Soon after, algorithms were developed for surface reconstruction in ℝ³, both in the noise-free setting (Amenta and Bern (1999), Amenta et al. (2002)) and in the presence of noise (Dey and Goswami (2004)). We borrow heavily from the analysis tools of these early works, including the medial axis and the reach. However, we emphasize that we have adapted these tools to the learning setting. To the best of our knowledge, our work is the first to consider the medial axis under different norms.

In higher-dimensional embedding spaces, manifold reconstruction algorithms face the curse of dimensionality. In particular, the Delaunay triangulation, which forms the bedrock of algorithms in low dimensions, can have up to m^⌈n/2⌉ simplices on m vertices in ℝⁿ. To circumvent the curse of dimensionality, algorithms were proposed that compute subsets of the Delaunay triangulation restricted to the k-dimensional tangent spaces of the manifold at each sample point (Boissonnat and Ghosh (2014)). Unfortunately, progress on higher-dimensional manifolds has been limited due to the presence of so-called "sliver" simplices, poorly shaped simplices that cause inconsistencies between the local triangulations constructed in each tangent space (Cheng et al. (2005), Boissonnat and Ghosh (2014)). Techniques that provably remove sliver simplices have prohibitive sampling requirements (Cheng et al. (2000), Boissonnat and Ghosh (2014)). Even in the special case of surfaces (k = 2) embedded in high dimensions, algorithms with practical sampling requirements have only recently been proposed (Khoury and Shewchuk (2016)). Our use of tubular neighborhoods as a tool for analysis is borrowed from Dey et al. (2005) and Khoury and Shewchuk (2016).

In this paper we are interested in learning robust decision boundaries, not in reconstructing the underlying data manifolds, and so we avoid the use of Delaunay triangulations and their difficulties entirely. In Section 5 we present robustness guarantees for two learning algorithms in terms of a sampling condition on the underlying manifold. These sampling requirements scale with the dimension k of the underlying manifold, not with the dimension n of the embedding space.

3 The Geometry of Data

We model data as being sampled from a set of low-dimensional manifolds (with or without boundary) embedded in a high-dimensional space ℝⁿ. We use k to denote the dimension of a manifold M. The special case of a 1-manifold is called a curve, and a 2-manifold is a surface. The codimension of M is n − k, the difference between the dimension of the manifold and the dimension of the embedding space. The "Manifold Hypothesis" is the observation that in practice, data is often sampled from manifolds, usually of high codimension.

In this paper we are primarily interested in the classification problem. Thus we model data as being sampled from C class manifolds M_1, …, M_C, one for each class. When we wish to refer to the entire space from which a dataset is sampled, we refer to the data manifold M = M_1 ∪ ⋯ ∪ M_C. We often work with a finite sample of points X_i ⊂ M_i, and we write X = X_1 ∪ ⋯ ∪ X_C. Each sample point has an accompanying class label indicating which manifold the point is sampled from.

Consider an ℓ_p-ball B centered at some point x ∈ ℝⁿ and imagine growing B by increasing its radius starting from zero. For nearly all starting points x, the ball eventually intersects one, and only one, of the M_i's. Thus the nearest point to x on M, in the ℓ_p norm, lies on a single class manifold M_i. (Note that the nearest point on M_i need not be unique.)

The decision axis D of M is the set of points x such that the boundary of the growing ball B centered at x intersects two or more of the M_i, but the interior of B does not intersect M at all. In other words, the decision axis is the set of points that have two or more closest points, in the ℓ_p norm, on distinct class manifolds. See Figure 1. The decision axis is inspired by the medial axis, which was first proposed by Blum (1967) in the context of image analysis and subsequently modified for the purposes of curve and surface reconstruction by Amenta et al. (1998; 2002). We have modified the definition to account for multiple class manifolds and have renamed our variant in order to avoid confusion in the future.

The decision axis D can intuitively be thought of as a decision boundary that is optimal in the following sense. First, D separates the class manifolds when they do not intersect (Lemma 8). Second, each point of D is as far away from the class manifolds as possible in the ℓ_p norm. As shown in the leftmost example in Figure 1, in the case of two linearly separable circles of equal radius, the decision axis is exactly the line that separates the data with maximum margin. For arbitrary manifolds, D generalizes the notion of maximum margin to account for the arbitrary curvature of the class manifolds.
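To make the definition concrete, here is a small, hypothetical sketch (not code from the paper) that approximates the ℓ_2 decision axis of two densely sampled class manifolds by keeping the grid points that are nearly equidistant from the two classes; the circles, grid range, and tolerance are assumed values.

import numpy as np

rng = np.random.default_rng(0)

# Dense samples of two class manifolds: concentric circles in R^2 (assumed data).
t1, t2 = rng.uniform(0, 2 * np.pi, 300), rng.uniform(0, 2 * np.pi, 300)
M1 = np.stack([1.0 * np.cos(t1), 1.0 * np.sin(t1)], axis=1)
M2 = np.stack([2.0 * np.cos(t2), 2.0 * np.sin(t2)], axis=1)

def dist_to(points, manifold):
    # Euclidean distance from each query point to its nearest sample on the manifold.
    return np.min(np.linalg.norm(points[:, None, :] - manifold[None, :, :], axis=2), axis=1)

# Keep grid points that are (numerically) equidistant from both classes:
# an approximation of the l2 decision axis.
xs = np.linspace(-2.5, 2.5, 101)
xx, yy = np.meshgrid(xs, xs)
grid = np.stack([xx.ravel(), yy.ravel()], axis=1)
mask = np.abs(dist_to(grid, M1) - dist_to(grid, M2)) < 0.05
print(grid[mask].shape, np.median(np.linalg.norm(grid[mask], axis=1)))

For these two circles the kept points have radii clustering near 1.5, the maximum margin circle between the classes, matching the intuition above.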

Figure 1: Examples of the decision axis D, shown here in green, for different data manifolds. Intuitively, the decision axis captures an optimal decision boundary between the data manifolds. It is optimal in the sense that each point on the decision axis is as far away from each data manifold as possible. Notice that in the first example, the decision axis coincides with the maximum margin line.

Let S be any set. The reach rch(S) of S is defined as the infimum, over points x ∈ S, of the distance from x to the decision axis D. When S is compact, the reach is achieved by the point on S that is closest to D under the ℓ_p norm. We will drop the norm from the notation when it is understood from context.

Finally, an ε-tubular neighborhood of M is defined as M^ε = {x ∈ ℝⁿ : d(x, M) < ε}. That is, M^ε is the set of all points whose distance to M under the metric induced by the ℓ_p norm is less than ε. Note that while M is k-dimensional, M^ε is always n-dimensional. Tubular neighborhoods are how we rigorously define adversarial examples. Consider a classifier f for M. An ε-adversarial example is a point x̂ ∈ M^ε to which f assigns a label different from that of the class manifold nearest to x̂. A classifier is robust to all ε-adversarial examples when it correctly classifies not only M, but all of M^ε. Thus the problem of being robust to adversarial examples is rightly seen as one of generalization. In this paper we will be primarily concerned with exploring the conditions under which we can provably learn a decision boundary that correctly classifies M^ε. When ε < rch(M), the decision axis D is one decision boundary that correctly classifies M^ε (Corollary 10). Throughout the remainder of the paper we will drop the norm subscript from the notation; the norm will always be clear from context.

The geometric quantities defined above can be defined more generally for any distance metric d. In this paper we will focus exclusively on the metrics induced by the ℓ_p norms for p ∈ [1, ∞]. The decision axis under ℓ_2, denoted D_2, is in general not identical to the decision axis under ℓ_∞, denoted D_∞. In Section 4 we will prove that, since D_2 is not identical to D_∞, there exists a tradeoff in the robustness of any decision boundary between the two norms.

4 A Provable Tradeoff in Robustness Between Norms

Schott et al. (2018) explore the vulnerability of robust classifiers to attacks under different norms. In particular, they take the robust pretrained classifier of Madry et al. (2018), which was trained to be robust to ℓ_∞ perturbations, and subject it to ℓ_0 and ℓ_2 attacks. They show that accuracy drops substantially under both kinds of attack. Here we explain why poor robustness under norms other than the one trained against should be expected.

We say a decision boundary DB for a classifier is ε-robust in the ℓ_p norm if every point of the data manifold is at ℓ_p-distance at least ε from DB. In words, starting from any point on the data manifold, a perturbation must have ℓ_p-norm greater than ε to cross the decision boundary. The most robust decision boundary to ℓ_p-perturbations is the ℓ_p decision axis D_p. In Theorem 1 we construct a learning setting where D_2 is distinct from D_∞. Thus, in general, no single decision boundary can be optimally robust in all norms.

Theorem 1.

Let C_1, C_2 ⊂ ℝⁿ be two concentric (n − 1)-spheres with radii r_1 < r_2 respectively. Let X = C_1 ∪ C_2 and let D_2 and D_∞ be the ℓ_2 and ℓ_∞ decision axes of X. Then D_2 ≠ D_∞. Furthermore, d_2(X, D_∞) ∈ O(√((r_2² − r_1²)/n)).

Proof.

The decision axis under ℓ_2, D_2, is just the (n − 1)-sphere with radius (r_1 + r_2)/2. However, D_∞ is not identical to D_2 in this setting; in fact most of D_∞ approaches X as n increases.

The geometry of an ℓ_∞-ball centered at c with radius δ is that of a hypercube centered at c with side length 2δ. To find a point on D_∞ we place such a cube tangent to the north pole of C_1 so that the corners of the cube touch C_2; the center c of this cube is then ℓ_∞-equidistant from both spheres. The north pole has coordinate representation (0, …, 0, r_1), the center is c = (0, …, 0, r_1 + δ), and a corner of the cube can be expressed as (δ, …, δ, r_1 + 2δ). Additionally we have the constraint that (n − 1)δ² + (r_1 + 2δ)² = r_2², since the corners lie on C_2. Then we can solve for δ as

δ = (−2r_1 + √(4r_1² + (n + 3)(r_2² − r_1²))) / (n + 3),

where the last step follows from the quadratic formula and the fact that δ > 0. For fixed r_1 and r_2, the value δ scales as Θ(√((r_2² − r_1²)/n)). It follows, since c ∈ D_∞ lies at ℓ_2-distance δ from the north pole of C_1, that d_2(X, D_∞) ≤ δ ∈ O(√((r_2² − r_1²)/n)). ∎

From Theorem 1 we conclude that the minimum distance from X to D_∞ under the ℓ_2 norm is upper bounded as O(√((r_2² − r_1²)/n)). If a classifier is trained to learn D_∞, an adversary, starting on X, can construct an adversarial example with an ℓ_2 perturbation as small as O(1/√n). Thus we should expect an ℓ_∞-robust classifier to be less robust to ℓ_2 perturbations. Figure 2 verifies this result experimentally.
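As a sanity check on this scaling (under the reconstruction of the constraint given in the proof sketch above), the following snippet solves the quadratic for δ numerically and compares it against √((r_2² − r_1²)/n); the radii are example values.

import math

def delta(n, r1, r2):
    # Positive root of (n + 3)*d^2 + 4*r1*d + (r1^2 - r2^2) = 0: the radius of the
    # cube tangent to the inner sphere whose corners reach the outer sphere.
    a, b, c = n + 3, 4 * r1, r1 ** 2 - r2 ** 2
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

r1, r2 = 1.0, 1.3   # example radii
for n in [10, 100, 1000, 10000]:
    print(n, delta(n, r1, r2), math.sqrt((r2 ** 2 - r1 ** 2) / n))

The two columns converge as n grows, illustrating the Θ(1/√n) behavior for fixed radii.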

Figure 2: As the dimension n increases, the ℓ_2 distance from the data to the most ℓ_∞-robust decision boundary decreases, and so an ℓ_∞-robust classifier is less robust to ℓ_2 attacks. The dashed lines are placed at the distances predicted by Theorem 1, where our theoretical results suggest we should start finding adversarial examples. We use the robust loss of Wong and Kolter (2018).

We expect that D_2 ≠ D_∞ is the common case in practice. For example, Theorem 1 extends immediately to concentric cylinders and intertwined tori by considering 2-dimensional planar cross-sections. In general, we expect that D_2 ≠ D_∞ in situations where some low-dimensional planar cross-section of the data has nontrivial curvature.

Theorem 1 is important because, even in recent literature, researchers have attributed this phenomenon to overfitting. Schott et al. (2018) state that "the widely recognized and by far most successful defense by Madry et al. (1) overfits on the [ℓ_∞] metric (it's highly susceptible to [ℓ_0 and ℓ_2] perturbations)" (emphasis ours). We disagree; the Madry et al. (2018) classifier performed exactly as intended. It learned a decision boundary that is robust under one norm, which we have shown is quite different from the most robust decision boundary under another.

Interestingly, the proposed models of Schott et al. (2018) also suffer from this tradeoff: each of their models, ABS and ABS Binary, is markedly more robust to attacks in some norms than in others.

We reiterate: in general, no single decision boundary can be optimally robust in all norms.

5 Provably Robust Classifiers

Adversarial training, the process of training on adversarial examples generated in an ε-ball around the training data, is a very natural approach to constructing robust models (Goodfellow et al. (2014), Madry et al. (2018)). In our notation this corresponds to training on samples drawn from X^ε, the union of the ε-balls centered on the training set X, for some ε > 0. While natural, we show that there are simple settings where this approach is much less sample-efficient than other classification algorithms, if the only guarantee is correctness in X^ε.

Define a learning algorithm A_ε with the property that, given a training set X sampled from a manifold M, A_ε outputs a model f_A such that for every x ∈ X with label y, and every x̂ ∈ B_ε(x), f_A(x̂) = y. Here B_ε(x) denotes the ball centered at x of radius ε in the relevant norm. That is, A_ε learns a model that outputs the same label for any perturbation of x up to ε as it outputs for x itself. A_ε is our theoretical model of adversarial training (Goodfellow et al. (2014), Madry et al. (2018)). Theorem 2 states that A_ε is sample inefficient in high codimensions.
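As a rough illustration of this model (and not the procedure of the paper or of Madry et al. (2018), which uses a worst-case adversary rather than random noise), the following hypothetical sketch augments a training set with random points drawn from the ε-ball around every sample; a classifier trained on the augmented set is then encouraged, but not guaranteed, to be constant on those balls.

import numpy as np

rng = np.random.default_rng(0)

def ball_augment(X, y, eps, copies=10, norm="linf"):
    # Return the training set augmented with `copies` random points from the
    # eps-ball (l-inf or l2) around every sample. This only encourages correctness
    # inside the sampled balls; it says nothing about the rest of the tube M^eps.
    n = X.shape[1]
    blocks = []
    for _ in range(copies):
        if norm == "linf":
            noise = rng.uniform(-eps, eps, size=X.shape)
        else:  # l2: random direction, radius scaled to be uniform in the ball
            d = rng.normal(size=X.shape)
            d /= np.linalg.norm(d, axis=1, keepdims=True)
            noise = eps * rng.uniform(0, 1, size=(X.shape[0], 1)) ** (1.0 / n) * d
        blocks.append(X + noise)
    X_aug = np.vstack([X] + blocks)
    y_aug = np.concatenate([y] * (copies + 1))
    return X_aug, y_aug

X = rng.normal(size=(5, 3))
y = np.array([0, 1, 0, 1, 1])
X_aug, y_aug = ball_augment(X, y, eps=0.1)
print(X_aug.shape, y_aug.shape)  # (55, 3) (55,)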

Theorem 2.

There exists a classification algorithm that, for a particular choice of M, correctly classifies a tubular neighborhood of M using exponentially fewer samples than are required for A_ε to do the same.

Theorem 2 follows from Theorems 3 and 4, in which we prove that a nearest neighbor classifier is one such classification algorithm. Nearest neighbor classifiers are naturally robust in high codimensions because the Voronoi cells of X are elongated in the directions normal to M when X is dense (Dey (2007)).

Before we state Theorem 3 we must introduce a sampling condition on M. A δ-cover of a manifold M in the ℓ_p norm is a finite set of points X ⊂ M such that for every x ∈ M there exists x̂ ∈ X with d_p(x, x̂) ≤ δ. Theorem 3 gives a sufficient sampling condition for A_ε to correctly classify a tubular neighborhood of M, for all manifolds M. Theorem 3 also provides a sufficient sampling condition for a nearest neighbor classifier to correctly classify such a neighborhood, and this condition is substantially less dense than that of A_ε. Thus different classification algorithms have different sampling requirements in high codimensions.

Theorem 3.

Let M ⊂ ℝⁿ be a k-dimensional manifold and let 0 < ε < rch(M). Let f_NN be a nearest neighbor classifier and let f_A be the output of a learning algorithm A_ε as described above. Let X_NN and X_A denote the training sets for f_NN and f_A respectively. We have the following sampling guarantees:

  1. If X_NN is a δ-cover for M with δ ≤ 2(rch(M) − ε), then f_NN correctly classifies M^ε.

  2. If X_A is a δ-cover for M with δ < ε, then f_A correctly classifies M^(ε−δ).

Proof.

Here we use d to denote the metric induced by the ℓ_p norm. We begin by proving (1). Let z be any point in M^ε. Suppose without loss of generality that the nearest class manifold to z is M_j, so that d(z, M_j) < ε. The distance from z to any other data manifold M_i, and thus to any sample on M_i, is lower bounded by 2 rch(M) − ε. See Figure 3. It is then both necessary and sufficient that there exists a sample x̂ on M_j such that d(z, x̂) < 2 rch(M) − ε. (Necessary since a properly placed sample on M_i can achieve the lower bound on d(z, M_i).) The distance from z to the nearest sample on M_j is at most d(z, M_j) + δ < ε + δ. The question is how large can we allow δ to be and still guarantee that f_NN correctly classifies z? We need

ε + δ ≤ 2 rch(M) − ε,

which implies that δ ≤ 2(rch(M) − ε). It follows that a δ-cover with δ ≤ 2(rch(M) − ε) is sufficient, and in some cases necessary, to guarantee that f_NN correctly classifies M^ε.

Next we prove (2). As before, let z ∈ M^(ε−δ) with nearest class manifold M_j. It is both necessary and sufficient that z ∈ B_ε(x̂) for some sample x̂ on M_j to guarantee that f_A classifies z correctly, by definition of A_ε. The distance from z to the nearest sample on M_j is at most d(z, M_j) + δ < (ε − δ) + δ = ε. Thus a δ-cover with δ < ε suffices. ∎

Figure 3: Proof of Theorem 3. The distance from a query point z to another class manifold, and thus to the closest incorrectly labeled sample, is lower bounded by the distance necessary to reach the medial axis plus the distance from the medial axis to that manifold.

In Appendix B we provide additional robustness results for nearest neighbors, including: (1) a robustness guarantee similar to Theorem 3 when noise is introduced into the samples, and (2) a proof that the decision boundary of f_NN approaches the decision axis as the sample density increases.

The bounds on δ in Theorem 3 are sufficient, but they are not always necessary. There exist manifolds where the bounds in Theorem 3 are pessimistic, and less dense samples, corresponding to larger values of δ, would suffice.
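The following toy experiment (ours, not the paper's) illustrates part (1): a coarse, evenly spaced sample of two well-separated circles embedded in ℝⁿ already lets a 1-nearest-neighbor classifier label random points drawn from the ε-tube correctly; the radii, ε, and sample sizes are assumed values.

import numpy as np

rng = np.random.default_rng(0)
n, eps = 50, 0.1              # ambient dimension and tube radius (assumed)
r1, r2 = 1.0, 2.0             # radii of the two class circles (1-manifolds)

def circle_sample(r, m, random_angles=False):
    t = rng.uniform(0, 2 * np.pi, m) if random_angles else np.linspace(0, 2 * np.pi, m, endpoint=False)
    pts = np.zeros((m, n))
    pts[:, 0], pts[:, 1] = r * np.cos(t), r * np.sin(t)
    return pts

def tube_sample(r, m):
    # Random points within l2-distance eps of the circle of radius r.
    base = circle_sample(r, m, random_angles=True)
    d = rng.normal(size=(m, n))
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    return base + rng.uniform(0, eps, size=(m, 1)) * d

X = np.vstack([circle_sample(r1, 40), circle_sample(r2, 40)])   # a coarse cover
y = np.array([0] * 40 + [1] * 40)
Q = np.vstack([tube_sample(r1, 500), tube_sample(r2, 500)])
yq = np.array([0] * 500 + [1] * 500)

nn = np.argmin(np.linalg.norm(Q[:, None, :] - X[None, :, :], axis=2), axis=1)
print("1-NN accuracy on the eps-tube:", np.mean(y[nn] == yq))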

Next we will show a setting where bounds on δ similar to those in Theorem 3 are necessary. In this setting, a constant-factor difference in δ between the sampling requirements of f_NN and A_ε leads to an exponential gap between the sizes of X_NN and X_A necessary to achieve the same amount of robustness.

Define M_1 to be a unit k-cube lying in the span of the first k coordinate axes; that is, M_1 is a subset of the x_1⋯x_k-plane bounded between 0 and 1 along each of those coordinates. Similarly define M_2 to be a translate of M_1 in a direction orthogonal to this plane, and let M = M_1 ∪ M_2. Note that M_2 lies in a parallel affine subspace; thus the decision axis D of M is the parallel k-flat halfway between M_1 and M_2. In the ℓ_2 norm we can show that the gap in Theorem 3 is necessary for M. Furthermore, the bounds we derive on the δ-covers for both f_NN and A_ε are tight. Combined with well-known properties of covers, we get that the ratio |X_A| / |X_NN| is exponential in k.

Theorem 4.

Let M = M_1 ∪ M_2 be as described above and let ε ≤ rch(M)/2. Let X_A and X_NN be minimum training sets necessary to guarantee that f_A and f_NN, respectively, correctly classify the tubular neighborhoods described in Theorem 3. Then we have that

|X_A| / |X_NN| = Ω(2^k).   (1)
Proof.

Let z be a point of the tubular neighborhood whose nearest class manifold is M_1. Since M_1 is flat, the only obstacle between z and the decision axis D, and hence the nearest sample on M_2, is the separation between the two flats; there is no curvature to exploit. For f_NN we need the nearest sample on M_1 to be closer to z than any sample on M_2, and, as in Theorem 3, it suffices that δ ≤ 2(rch(M) − ε). In this setting this is also necessary; should δ be any larger, a properly placed sample on M_2 can claim z in its Voronoi cell.

Similarly, for f_A we need z to lie inside the ε-ball centered at some sample on M_1, and so it suffices that δ < ε. In this setting this is also tight; should the cover be any coarser, some point of the tubular neighborhood lies outside of every ε-ball, and A_ε is then free to learn a decision boundary that misclassifies it.

Let N(δ) denote the size of the minimum δ-cover of M_1. Since M_1 is flat (has no curvature), and since the intersection of M_1 with an ℓ_2-ball centered at a point of M_1 is a k-dimensional ball, a standard volume argument applied in the affine subspace gives N(δ) = Θ(vol_k(M_1)/δ^k). So we have

|X_A| / |X_NN| = N(ε) / N(2(rch(M) − ε)) = Θ((2(rch(M) − ε)/ε)^k).

Since vol_k(M_1) is the same in both settings, it cancels, along with the constant factors hidden by the Θ notation. (Note that we are using the fact that M_1 and M_2 have finite k-dimensional volume.) The inequality in Equation 1 follows from the fact that the expression 2(rch(M) − ε)/ε is monotonically decreasing in ε on the interval (0, rch(M)) and takes the value 2 at ε = rch(M)/2. ∎

We have shown that both A_ε and nearest neighbor classifiers learn robust decision boundaries when provided sufficiently dense samples of M. However, there are settings where nearest neighbors is exponentially more sample-efficient than A_ε in achieving the same amount of robustness. We experimentally verify these theoretical results in Section 8.1.

6 X^ε is a Poor Model of M^ε

Figure 4: To construct a δ-cover we place sample points, shown here in black, along a regular grid with spacing s. The blue points are the furthest points of M_1 from the sample. To cover M_1 we need (√k/2)·s ≤ δ.

Madry et al. (2018) suggest training a robust classifier with the help of an adversary which, at each iteration, produces ε-perturbations around the training set that are incorrectly classified. In our notation, this corresponds to learning a decision boundary that correctly classifies X^ε. We believe this approach is insufficiently robust in practice, as X^ε is often a poor model for M^ε. In this section, we show that the volume of X^ε is often a vanishingly small percentage of the volume of M^ε. These results shed light on why the ball-based learning algorithm A_ε defined in Section 5 is so much less sample-efficient than nearest neighbor classifiers. In Section 8.1 we experimentally verify these observations by showing that in high-dimensional space it is easy to find adversarial examples even after training against a strong adversary. For the remainder of this section we will consider the ℓ_2 norm.

Theorem 5.

Let M be a k-dimensional manifold embedded in ℝⁿ with k < n. Let X be a finite set of points sampled from M. Suppose that ε is less than the distance from M to its medial axis, defined as in Dey (2007). Then the percentage of vol(M^ε) covered by X^ε is upper bounded by

vol(X^ε) / vol(M^ε) ≤ |X| · ε^k · π^(k/2) · Γ((n − k)/2 + 1) / (vol_k(M) · Γ(n/2 + 1)).   (2)

As the codimension n − k increases (with k, ε, and |X| fixed), Equation 2 approaches 0.

Proof.

Assuming the balls centered on the samples in X are disjoint, we get the upper bound

vol(X^ε) ≤ |X| · vol(B^n_ε).   (3)

This is the same reasoning as in Equation 5 below.

The medial axis of M is defined as the closure of the set of all points of ℝⁿ that have two or more closest points on M in the ℓ_2 norm. The medial axis is similar to the decision axis D, except that the nearest points need not lie on distinct class manifolds. For ε less than the distance from M to its medial axis, we have the lower bound

vol(M^ε) ≥ vol_k(M) · vol(B^(n−k)_ε).   (4)

Combining Equations 3 and 4 gives the result. To get the asymptotic result we apply Stirling's approximation to get

Γ((n − k)/2 + 1) / Γ(n/2 + 1) = O((2/n)^(k/2)).

The last step follows from the fact that (1 − k/n)^((n−k)/2) approaches e^(−k/2), where e is the base of the natural logarithm. ∎

In high codimension, even moderate under-sampling of M leads to a significant loss of coverage of M^ε, because the volume of the union of balls centered at the samples shrinks faster than the volume of M^ε. Theorem 5 states that in high codimensions the fraction of M^ε covered by X^ε goes to 0. Almost nothing is covered by X^ε for training set sizes that are realistic in practice. Thus X^ε is a poor model of M^ε, and high classification accuracy on X^ε does not imply high accuracy in M^ε.
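A quick numeric illustration of the bound in Equation 2, as reconstructed above, using the standard formula for the volume of an ℓ_2 ball; the manifold dimension, ε, sample size, and vol_k(M) below are assumed values.

import math

def coverage_upper_bound(n, k, eps, num_samples, vol_M=1.0):
    # Upper bound on vol(X^eps) / vol(M^eps):
    # |X| * vol(B^n_eps) / (vol_k(M) * vol(B^{n-k}_eps)), computed in log-space.
    log_b = (math.log(num_samples) + k * math.log(eps) + 0.5 * k * math.log(math.pi)
             + math.lgamma((n - k) / 2 + 1) - math.lgamma(n / 2 + 1) - math.log(vol_M))
    return min(1.0, math.exp(log_b))

for n in [12, 20, 50, 100, 500]:
    print(n, coverage_upper_bound(n, k=10, eps=0.2, num_samples=10**6))

Even with a million samples, the fraction of the tube that the ε-balls can cover collapses toward 0 as the ambient dimension grows.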

Note that an alternative way of defining the ratio is as vol(X^ε ∩ M^ε) / vol(M^ε). This is equivalent in our setting since X ⊂ M, and so X^ε ⊆ M^ε.

For the remainder of the section we provide intuition for Theorem 5 by considering the special case of k-dimensional planes. Define M_1 to be the unit k-cube lying in the x_1⋯x_k-plane, that is, the subset of that plane with each of the first k coordinates bounded between 0 and 1. Recall that a δ-cover of a manifold M in the ℓ_2 norm is a finite set of points X such that for every x ∈ M there exists x̂ ∈ X with d_2(x, x̂) ≤ δ. It is easy to construct an explicit δ-cover of M_1: place sample points at the vertices of a regular grid, shown in Figure 4 by the black vertices. The centers of the cubes of this regular grid, shown in blue in Figure 4, are the furthest points from the samples. The distance from the vertices of the grid to the centers is (√k/2)·s, where s is the spacing between points along an axis of the grid. To construct a δ-cover we need (√k/2)·s ≤ δ, which gives a spacing of s = 2δ/√k. The size of this sample is Θ((√k/(2δ))^k). Note that this size scales exponentially in k, the dimension of M_1, not in n, the dimension of the embedding space.
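Under the assumptions above (unit side length, ℓ_2 metric), a rough count of this grid construction shows that the cover size depends only on k and δ, never on n:

import math

def grid_cover_size(k, delta):
    # Vertices of a regular grid with spacing 2*delta/sqrt(k) form a delta-cover
    # of the unit k-cube; roughly (sqrt(k)/(2*delta) + 1)^k points.
    per_axis = math.ceil(math.sqrt(k) / (2 * delta)) + 1
    return per_axis ** k

for k in [1, 2, 4, 8]:
    print(k, grid_cover_size(k, delta=0.1))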

Figure 5: An illustration of the lower bound technique used in Equation 6. The volume of M_1^ε, shown in the black dashed lines, is bounded from below by placing an (n − k)-dimensional ball of radius ε at each point of M_1, shown in green. In this illustration, a 1-dimensional manifold is embedded in 2 dimensions, so these balls are 1-dimensional line segments.

Recall that M_1^ε is the ε-tubular neighborhood of M_1. The ε-balls around X, which comprise X^ε, cover M_1, and so any robust approach that guarantees correct classification within X^ε will achieve perfect accuracy on M_1. However, we will show that X^ε covers only a vanishingly small fraction of M_1^ε. Let B^n_ε denote the n-dimensional ℓ_2-ball of radius ε centered at the origin. An upper bound on the volume of X^ε is

vol(X^ε) ≤ |X| · vol(B^n_ε) = |X| · ε^n · π^(n/2) / Γ(n/2 + 1).   (5)

Next we bound the volume of M_1^ε from below. Intuitively, a lower bound can be derived by placing an (n − k)-dimensional ball of radius ε in the normal space at each point of M_1 and integrating the volumes. Figure 5 illustrates the lower bound argument for a 1-manifold embedded in ℝ².

vol(M_1^ε) ≥ vol_k(M_1) · vol(B^(n−k)_ε) = vol_k(M_1) · ε^(n−k) · π^((n−k)/2) / Γ((n − k)/2 + 1).   (6)

Combining Equations 5 and 6 gives an upper bound on the percentage of M_1^ε that is covered by X^ε:

vol(X^ε) / vol(M_1^ε) ≤ |X| · ε^k · π^(k/2) · Γ((n − k)/2 + 1) / (vol_k(M_1) · Γ(n/2 + 1)).   (7)

Notice that the factors of ε and π that depend on the codimension cancel, leaving a dependence on the ambient dimension n only through the ratio of Gamma functions. Figure 6 (Left) shows that this expression approaches 0 as the codimension n − k of M_1 increases.

Suppose we set δ = ε and construct an ε-cover of M_1. The number of points necessary to cover M_1 with balls of radius ε depends only on k, not on the embedding dimension n. However, the number of points necessary to cover the tubular neighborhood M_1^ε with balls of radius ε depends on both k and n. In Theorem 6 we derive a lower bound on the number of samples necessary to cover M_1^ε.

Theorem 6.

Let M_1 ⊂ ℝⁿ be the bounded k-flat described above. Let N denote the number of samples necessary to cover the ε-tubular neighborhood M_1^ε of M_1 with ℓ_2-balls of radius ε; that is, let N be the minimum value for which there exists a finite sample X ⊂ M_1 of size N such that M_1^ε ⊆ X^ε. Then

N ≥ vol_k(M_1) · Γ(n/2 + 1) / (ε^k · π^(k/2) · Γ((n − k)/2 + 1)) = Ω(n^(k/2) / ε^k) for fixed k.   (8)
Proof.

We first bound the volume covered by the sample from above by generously assuming that the ε-balls centered at the samples are disjoint. That is,

vol(X^ε) ≤ N · vol(B^n_ε).   (9)

To guarantee that M_1^ε ⊆ X^ε, the left hand side of Equation 9 must be at least vol(M_1^ε), and so

N · vol(B^n_ε) ≥ vol(M_1^ε) ≥ vol_k(M_1) · vol(B^(n−k)_ε).

The last inequality follows from Equation 6. Solving for N gives the result. The asymptotic result follows from an argument similar to that in the proof of Theorem 5. ∎

Theorem 6 states that, in general, it takes many fewer samples to accurately model M_1 than to model M_1^ε. Figure 6 (Right) compares the number of points necessary to construct an ε-cover of M_1 with the lower bound on the number necessary to cover M_1^ε from Theorem 6. The number of points necessary to cover M_1^ε increases as Ω(n^(k/2)/ε^k), scaling polynomially in n and exponentially in k. In contrast, the number necessary to construct an ε-cover of M_1 remains constant as n increases, depending only on k.
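To make Figure 6 (Right) concrete, the following sketch evaluates the lower bound of Theorem 6, as reconstructed above, for a unit k-flat as n grows; compare with the grid cover of M_1 computed earlier, which does not change with n. The parameter values are assumptions.

import math

def tube_cover_lower_bound(n, k, eps, vol_M=1.0):
    # Lower bound on the number of eps-balls needed to cover the eps-tube M_1^eps:
    # vol_k(M_1) * vol(B^{n-k}_eps) / vol(B^n_eps), computed in log-space.
    log_n = (math.log(vol_M) + math.lgamma(n / 2 + 1) - math.lgamma((n - k) / 2 + 1)
             - k * math.log(eps) - 0.5 * k * math.log(math.pi))
    return math.exp(log_n)

for n in [3, 10, 30, 100, 300, 1000]:
    print(n, round(tube_cover_lower_bound(n, k=2, eps=0.1)))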

Our lower bound of Ω(n^(k/2)) samples is similar in spirit to the work of Schmidt et al. (2018), who prove that, in a simple Gaussian setting, robust generalization requires a factor on the order of √n more samples. Their arguments are statistical, while ours are geometric.

Figure 6: We plot the upper bound in Equation 7 on the left. As the codimension increases, the percentage of the volume of M_1^ε covered by ε-balls around an ε-sample of M_1 approaches 0. On the right we plot the number of samples necessary to cover M_1, shown in blue, against the number of samples necessary to cover M_1^ε, shown in orange, as the codimension increases.

Approaches that produce robust classifiers by generating adversarial examples in the ε-balls centered on the training set do not accurately model M^ε, and it would take many more samples to do so. If the method behaves arbitrarily outside of the ε-balls that define X^ε, adversarial examples will still exist, and it will likely be easy to find them. The reason deep learning has performed so well on a variety of tasks, in spite of the brittleness made apparent by adversarial examples, is that it is much easier to perform well on M than it is to perform well on M^ε.

7 A Lower Bound on Model Expressiveness

7.1 A Simple Example

Consider the case of two concentric circles C_1 and C_2 with radii r_1 < r_2 respectively, as illustrated in Figure 7. Each circle represents a different class of data. Suppose that we train a parametric model f_θ with parameters θ so that f_θ(x) < 0 for x ∈ C_1 and f_θ(x) > 0 for x ∈ C_2. How does the number of parameters necessary to ensure that such a decision boundary can be expressed by f_θ increase as the gap between r_1 and r_2 decreases?

Suppose that we first lift C_1 and C_2 to the paraboloid in ℝ³ via the map φ(x) = (x_1, x_2, ‖x‖²). That is, we construct the set φ(C_1) = {φ(x) : x ∈ C_1}, and similarly for C_2. After applying φ, the two lifted circles are linearly separable for any r_1 < r_2. The linear decision boundary in ℝ³ maps back to a circle in ℝ² that separates C_1 from C_2. This is not the case for deep networks; the number of parameters necessary to separate C_1 from C_2 will depend on the gap r_2 − r_1.
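A minimal sketch of this lifting argument (illustrative only; the radii are assumed values): after mapping each point (x_1, x_2) to (x_1, x_2, x_1² + x_2²), a single horizontal plane separates the two circles exactly, no matter how small the gap.

import numpy as np

rng = np.random.default_rng(0)
r1, r2 = 1.0, 1.05          # radii of the two classes (assumed values)

def circle(r, m):
    t = rng.uniform(0, 2 * np.pi, m)
    return np.stack([r * np.cos(t), r * np.sin(t)], axis=1)

X = np.vstack([circle(r1, 200), circle(r2, 200)])
y = np.array([-1] * 200 + [+1] * 200)

# Lift to the paraboloid: (x1, x2) -> (x1, x2, x1^2 + x2^2).
Z = np.c_[X, (X ** 2).sum(axis=1)]

# The horizontal plane z = (r1^2 + r2^2) / 2 separates the lifted classes exactly.
pred = np.sign(Z[:, 2] - (r1 ** 2 + r2 ** 2) / 2)
print("separated:", np.all(pred == y))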

In the important special case where f_θ is parameterized by a fully connected deep network with l layers, w hidden units per layer, and ReLU activations, Raghu et al. (2017) prove that f_θ subdivides the input space into convex polytopes. In each convex polytope, f_θ defines a linear function that agrees on the boundary of the polytope with its neighbors. They showed that, when the inputs lie in ℝ², the number of polytopes in the subdivision grows at most as w^O(l) (Raghu et al. (2017), Theorem 1).

Let P denote the subdivision of space into convex polytopes induced by f_θ. Consider the decision boundary DB(f_θ) of f_θ. DB(f_θ) can be constructed by examining each polytope P_i ∈ P and solving the linear equation f_i(x) = 0, where f_i is the linear function defined on P_i by f_θ. Since f_i is linear, the solution within P_i is either (1) the empty set, (2) a single line segment, or (3) all of P_i. Case (3) is degenerate, and there are ways to perturb f_θ by an infinitesimally small amount such that case (3) never occurs and the classification accuracy is unchanged. Thus we conclude that DB(f_θ) is a piecewise-linear curve comprised of line segments. (In higher dimensions, DB(f_θ) is composed of subsets of hyperplanes.) See Figure 7.

Suppose that DB(f_θ) separates C_1 from C_2 and let ℓ be a line segment of the decision boundary. Since ℓ lies in the region between C_1 and C_2, its length satisfies |ℓ| ≤ 2√(r_2² − r_1²), which is tight when ℓ is tangent to C_1 and touches C_2 at both of its endpoints. For DB(f_θ) to separate C_1 from C_2, it must make a full rotation of 2π around the origin. The portion of this rotation that ℓ can contribute is upper bounded by 2 arccos(r_1/r_2), the angle subtended at the origin by such a tangent segment. Thus the number of line segments that comprise DB(f_θ) is lower bounded by π / arccos(r_1/r_2).

As r_1 approaches r_2, the minimum number of segments necessary to separate C_1 from C_2 approaches infinity. Since each polytope can contribute at most one line segment to DB(f_θ), the size of the model necessary to represent a decision boundary that separates C_1 from C_2 also increases as the circles get closer together.

Now consider C_1^ε and C_2^ε, the ε-tubular neighborhoods of C_1 and C_2 under the ℓ_2 norm, defined as C_i^ε = {x : d_2(x, C_i) < ε}. Suppose that a fully connected network f_θ as described above has sufficiently many parameters to represent a decision boundary that separates C_1 from C_2. Is f_θ also capable of learning a robust decision boundary that separates C_1^ε from C_2^ε?

Figure 7: Separating two classes of data sampled from C_1 and C_2 may require a decision boundary with only a few linear segments. However, a decision boundary that is robust to ε-perturbations must lie in the gap between C_1^ε and C_2^ε. Learning a robust decision boundary may require more linear segments and thus a more expressive model. As we increase ε, demanding a more robust decision boundary, the gap between C_1^ε and C_2^ε decreases, and so the number of linear segments required increases toward infinity.

For DB(f_θ) to separate C_1^ε from C_2^ε it must lie in the region between them, an annulus with inner radius r_1 + ε and outer radius r_2 − ε. In this setting each segment can contribute at most 2 arccos((r_1 + ε)/(r_2 − ε)) to the full rotation of 2π around the origin. The minimum number of line segments that comprise a robust decision boundary is therefore lower bounded by π / arccos((r_1 + ε)/(r_2 − ε)). As ε approaches (r_2 − r_1)/2, this quantity approaches infinity. Even if f_θ is capable of separating C_1 from C_2, we can choose ε such that f_θ cannot represent any decision boundary that separates C_1^ε from C_2^ε.

This simple example shows that learning decision boundaries that are robust to ε-adversarial examples may require substantially more powerful models than what is required to learn the original distributions. Furthermore, the amount of additional resources necessary depends on the amount of robustness required.
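The bound above is easy to evaluate; the following sketch (with assumed radii) shows how the minimum number of linear segments grows as ε approaches half the gap.

import math

def min_segments(r1, r2, eps=0.0):
    # Lower bound on the number of linear segments in a decision boundary that
    # separates the eps-tubes around concentric circles of radii r1 < r2:
    # pi / arccos of the inner/outer radius ratio, as derived above.
    inner, outer = r1 + eps, r2 - eps
    return math.ceil(math.pi / math.acos(inner / outer))

r1, r2 = 1.0, 1.2           # assumed radii
for eps in [0.0, 0.05, 0.08, 0.099]:
    print(eps, min_segments(r1, r2, eps))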

7.2 An Exponential Lower Bound

We present an exponential lower bound on the number of linear regions necessary to represent a decision boundary that is robust to perturbations of size at most ε, in the simple case of two concentric (n − 1)-spheres.

Theorem 7.

Let C_1, C_2 ⊂ ℝⁿ be two concentric (n − 1)-spheres with radii r_1 < r_2 respectively, and let X = C_1 ∪ C_2. Let f_θ be a fully connected neural network with ReLU activations. Suppose that f_θ correctly classifies X^ε for some ε < (r_2 − r_1)/2. Said differently, the decision boundary of f_θ lies in a tubular neighborhood of the decision axis D. Then the number N of linear regions into which f_θ subdivides ℝⁿ is lower bounded as

N ≥ vol_(n−1)(∂_1) / vol_(n−1)(B^(n−1)_ρ),   (10)

where ∂_1 is the (n − 1)-sphere of radius r_1 + ε and ρ = √((r_2 − ε)² − (r_1 + ε)²). Written asymptotically, N = Ω(((r_1 + ε) / √((r_2 − ε)² − (r_1 + ε)²))^(n−1)).

Proof.

For f_θ to be robust to ε-adversarial examples, the decision boundary DB of f_θ must lie in the region between the tubes C_1^ε and C_2^ε. The boundary of this region is comprised of two disjoint (n − 1)-spheres, which we will denote as ∂_1 and ∂_2, with radii r_1 + ε and r_2 − ε respectively. (It is standard in topology to use the symbol ∂ to denote the boundary of a topological space.)

The isoperimetric inequality states that a sphere minimizes the (n − 1)-dimensional volume (thought of as "surface area") across all closed surfaces enclosing a fixed n-dimensional volume (thought of as "volume"). Since DB separates C_1^ε from C_2^ε, the n-dimensional volume enclosed by DB is at least as large as that enclosed by ∂_1, and so we have that vol_(n−1)(DB) ≥ vol_(n−1)(∂_1).

Now consider any (n − 1)-dimensional linear facet F of the decision boundary DB. The normal space of F is 1-dimensional; let v denote a unit vector orthogonal to F. (There are two possible choices, v and −v.) Due to the spherical symmetry of X and the fact that DB lies between ∂_1 and ∂_2, the diameter of F is maximized when F is tangent to ∂_1 and its boundary touches ∂_2. In pursuit of an upper bound, we will assume without loss of generality that F has these properties. Let o denote the origin, p the point of tangency on ∂_1, and q a point of F on ∂_2. We consider the triangle with vertices o, p, and q, which has a right angle at p. By basic properties of right triangles, |pq| = √(|oq|² − |op|²) = √((r_2 − ε)² − (r_1 + ε)²). It follows that F is contained in an (n − 1)-dimensional ball of radius ρ = √((r_2 − ε)² − (r_1 + ε)²). In particular, the (n − 1)-dimensional volume of F is bounded as vol_(n−1)(F) ≤ vol_(n−1)(B^(n−1)_ρ). The (n − 1)-dimensional volume of DB (again thought of as "surface area") is equal to the sum of the (n − 1)-dimensional volumes of the linear facets that comprise DB. Combining these inequalities gives the result. ∎

Now consider any -dimensional linear facet of the decision boundary . The normal space of is -dimensional; let

denote a unit vector orthogonal to

. (There are two possible choices and .) Due to the spherical symmetry of and the fact that , the diameter of is maximized when is tangent to at (or ) and intersects . In pursuit of an upper bound, we will assume without loss of generality that has these properties. Let denote the origin, , and . We consider the right triangle with right angle at . By basic properties of right triangles, . It follows that is contained in a -dimensional ball of radius . In particular the -dimensional volume of is bounded as . The -dimensional volume of (again thought of as “surface area”), is equal to the sum of the -dimensional volumes of the linear facets that comprise . Combining these inequalities gives the result.