Eikonal depth: an optimal control approach to statistical depths

Statistical depths provide a fundamental generalization of quantiles and medians to data in higher dimensions. This paper proposes a new type of globally defined statistical depth, based upon control theory and eikonal equations, which measures the smallest amount of probability density that has to be passed through in a path to points outside the support of the distribution: for example spatial infinity. This depth is easy to interpret and compute, expressively captures multi-modal behavior, and extends naturally to data that is non-Euclidean. We prove various properties of this depth, and provide discussion of computational considerations. In particular, we demonstrate that this notion of depth is robust under an approximate isometrically constrained adversarial model, a property which is not enjoyed by the Tukey depth. Finally, we give some illustrative examples in the context of two-dimensional mixture models and MNIST.


1 Introduction

In univariate statistics, quantiles, depths, and medians serve as an important cornerstone for constructing robust statistical orderings. Generalizing these notions to multivariate statistics has been the subject of significant study. Indeed, constructing multivariate notions of depth which are simultaneously robust, computable, interpretable, and statistically useful remains challenging.

The most classical notion of multivariate depth is the Tukey or halfspace depth, which, given a distribution $F$ over $\mathbb{R}^d$, is defined by

$$d_T(x) := \inf_{a \in \mathbb{R}^d,\, a \neq 0} F\left(\{y \in \mathbb{R}^d \mid (x - y) \cdot a \geq 0\}\right).$$

This notion of depth, which is a natural extension of the definition in $\mathbb{R}$, enjoys many useful properties. The (super) level sets are convex and nested, are invariant under affine transformations, and are robust under specific classes of changes to $F$. Indeed the Tukey depth is considered so fundamental that its properties have almost directly been used to define what it means to be a statistical depth [56].

However, the halfspace depth has some inconvenient properties. It is challenging to compute in higher dimension. It also is difficult to extend to data which is not inherently Euclidean, and gives counter-intuitive results for distributions that are multimodal or have non-convex level sets. Finally, the notion of robustness used in studying the Tukey depth is very classical, and under other notions of adversarial robustness the Tukey depth actually turns out to be non-robust (see Section 2.2 for details).

The purpose of this work is to propose a new notion of depth, which we call the eikonal depth, designed to address some of these shortcomings. This depth is based upon a different extension of the one-dimensional quantile depth to higher dimension, and has a direct, interpretable formulation based upon optimal control. Like the case of the halfspace depth, our definition is global and geometric in nature. However, unlike the case of the halfspace depth, this depth is much more amenable to computation, is robust under a natural class of adversarial perturbations to inputs, and is directly extensible to non-Euclidean and discrete settings. Indeed, we argue that this notion of depth is natively understood in terms of metric geometry, as opposed to convex geometry: see Section 2.2. Although there are many notions of depth (some of which we describe in Section 1.3), we believe that the eikonal depth provides a unique combination of features not attainable by other currently available notions of depth.

The remainder of the work is organized as follows: in Section 1.1 we give the definition of eikonal depth in the Euclidean setting, and describe the connection with the classical one-dimensional depth. In Section 1.3 we provide a review of related literature. In Section 2.1 we describe some of the main properties of the eikonal depth in Euclidean space and in Section 2.2 we identify a new type of adversarial robustness inherent to our definition. Section 3 describes natural extensions of the definition to a wide class of metric spaces such as manifolds with boundary and graphs. Finally, Section 4.1 describes numerical approaches for this depth, and gives a number of illustrative examples.

1.1 Proposed model: an eikonal depth

We will start by defining the eikonal depth in the population setting and will provide an analogous approach for empirical measures in Section 3. As such, we will assume that $F$ has an associated continuous density function $\rho$. For now, we will focus our attention on two canonical cases: distributions with probability densities whose support is all of $\mathbb{R}^d$ and distributions whose support is closed and bounded. Considering densities with support on all of $\mathbb{R}^d$ allows us to include standard probability distributions such as mixtures of Gaussians, depicted in Figure 0(b). An example of a distribution with a density with compact support is a uniform distribution on a square, as in Figure 0(a).

Definition 1.1.

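Let $F$ be a probability distribution with density $\rho$ and let $\phi$ be a continuous and non-decreasing function such that $\phi(0) = 0$.

We define $U_x$ to be the set of continuous curves $\gamma$ of locally finite length from $[0, \infty) \to \mathbb{R}^d$ if $\operatorname{supp}(\rho) = \mathbb{R}^d$ (alternatively, from $[0, T] \to \overline{\Omega}$ if $\operatorname{supp}(\rho) = \overline{\Omega}$ for some open and bounded $\Omega$) so that if $\gamma \in U_x$ then $\gamma(0) = x$ and

$$\lim_{t \to \infty} \gamma(t) = \infty \ \text{ if } \operatorname{supp}(\rho) = \mathbb{R}^d, \qquad \text{alternatively, } \gamma(T) \in \partial\Omega \ \text{ if } \operatorname{supp}(\rho) = \overline{\Omega}. \tag{1.1}$$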

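We then define the $\phi$-eikonal depth of a point $x$ in $\operatorname{supp}(\rho)$, denoted by $D_{eik}(x, F)$, by the minimization problem

$$D_{eik}(x, F) := \inf_{\gamma \in U_x} \int_\gamma \phi(\rho)\,ds =: \inf_{\gamma \in U_x} J(\gamma), \tag{1.2}$$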

where $\int_\gamma \phi(\rho)\,ds$ stands for the integral of the function $\phi(\rho)$ along the curve $\gamma$. In particular, in the case where the support of $\rho$ is all of $\mathbb{R}^d$ this integral takes the form $\int_0^\infty \phi(\rho(\gamma(t)))\,|\dot\gamma(t)|\,dt$, where $\dot\gamma$ may either be a classical derivative (when available), or may be interpreted as a measure, which is always possible for curves of locally finite length.

For the sake of clarity, when the role of $\phi$ is not important, we refer to $D_{eik}$ simply as the eikonal depth. When $\phi(s) = s$ we call $D_{eik}$ the unnormalized eikonal depth. When $\phi(s) = s^{1/d}$ we call $D_{eik}$ the normalized eikonal depth.

Definition 1.1 has mathematical meaning even if $\phi(0) \neq 0$: indeed if $\phi \equiv 1$ then this simply becomes the problem of finding the shortest path to the boundary of the domain. In our context we restrict our attention to the case where $\phi(0) = 0$ so that we may talk about distributions with unbounded support.

In the context of optimal control theory, we can interpret the $\phi$-eikonal depth as the minimum amount of time that a particle requires to escape to infinity (or to reach the boundary of the support, depending on the support of the density) in a field with velocity equal to $1/\phi(\rho)$, or alternatively as the minimal amount of $\phi$-weighted density along a path to infinity. We note that we could actually replace the infimum with a minimum in the previous definition: if $\{\gamma_n\}$ is a sequence that achieves the infimum, i.e. $J(\gamma_n) \to \inf_{\gamma \in U_x} J(\gamma)$, we can take a weak limit and use the continuity of $\phi(\rho)$ to obtain a $\gamma^*$ with $J(\gamma^*) = \inf_{\gamma \in U_x} J(\gamma)$.
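The control interpretation can be illustrated with a small discrete sketch. The code below is not the fast marching method discussed later in the paper; it swaps in plain Dijkstra shortest paths on a 4-neighbor grid, which measures path length in the Manhattan metric and therefore generally overestimates the continuum depth. The function name `grid_eikonal_depth`, the trapezoidal edge cost, and the choice of the outer grid frame as the boundary are all assumptions made for illustration, and the density is a uniform weight on the unit square rather than data from the paper.

```python
import heapq


def grid_eikonal_depth(rho, h, phi=lambda s: s):
    """Approximate the phi-eikonal depth on a rectangular grid with spacing h.

    rho[i][j] is the density at node (i, j). Nodes on the outer frame of the
    grid play the role of the boundary and are assigned depth 0.
    This is a hypothetical helper, not code from the paper.
    """
    n, m = len(rho), len(rho[0])
    depth = [[float("inf")] * m for _ in range(n)]
    heap = []
    for i in range(n):
        for j in range(m):
            if i in (0, n - 1) or j in (0, m - 1):
                depth[i][j] = 0.0
                heapq.heappush(heap, (0.0, i, j))
    while heap:
        d, i, j = heapq.heappop(heap)
        if d > depth[i][j]:
            continue  # stale queue entry, a cheaper path was already found
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < n and 0 <= b < m:
                # Trapezoidal rule for the integral of phi(rho) along the edge.
                cost = 0.5 * h * (phi(rho[i][j]) + phi(rho[a][b]))
                if d + cost < depth[a][b]:
                    depth[a][b] = d + cost
                    heapq.heappush(heap, (d + cost, a, b))
    return depth


# Uniform weight 1 on the unit square: the cheapest exit from the center
# is a straight axis-aligned segment of length 1/2.
n = 21
h = 1.0 / (n - 1)
uniform = [[1.0] * n for _ in range(n)]
depth = grid_eikonal_depth(uniform, h)
center_depth = depth[n // 2][n // 2]
```

For the uniform square the optimal escape path is axis-aligned, so the grid restriction costs nothing and the center value agrees with the distance to the boundary; for general densities a finer grid or a scheme with more directions would be needed.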

We can also see from Definition 1.1 that $D_{eik}$ is continuous if $\rho$ and $\phi$ are continuous. In many settings, the optimal cost in this type of control problem can be characterized as a solution to a Hamilton-Jacobi equation. For example, in the case when $\rho$ has support in all of $\mathbb{R}^d$ or has compact support, we may use the following equivalent definition of the $\phi$-eikonal depth:

Definition 1.2.

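Let $F$ be a probability distribution in $\mathbb{R}^d$ with density $\rho$ with $\operatorname{supp}(\rho) = \mathbb{R}^d$ (alternatively with $\operatorname{supp}(\rho) = \overline{\Omega}$ for an open and bounded $\Omega$), and let $\phi$ be a non-decreasing continuous function.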

We define the $\phi$-eikonal depth as the solution, in the viscosity sense (see Definition 1.3), of the equation

$$|\nabla u(x)| = \phi(\rho(x)) \tag{1.3}$$

with boundary conditions $u(x) \to 0$ as $|x| \to \infty$ (alternatively $u = 0$ on $\partial\Omega$).

In the previous definition we refer to the “viscosity solution” of Equation (1.3). The precise definition of viscosity solutions is somewhat technical, but a simplified version is given in Definition 1.3; one reference text is [1], and an introduction in the context of Tukey depths is given in [37]. From a heuristic standpoint, the concept of viscosity solutions provides a generalized notion of solving (1.3) at points where $u$ is not differentiable that is consistent with the control problem. Such a generalization is essential as (1.3) will usually not admit everywhere differentiable solutions. From an algorithmic standpoint, viscosity solutions can be approximated by adding a small viscosity term (a small multiple of the Laplacian $\Delta u$) to the differential equation.

For the sake of clarity we give the definitions of viscosity solutions of a differential equation only in the context of Equation (1.3). More general definitions for nonlinear differential equations can be found in [14]. We denote the set of continuous functions in $\Omega$ by $C(\Omega)$.

Definition 1.3.

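A function $u \in C(\Omega)$ is a viscosity subsolution of Equation (1.3) if for every $x \in \Omega$ and every smooth test function $\psi$ such that $u - \psi$ has a local maximum at $x$ in $\Omega$ we have that

$$|\nabla \psi(x)| \leq \phi(\rho(x)). \tag{1.4}$$

In a similar manner, we say that a function $u \in C(\Omega)$ is a viscosity supersolution of (1.3) if for every $x \in \Omega$ and every smooth test function $\psi$ such that $u - \psi$ has a local minimum at $x$ in $\Omega$ we have that

$$|\nabla \psi(x)| \geq \phi(\rho(x)). \tag{1.5}$$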

Finally, if is both a subsolution and a supersolution of (1.3) then we call it a viscosity solution of (1.3).

Existence and uniqueness results for the Eikonal equation in a bounded domain are well established [1] and the following comparison principle is a standard result from the theory of viscosity solutions:

Proposition 1.4.

Let be a supersolution of with on . If is a subsolution, then .

For densities whose support is $\mathbb{R}^d$ we have the following comparison principle, which assumes the existence of a bounded supersolution of Equation (1.3).

Proposition 1.5.

Let be a bounded supersolution of with when . If is a subsolution, then .

For the proof of Proposition 1.5 refer to the appendix. In our case, we can straightforwardly define a bounded supersolution of (1.3) that guarantees that Proposition 1.5 is applicable:

$$v(x) = \min\left\{ \int_{-\infty}^{x_1} \rho(\xi, x')\,d\xi,\ \int_{x_1}^{\infty} \rho(\xi, x')\,d\xi \right\}, \tag{1.6}$$

where $x'$ stands for the last $d - 1$ coordinates of a point $x = (x_1, x')$. The inequality relating sub- and supersolutions in Propositions 1.4 and 1.5 implies that the explicit supersolution (1.6) provides an upper bound for the depth.

1.2 One dimensional case

It is useful to consider how this definition relates to the classical definition of quantiles, depths, and medians in one dimension. Given a density function on the real line, we may readily define the quantile depth by the formula

$$D_Q(x) = \min\left( \int_{-\infty}^{x} \rho(z)\,dz,\ \int_{x}^{\infty} \rho(z)\,dz \right).$$

We may then express this equation in many different forms. For example, we may rewrite

$$D_Q(x) = \min_{a \neq 0} \int_{\{y \,:\, a(x - y) \leq 0\}} \rho(y)\,dy.$$

This definition is, naturally, unnecessarily complicated in $\mathbb{R}$, as we only need to really check over $a \in \{-1, 1\}$. However, this definition readily extends to $\mathbb{R}^d$ by simply replacing the region of integration by $\{y \in \mathbb{R}^d : a \cdot (x - y) \leq 0\}$: this corresponds to integrating over halfspaces, and matches the definition of the classical Tukey depth.

On the other hand, in the context of the eikonal depth in one dimension, we notice that

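$$\begin{aligned} D_{eik}(x) &= \min_{\gamma \in U_x} \int_0^\infty \rho(\gamma(t))\,|\dot\gamma(t)|\,dt \\ &= \min\left( \min_{\gamma \in U_x,\ \dot\gamma \geq 0} \int_0^\infty \rho(\gamma(t))\,|\dot\gamma(t)|\,dt,\ \min_{\gamma \in U_x,\ \dot\gamma \leq 0} \int_0^\infty \rho(\gamma(t))\,|\dot\gamma(t)|\,dt \right) \\ &= \min\left( \int_{-\infty}^{x} \rho(z)\,dz,\ \int_{x}^{\infty} \rho(z)\,dz \right) = D_Q(x), \end{aligned}$$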

where the second equality holds since we can always decrease the cost by replacing a trajectory with one that is monotone, and the third equality holds by the change of variables formula.

As is the case with the Tukey depth, such a definition is unnecessarily complex, as there are only truly two paths leading to spatial infinity along the real line. But this definition extends directly to higher dimension, motivating Definition 1.1.

We also notice that in one dimension we may directly verify that the quantile depth, which is equal to the eikonal depth, is a viscosity solution of the eikonal equation (1.3), under mild continuity assumptions upon $\rho$. Indeed, the quantile depth is differentiable at any point except for on the boundary of the set $\{x : \int_{-\infty}^{x} \rho(z)\,dz < \int_{x}^{\infty} \rho(z)\,dz\}$, and by computing the derivative using the fundamental theorem of calculus we see that it will satisfy the equation in the classical sense at those points, which automatically implies the super- and sub-solution inequalities. On points where $D_Q$ is not differentiable, we can directly verify that the super- and sub-solution inequalities are satisfied.
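The one-dimensional statement can be checked numerically. The following is a minimal sketch, assuming a standard normal density and the unnormalized choice $\phi(s) = s$; the helper names are hypothetical. It compares a central finite difference of the quantile depth with the density at a few points away from the median, where the depth is differentiable.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def quantile_depth(x):
    """D_Q(x): the smaller of the two tail masses at x."""
    return min(normal_cdf(x), 1.0 - normal_cdf(x))


def eikonal_residual(x, h=1e-5):
    """| |D_Q'(x)| - rho(x) |, with D_Q' estimated by a central difference."""
    slope = (quantile_depth(x + h) - quantile_depth(x - h)) / (2.0 * h)
    return abs(abs(slope) - normal_pdf(x))


# Points chosen away from the median x = 0, where D_Q has a kink.
residuals = [eikonal_residual(x) for x in (-2.0, -0.5, 0.7, 1.5)]
max_residual = max(residuals)
median_depth = quantile_depth(0.0)
```

The residual is small at every test point, consistent with $|u'| = \rho$ holding classically away from the median, while the depth attains its maximum value of one half at the median itself.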

1.3 Related literature

Multivariate medians and depths have been studied within the context of robust statistics for many years: for example the Tukey depth was introduced in the mid 1900’s [27, 26, 52]. Many alternative notions of depth have been proposed. These include, for example, the projection depth [57], the Oja depth [42], the zonoid depth [17], the Mahalanobis depth [35], the convex peeling depth [3], and the Monge-Kantorovich depth [12]. These depths each carry unique advantages and disadvantages, which are compared in [39]. In briefest summary, these depths tend to either be i) robust and interpretable, but difficult to compute (e.g. the Tukey, projection, and convex peeling depths), ii) interpretable and computable but not as expressive or robust (e.g. the Mahalanobis depth), or iii) not as easy to interpret (e.g. the zonoid or Monge-Kantorovich depth).

Depth functions have seen application in many different settings. They help to identify inliers and outliers of distributions, which can be an important pre-processing step in many tasks [57, 28]. They provide an ordering of data, which is useful for certain types of statistical tasks [54, 55]. For many choices of depth these quantities are robust, meaning that they are insensitive to certain types of (potentially adversarial) perturbations [56]. Depths have also been used for data visualization [44].

Despite their central place in robust statistics, our understanding of many of their properties remains incomplete. For example, in the last five years there have been many works exploring analytical properties of halfspace depths, including connections to convex geometry [40] and differential equations [37]. Fundamental questions about whether depths characterize their distributions, and whether depth functions are smooth and well approximated by empirical approximations, are still of current interest [36, 41, 40]. Similar questions in the context of other depths have also been the topic of recent work [9, 13]. Some extensions of the halfspace depth to non-Euclidean settings have also been given in [10], [49].

Even defining what is meant by a statistical depth is not immediately obvious. The definition from one prominent work [56] is discussed in more detail in Section 2.4. However, some of those notions are rather rigid, and have been relaxed in other works. For example, the convexity of level sets is relaxed in [12] and subsequent works. Some authors prefer depths that can capture clustering and multi-modality. This often comes by eschewing a globally defined depth for one that is locally defined: this is a sticking point for many authors that prefer to think of depths as globally defined orderings that are distinct from local densities. Our work charts a path between these two points of view, by constructing a depth which is globally defined, but which is flexible enough to capture multi-modal behavior.

Our definition builds naturally upon a very mature literature surrounding optimal control and Hamilton-Jacobi equations, which only very recently have been linked to statistical depths in the context of the halfspace depth in [37] and the convex hull depth in [9]. The eikonal equation, usually with constant right hand side, is the most classical example of a viscosity solution of a Hamilton-Jacobi equation, and references on that topic may be found in [1, 29]. This equation also characterizes distance functions, which are a central topic in Riemannian [16] and, more generally, metric geometry [5]. Indeed, in some ways the depth that we define here may be seen as a modest generalization of the distance function in those contexts, but the interpretation as a statistical depth is, to our knowledge, novel.

Numerical solutions for Hamilton-Jacobi equations, and more specifically the eikonal equation, have been well-studied. The fast marching method [46], which marries the dynamic programming principle with upwind numerical schemes, is a standard tool in solving these equations numerically. This method has also been extended to unstructured grids [47] and graphs [15]. We give a very brief treatment of these methods, with a special focus on graphs based upon empirical measures, in Section 4.1.

Finally, there has been a lot of recent activity proving a rigorous connection between variational methods on graphs and their continuum limits. While many of these works have focused on graph Laplacians and total variation energies and their associated statistical problems (see, e.g., [20, 21, 22, 24]), a few recent works have also made steps in this direction in the context of eikonal equations [8, 18]. Discussion of empirical approximations of the boundary of domains, an issue we encounter in Section 3, can also be found in [8], [53]. These works provide context for the scalings that we chose in Section 4.2 when constructing depths on empirical measures. Finally, while preparing this work we were made aware of another group independently working on a similar class of eikonal equations on graphs [7]. Their work is especially focused on cluster-aware distances, wherein $\phi$ is monotonically decreasing, but includes our framework as a particular case.

2 Properties of eikonal depth

2.1 Main properties

Before describing in more detail the properties of the eikonal depth, it is illustrative to consider a few examples where the depth can be explicitly calculated.

Example 2.1.

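Let $\Omega$ be a bounded, open, convex domain, and let $\rho = \frac{1}{\mathcal{L}^d(\Omega)} \mathbb{1}_{\Omega}$ be the uniform density on $\Omega$. Then the eikonal depth is given by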

$$D_{eik}(x) = \phi\left( \frac{1}{\mathcal{L}^d(\Omega)} \right) d(x, \partial\Omega), \qquad d(x, \partial\Omega) := \inf_{y \in \partial\Omega} |x - y|.$$

We notice that if $\phi(s) = s^\alpha$ and we replace $\Omega$ with $a\Omega$ then the leading term in the expression scales like $a^{-\alpha d}$. It is straightforward to show that the term $d(x, \partial\Omega)$ will scale like $a$ when we rescale $\Omega$, and hence when we use $\alpha = 1/d$ we observe that the eikonal depth is scale invariant in this example.

In the case where $\Omega$ is given by the unit ball, we may directly compute

$$D_{eik}(x) = (1 - |x|)\,\phi\left( \frac{\Gamma(d/2 + 1)}{\pi^{d/2}} \right),$$

where $\Gamma$ is the Gamma function.

We notice that if we let $\phi(s) = s^{1/d}$ then the power $d/2$ disappears from the $\pi^{d/2}$, and one can use Stirling’s formula to approximate $\Gamma(d/2 + 1)^{1/d} \approx \sqrt{d/(2e)}$, which in turn implies that the maximum eikonal depth of a unit ball in $\mathbb{R}^d$ is well-approximated by $\sqrt{d/(2\pi e)}$. This implies that for large $d$ the maximum eikonal depth of a spherical distribution behaves like $\sqrt{d}$, which stands in contrast with the maximum of the Tukey depth of a spherical distribution, $1/2$, which is independent of the dimension $d$.
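The dimension dependence of the unit-ball example can be tabulated directly. The sketch below assumes the normalized choice $\phi(s) = s^{1/d}$ and evaluates the closed form at the center of the ball, $(\Gamma(d/2+1)/\pi^{d/2})^{1/d}$, against the large-$d$ approximation $\sqrt{d/(2\pi e)}$ obtained from Stirling's formula. The function names are hypothetical, and `math.lgamma` is used only to avoid overflow in high dimension.

```python
import math


def max_normalized_depth_unit_ball(d):
    """Normalized eikonal depth at the center of the uniform unit ball in R^d."""
    # log of the uniform density Gamma(d/2 + 1) / pi^(d/2)
    log_density = math.lgamma(d / 2.0 + 1.0) - (d / 2.0) * math.log(math.pi)
    # phi(s) = s^(1/d), and the distance from the center to the boundary is 1
    return math.exp(log_density / d)


def stirling_approximation(d):
    """Leading-order large-d behavior, sqrt(d / (2 * pi * e))."""
    return math.sqrt(d / (2.0 * math.pi * math.e))


dimensions = (2, 10, 100, 1000)
ratios = {
    d: max_normalized_depth_unit_ball(d) / stirling_approximation(d)
    for d in dimensions
}
```

The ratio of the exact value to the approximation decreases toward one as $d$ grows, so the square-root growth in $d$ is already visible at moderate dimension.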

Example 2.2.

Let $F$ be a multivariate normal distribution with identity covariance. We notice that we may write , where $G$ is a standard one-dimensional Gaussian. Then we compute that the depth of $x$ is given by

$$D_{eik}(x) = \int_{|x|}^{\infty} \phi\left( G(t)^d \right) dt.$$

When we use $\phi(s) = s^{1/d}$, this simplifies to just the tail integral of a one-dimensional Gaussian. In this case the maximum depth is given by $1/2$ independently of $d$.

A well-known property of the Tukey depth is its invariance under affine transformations, a property shared by some other statistical depths such as the Mahalanobis, simplicial, zonoid and convex peeling depths [39]. Although the eikonal depth is not invariant under affine transformations, an affine invariant version of it can be constructed, in the same way as for the Oja, spatial, and lens depths [39]. More details on this procedure are provided after Proposition 2.4. It is not hard to see that the eikonal depth is not affine invariant: as can be seen from its control interpretation, the value of the depth depends on the length of trajectories. A given affine transformation can modify the length of a curve while keeping the velocity along the curve unchanged, which may result in a change in the minimum traveling time in Definition 1.1. The eikonal depth is, however, invariant under rigid motions:

Proposition 2.3.

For any choice of $\phi$ the eikonal depth is invariant under rigid (i.e. distance preserving) affine transformations.

Proof.

The proof of the previous proposition simply rests on the fact that the cost associated with any control is unchanged by such a rigid affine transformation. ∎

Next, we consider the effect of uniform scalings of upon the eikonal depth. We observe invariance, up to a scaling factor, for transformations of the type for .

Proposition 2.4.

For any , , the eikonal depth is weakly scale invariant in the sense that if , with any and , and is the eikonal depth associated with (according to the function ) then the eikonal depth associated with is given by .

Proof.

Let be a path which begins on and satisfies . Let . By the definition of we have that , where . The cost associated with is given by

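$$\int_0^T \tilde\rho^{\alpha}(\tilde\gamma(t))\,|\dot{\tilde\gamma}(t)|\,dt = \int_0^T a^{-\alpha d} \rho^{\alpha}(\gamma(t))\, a\,|\dot\gamma(t)|\,dt.$$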

Taking an infimum over paths on both sides gives the desired result. ∎
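The factor $a^{-\alpha d} \cdot a = a^{1 - \alpha d}$ appearing in the displayed identity can be sanity-checked against the closed form of Example 2.1. The sketch below assumes $\phi(s) = s^\alpha$ and a uniform distribution on a ball of radius $a$, for which the depth at the center is the radius times $\phi$ of the constant density; the function names and the particular values of $d$ and $a$ are illustrative assumptions.

```python
import math


def unit_ball_volume(d):
    """Volume of the unit ball in R^d."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)


def center_depth_uniform_ball(radius, d, alpha):
    """Depth at the center of a uniform ball, via the formula of Example 2.1."""
    density = 1.0 / (unit_ball_volume(d) * radius ** d)
    # phi(density) times the distance from the center to the boundary
    return (density ** alpha) * radius


d, a = 3, 2.5

# Unnormalized depth (alpha = 1): rescaling by a multiplies the depth by a^(1 - d).
ratio_unnormalized = (
    center_depth_uniform_ball(a, d, 1.0) / center_depth_uniform_ball(1.0, d, 1.0)
)

# Normalized depth (alpha = 1/d): the exponent 1 - alpha * d vanishes.
ratio_normalized = (
    center_depth_uniform_ball(a, d, 1.0 / d)
    / center_depth_uniform_ball(1.0, d, 1.0 / d)
)
```

With $\alpha = 1$ the ratio matches $a^{1-d}$, and with $\alpha = 1/d$ it equals one, which is the scale invariance of the normalized depth in this example.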

We notice that in the previous proposition choosing $\alpha = 1/d$ is a critical scaling, in which the depth function is invariant under uniform scalings. This property is important because a result by Serfling [45] guarantees that functions with this invariant property can be made affine invariant using a scatter transform. Following [39], a scatter matrix is a positive definite matrix that satisfies for any of full rank, any , and some . In order to make the depth affine invariant, one needs to transform the data as where is the scatter matrix and is a location parameter. A computational example of this procedure applied to a mixture of two Gaussian distributions is shown in Figure 3, where we start with two Gaussian distributions centered at the origin and with diagonal covariance matrices. Using the covariance matrix of the Gaussian mixture as the scatter matrix , and the mean of the Gaussian mixture, , as the location parameter, we transformed the data as .

The Tukey depth admits a uniform maximal bound independent of dimension, namely $1/2$. It is natural to consider whether eikonal depths have a similar property. The following proposition provides a first step in that direction.

Proposition 2.5.

Within the class of radially symmetric distributions:

1. For any $d \geq 2$ there exist distributions with arbitrarily large maximal unnormalized eikonal depth.

2. For any $d \geq 2$ there exist distributions with arbitrarily large maximal normalized eikonal depth.

Proof.

First, for the unnormalized depth, we notice that by simply scaling the independent variable by a factor $a$ the depth function changes by a factor of $a^{1-d}$ (see Proposition 2.4). In turn, we may obtain the first result by simply scaling any radially symmetric function appropriately.

For the second, we consider truncations of the function $|x|^{-d}$. In particular, we let

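$$\rho_\varepsilon(x) = \begin{cases} (1 - \delta)\,\dfrac{1}{-\ln(\varepsilon)\, S_{d-1}}\, |x|^{-d} & \text{for } \varepsilon < |x| < 1, \\[2ex] \delta\,\dfrac{1}{V_d(\varepsilon)} & \text{otherwise,} \end{cases}$$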

where $\delta$ is such that $\rho_\varepsilon$ is continuous at $|x| = \varepsilon$. Here we use $S_{d-1}$ to denote the surface area of the $(d-1)$-dimensional sphere and $V_d(\varepsilon)$ to denote the volume of the $d$-dimensional ball of radius $\varepsilon$. The maximal depth for this distribution is clearly attained at the origin. This value may be computed by integrating, after a suitable rotation, along the line segment described by $(t, 0, \dots, 0)$, where $t \in [0, 1]$. For the normalized eikonal depth we have

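$$D_{eik}(0) = \int_0^1 \rho_\varepsilon^{1/d}(t)\,dt \geq \int_\varepsilon^1 \rho_\varepsilon^{1/d}(t)\,dt = \int_\varepsilon^1 \frac{(1 - \delta)^{1/d}}{(-\ln(\varepsilon))^{1/d}\, S_{d-1}^{1/d}\, |t|}\,dt = \frac{(1 - \delta)^{1/d}\,(-\ln(\varepsilon))^{1 - 1/d}}{S_{d-1}^{1/d}}. \tag{2.1}$$

Since the continuity condition forces $\delta \to 0$ as $\varepsilon \to 0$, taking $\varepsilon \to 0$ gives the desired result. ∎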
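The growth of the lower bound in (2.1) can be tabulated. The sketch below works under stated assumptions: the density on the shell $\varepsilon < |x| < 1$ is $(1-\delta)\,|x|^{-d} / (-\ln(\varepsilon)\, S_{d-1})$, the inner ball carries density $\delta / V_d(\varepsilon)$ with $V_d(\varepsilon) = \varepsilon^d S_{d-1}/d$, and matching the two at $|x| = \varepsilon$ gives $\delta = 1/(1 - d\ln\varepsilon)$, a formula derived here rather than quoted from the paper. The function names are hypothetical.

```python
import math


def sphere_area(d):
    """Surface area of the unit sphere in R^d."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)


def depth_lower_bound(eps, d):
    """Integral of rho_eps^(1/d) along a ray from radius eps to radius 1."""
    log_term = -math.log(eps)
    # Continuity of the density at |x| = eps (assumption stated in the lead-in).
    delta = 1.0 / (1.0 + d * log_term)
    prefactor = ((1.0 - delta) / (log_term * sphere_area(d))) ** (1.0 / d)
    # The remaining integral of 1/t over [eps, 1] equals -ln(eps).
    return prefactor * log_term


d = 2
epsilons = [10.0 ** (-k) for k in (1, 5, 25, 125)]
bounds = [depth_lower_bound(eps, d) for eps in epsilons]
```

The bound grows without limit as $\varepsilon$ shrinks, but only like a power of $-\ln(\varepsilon)$, so very small truncation radii are needed before the normalized depth at the origin becomes large.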

The Tukey depth also admits a dimension dependent uniform lower bound on its maximal value. In particular, for any distribution on $\mathbb{R}^d$ the point of greatest halfspace depth will have depth greater than or equal to $\frac{1}{d+1}$. The following example demonstrates that this is not the case for the eikonal depth.

Example 2.6.

For $d \geq 2$, and for any $\phi$ which satisfies $\phi(0) = 0$, there exist distributions with arbitrarily small maximum eikonal depth. Let $\varphi$ be a smooth probability density with compact support on , and let our probability distribution be given by

$$\rho(x) = \frac{1}{K} \sum_{i=1}^{K} \varphi(x - x_i),$$

where the $x_i$ are distinct elements of $\mathbb{Z}^d$, namely they are points with distinct integer coordinates. We claim that the maximal eikonal depth of this distribution may be bounded by . This can be proven by constructing a path from spatial infinity to any point which only crosses the support of one of the $\varphi(\cdot - x_i)$, and then bounding the cost of that path due to . More specifically, let be an upper bound on , and consider a point so that (other points can be handled in an analogous way). We then consider the trajectory defined by joining the ray , , with the line segment connecting to . The integrated cost due to the ray out to infinity will be zero, as the $x_i$ all have integer coordinates and $\varphi$ has support on . On the other hand, the cost due to the line segment connecting to will have cost at most , as the density is bounded from above by , $\phi$ is non-decreasing, and the length of the line segment to is at most . This proves the claim.

The eikonal depth does not necessarily define a single center, i.e. a point with maximal depth, in contrast to the Tukey depth and other well-studied statistical depths such as the Mahalanobis depth, or the convex peeling depth [39]. In some settings, this is actually a desirable property of the eikonal depth because it enables it to capture the multimodality of probability distributions, a property that is not shared by the aforementioned definitions of depth. In Section 2.4 we present a detailed discussion on the choice one faces between defining a center outward ordering and capturing the multimodality of distributions when defining a statistical depth. For our eikonal depth, we will see that when the distribution has several peaks that are high enough then there is a local maximum of the depth near each peak. Before formalizing this result for multimodal distributions we first need to introduce the following definition for the modes:

Definition 2.7.

We say that a distribution $F$ has well-separated modes with respect to $\phi$ if

1. Its density function $\rho$ has a finite number of local maxima (modes), which we write as $\{x_i\}$.

2. There exists a finite, disjoint family of open balls $\{B_i\}$ so that each $B_i$ contains exactly one $x_i$.

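3. For all $i$ and for any two points $\tilde{x}, \hat{x} \in \partial B_i$ there exists a curve $\gamma_{\tilde{x}\hat{x}}$ which joins them and belongs to $\partial B_i$ so that we have $\int_{\gamma_{\tilde{x}\hat{x}}} \phi(\rho)\,ds \leq \int_{\gamma} \phi(\rho)\,ds$, where $\gamma$ is an arbitrary curve joining $\tilde{x}$ and the mode $x_i$.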

With this definition in hand we can show the next property that provides a way of associating certain classes of modes with local maxima of the eikonal depth.

Proposition 2.8.

Suppose that $F$ has well-separated modes with respect to $\phi$, and let $B_i$ be the balls used to describe the well-separated property of the mode occurring at $x_i$. Then there is a local maximum of the eikonal depth inside $B_i$.

Proof.

Let $\gamma^*$ be an optimal trajectory, according to the energy (1.2), terminating at the mode $x_i$. This path must pass through some point, which we call $\tilde{x}$, on the boundary of $B_i$. Now let $\hat{x}$ be any point on the boundary of $B_i$ and $\gamma_{\tilde{x}\hat{x}}$ be the path joining $\tilde{x}$ and $\hat{x}$ from Definition 2.7. Then we have by Definition 1.1, the well-separated property and the choice of $\tilde{x}$, that

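$$D_{eik}(\hat{x}, F) \leq D_{eik}(\tilde{x}, F) + \int_{\gamma_{\tilde{x}\hat{x}}} \phi(\rho)\,ds \leq D_{eik}(\tilde{x}, F) + \inf_{\gamma \in U_{\tilde{x}x_i}} \int_{\gamma} \phi(\rho)\,ds = D_{eik}(x_i, F),$$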

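where $U_{\tilde{x}x_i}$ denotes the set of paths starting at $\tilde{x}$ and ending at $x_i$. As $D_{eik}$ is continuous, on the set $\overline{B_i}$ it must attain its maximal value. For any $\hat{x} \in \partial B_i$ we have that $D_{eik}(\hat{x}, F) \leq D_{eik}(x_i, F)$, which means that we can always find a maximizer of $D_{eik}$ in the interior of $\overline{B_i}$. In turn $D_{eik}$ has a local maximizer in $B_i$. We note that such a maximizer need not be exactly $x_i$. ∎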

We notice that while directly proving the well-separated property may be challenging, the idea is rather intuitive: if the mode is steep enough then it should be “cheaper” to go around than to go through. The next example illustrates one setting where such a property can be established.

Example 2.9.

Consider a Gaussian probability distribution in with mean located at the origin and with covariance matrix $\sigma^2 I$, where $I$ is the identity matrix and $\sigma$ is a specified standard deviation. The probability density for this distribution is given by . We have that the cost of going around the mode along a circular arc of radius and angle , which is parametrized as with after a suitable rotation, is given by . On the other hand, the cost of going from any point on the circular path to the mode can be calculated, after a suitable rotation, as the integral of along the path with : .

In particular, for , the two costs are and . These two quantities are comparable if , as can be checked numerically. So we expect that, heuristically speaking, once we are two standard deviations away from the mean it is cheaper to go around than to go over the mode.
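The comparison between going around and going through a mode can be reproduced numerically under explicit assumptions, none of which are taken from the elided formulas above: the dimension is two, $\phi(s) = s$, the arc is a half circle (angle $\pi$) of radius $c\sigma$, and the radial path is the straight segment from that radius to the mode. The function names are hypothetical. Under these assumptions the radial cost has a closed form in terms of the error function, and a bisection locates the radius at which the two costs coincide.

```python
import math


def arc_cost(c, sigma=1.0, angle=math.pi):
    """Cost of a circular arc of radius c * sigma around an isotropic 2D Gaussian."""
    r = c * sigma
    density = math.exp(-r * r / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
    return angle * r * density  # constant density along the arc times arc length


def radial_cost(c, sigma=1.0):
    """Cost of the straight segment from radius c * sigma to the mode."""
    # Integral of the 2D Gaussian density along a ray, written with erf.
    return math.erf(c / math.sqrt(2.0)) / (2.0 * sigma * math.sqrt(2.0 * math.pi))


def crossover(lo=1.0, hi=3.0, tol=1e-10):
    """Radius (in units of sigma) at which the arc and the radial path cost the same."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if arc_cost(mid) > radial_cost(mid):
            lo = mid  # going around is still more expensive at this radius
        else:
            hi = mid
    return 0.5 * (lo + hi)


c_star = crossover()
```

Under these particular assumptions the crossover lies between 1.7 and 1.8 standard deviations, in line with the heuristic that roughly two standard deviations from the mean it becomes cheaper to go around the mode; other choices of $\phi$, angle, or dimension shift the value.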

Now consider a Gaussian mixture model in . If each component of the mixture model has covariance $\sigma^2 I$, where $I$ is the identity matrix and $\sigma$ is a specified standard deviation, then the distribution will have well-separated modes if the contribution of one mixture component is negligible approximately two standard deviations away from the mean of any other mixture component. Thus we expect, again heuristically, that if components are more than four standard deviations apart then each mode should be identified as a local “center point” by the eikonal depth.

Figure 4 shows a computational example, where we compute the eikonal depth for two mixtures of Gaussian probability distributions with in which the Gaussian distributions are, respectively, four standard deviations apart and two standard deviations apart. In the former case the modes are well-separated, according to Definition 2.7, and hence the depth has two local maxima. In the latter case the Gaussians are close enough together that the modes are not well-separated, and hence we only have one local maximum of the depth.

We now address stability and uniqueness of the eikonal depth. It is desirable for a statistical depth to be uniquely associated to the considered set of probability distributions. In this regard, it is worth pointing out that the Tukey depth completely characterizes the input distribution for finite discrete measures [30, 50] and among the class of rapidly decaying distributions (i.e. those which decay faster than exponential) [31]. In some settings the Tukey depth is known not to characterize the underlying distribution [41]. In contrast, the following results quantify in what sense eikonal depths and their underlying distributions form a one-to-one correspondence.

Proposition 2.10.

The eikonal depth is continuous in the input distribution. If $\rho_1, \rho_2$ are supported on the closure of a bounded and open set $\Omega$ and satisfy bounds then

$$\| D_{eik}(x, \rho_1) - D_{eik}(x, \rho_2) \|_\infty \leq \ell\, \| \rho_1 - \rho_2 \|_\infty,$$

where .

Proof.

We start by finding a bound on the maximum distance that a particle can travel starting at a point in . Let us denote by and denote respectively the maximum and minimum velocity allowed by the distributions . From the assumptions we have that and . The longest possible time a particle can travel before reaching is then . If we denote by the longest possible distance a particle can travel from a point in then we have that .

Let be an optimal trajectory associated to the distribution with density , that is, realizes the infimum in Definition 1.1 and for some . We have that

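\begin{align}
D_{eik}(x,\rho_1) &= \inf_{\gamma\in U_x}\int_\gamma \rho_1\,ds \le \int_0^b \rho_1(\gamma_2^*(t))\,|\dot\gamma_2^*(t)|\,dt \tag{2.2}\\
&= \int_0^b \big[\rho_2(\gamma_2^*(t)) + (\rho_1-\rho_2)(\gamma_2^*(t))\big]\,|\dot\gamma_2^*(t)|\,dt \le D_{eik}(x,\rho_2) + \ell\,\|\rho_1-\rho_2\|_\infty. \tag{2.3}
\end{align}

Here the first inequality holds because the optimal trajectory for ρ2 is an admissible path for ρ1, and the last one because that trajectory has length at most ℓ. Exchanging the roles of ρ1 and ρ2 yields the reverse inequality, which proves the claim. ∎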

Proposition 2.11.

Assuming that ϕ is injective, any probability distribution with a continuous probability density supported on an open and bounded domain or on ℝ^d is uniquely determined by its ϕ-eikonal depth.

Proof.

Assume that we have two distinct probability distributions with continuous densities such that . Any viscosity solution to the eikonal equation will satisfy the equation classically at almost every point in its domain [1]. Hence and will have a common point of differentiability at a point where . By the injectivity of , this implies that the gradients of and will not match at and therefore the depths cannot be the same. ∎

Propositions 2.10 and 2.11 have established a type of robustness of the eikonal depth. The following proposition establishes a scenario where the eikonal depth is not robust.

Proposition 2.12.

Let be a continuous density function, and suppose that with . Then by modifying an arbitrarily small amount of density we can modify the eikonal depth at a single point by any desired amount.

Proof.

We can decrease the depth at the point by carving a very thin path to that point. For we can increase the depth at a point arbitrarily by adding mass in a small enough ball centered at . For we truncate the profile , as in the second part of Proposition 2.5. ∎

While the previous proposition does establish a type of non-robustness, we notice that the types of modifications we made in the proof would only modify the depth in small regions. It seems unlikely that small modifications to the distribution would be able to modify the depth in large regions, but we do not pursue proving such a property here.

To wrap up this section, we also give two conceptual properties, related to the optimal control formulation of the eikonal depth, which allow us to bound the depth and to simplify its computation. These properties do not have immediate analogs for other depth functions, but we believe they are useful and worth mentioning. First, in many settings it is possible to give upper or lower bounds on probability densities, and it seems natural to try to use those to provide bounds on the associated depth functions. The following proposition gives a comparison principle for the eikonal depth that relates upper and lower bounds for the depth to upper and lower bounds for the probability density. This proposition cannot be directly applied to two probability densities, as one cannot have a global inequality between such densities. However, using the scaling invariance property from Proposition 2.4, in some settings one can use the following proposition to provide upper and lower bounds on eikonal depths.

Proposition 2.13.

The eikonal depth satisfies a comparison principle in the sense that if for all then for all .

Proof.

This is immediate from the control formulation of the eikonal depth, in the sense that if then the cost associated with any will be greater when replacing with in (1.2), and hence the infima will satisfy the same inequality. ∎
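As a small illustration of the comparison principle, the sketch below works in one dimension on a truncated interval, where the depth reduces to the smaller of the left and right integrals of the density. The interval `[-5, 5]`, the grid, and the two densities are assumptions chosen for illustration. The larger function is deliberately not normalized, in line with the remark that two probability densities cannot satisfy a global inequality.

```python
import math


def depth_1d(rho, h):
    """1D eikonal depth on a uniform grid: cheaper of the leftward and rightward exits."""
    n = len(rho)
    left = [0.0] * n
    right = [0.0] * n
    for i in range(1, n):
        left[i] = left[i - 1] + 0.5 * h * (rho[i - 1] + rho[i])
    for i in range(n - 2, -1, -1):
        right[i] = right[i + 1] + 0.5 * h * (rho[i] + rho[i + 1])
    return [min(l, r) for l, r in zip(left, right)]


n = 401
h = 10.0 / (n - 1)
xs = [-5.0 + i * h for i in range(n)]

# rho_small is a standard Gaussian density; rho_large dominates it pointwise.
rho_small = [math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi) for x in xs]
rho_large = [r * (1.0 + 0.5 * math.sin(3.0 * x) ** 2) for r, x in zip(rho_small, xs)]

depth_small = depth_1d(rho_small, h)
depth_large = depth_1d(rho_large, h)

comparison_holds = all(a <= b + 1e-12 for a, b in zip(depth_small, depth_large))
print(comparison_holds)
```

Every path cost under the larger density dominates the corresponding cost under the smaller one, so the pointwise inequality between the depths is inherited directly from the inequality between the integrands.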

Next we give a basic property of the eikonal depth, which is fundamental for numerical approximation schemes. Heuristically, this property implies that the eikonal depth can be determined using the value of the density at and the value of the depth at neighboring points, which permits us to construct efficient, local, numerical schemes.

Proposition 2.14.

Let and suppose that we know the values of on the boundary of . Then the values of can be determined using only the density of inside .

The proof of this proposition is immediate from the dynamic programming principle: any optimal path beginning at a point in the interior of will also be optimal from the time it leaves onward. Indeed, we already used this idea in the proof of Proposition 2.8. This simple observation also factors prominently in the construction of numerical methods, see Section 4.1.
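The locality property can be observed on a discrete analogue. The sketch below is an illustration under assumptions, not the scheme of Section 4.1: it uses a graph shortest-path approximation on a grid, a single Gaussian density, and a hypothetical helper `dijkstra`. It first computes the depth on the whole grid, then recomputes it inside a sub-square using only the depth values on the perimeter of that square and the density inside it.

```python
import heapq
import math


def dijkstra(rho, h, seeds, allowed):
    """Cheapest path cost to a seed node, moving only through nodes in `allowed`.

    `seeds` maps nodes to prescribed values; edge costs use the trapezoid rule.
    """
    value = dict(seeds)
    heap = [(v, node) for node, v in seeds.items()]
    heapq.heapify(heap)
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if d > value[(i, j)]:
            continue  # stale heap entry
        for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if nb not in allowed:
                continue
            cand = d + 0.5 * h * (rho[(i, j)] + rho[nb])
            if cand < value.get(nb, math.inf):
                value[nb] = cand
                heapq.heappush(heap, (cand, nb))
    return value


n, half_width = 41, 4.0
h = 2.0 * half_width / (n - 1)
nodes = {(i, j) for i in range(n) for j in range(n)}
rho = {
    (i, j): math.exp(-((-half_width + i * h) ** 2 + (-half_width + j * h) ** 2) / 2.0)
    / (2.0 * math.pi)
    for (i, j) in nodes
}

# Global computation: depth is zero on the outer boundary of the grid.
outer = {p: 0.0 for p in nodes if p[0] in (0, n - 1) or p[1] in (0, n - 1)}
global_depth = dijkstra(rho, h, outer, nodes)

# Local computation: only perimeter depth values and the density inside the sub-square.
lo, hi = 10, 30
sub_nodes = {(i, j) for (i, j) in nodes if lo <= i <= hi and lo <= j <= hi}
perimeter = {p: global_depth[p] for p in sub_nodes if p[0] in (lo, hi) or p[1] in (lo, hi)}
local_depth = dijkstra(rho, h, perimeter, sub_nodes)

max_gap = max(abs(local_depth[p] - global_depth[p]) for p in sub_nodes)
print(max_gap)
```

The two computations agree inside the sub-square because every cheapest path from an interior node must cross the perimeter, and its continuation beyond that crossing is already summarized by the perimeter value.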

2.2 Isometric robustness

Proposition 2.15.

Let be an invertible, mapping so that

\[ \|D\Phi - I\|_\infty \le \varepsilon. \]

Let , that is . Then

\[ (1-\varepsilon)\,D_{eik}(x,\rho) \le D_{eik}(\Phi(x),\tilde\rho) \le (1+\varepsilon)\,D_{eik}(x,\rho), \quad \text{for all } x\in\mathbb{R}^d. \tag{2.4} \]

Proof.

Consider a path satisfying and , and define a second path . We may readily compute

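\begin{align}
&\int_0^\infty \phi(\tilde\rho(\gamma_2(t)))\,|\dot\gamma_2(t)|\,dt \tag{2.5}\\
&\quad= \int_0^\infty \phi(\tilde\rho(\Phi(\gamma_1(t))))\,|D\Phi(\gamma_1(t))\,\dot\gamma_1(t)|\,dt \tag{2.6}\\
&\quad= \int_0^\infty \phi(\rho(\gamma_1(t)))\,|D\Phi(\gamma_1(t))\,\dot\gamma_1(t)|\,dt, \tag{2.7}
\end{align}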

where the last line follows because of the definition of and . By using the assumption on , we immediately have that , which readily implies that

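\[ (1-\varepsilon)\int_0^\infty \phi(\rho(\gamma_1(s)))\,|\dot\gamma_1(s)|\,ds \le \int_0^\infty \phi(\tilde\rho(\gamma_2(t)))\,|\dot\gamma_2(t)|\,dt \le (1+\varepsilon)\int_0^\infty \phi(\rho(\gamma_1(s)))\,|\dot\gamma_1(s)|\,ds. \]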

By taking the infimum over paths we then obtain that, for all ,

\[ (1-\varepsilon)\,D_{eik}(x,\rho) \le D_{eik}(\Phi(x),\tilde\rho) \le (1+\varepsilon)\,D_{eik}(x,\rho), \]

as desired. ∎
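The two-sided bound can be checked numerically in a simple one-dimensional setting. The sketch below is an illustration under assumptions: the map Φ(x) = x + ε sin(x), the Gaussian density, the truncated interval `[-8, 8]`, and the choice of ϕ as the identity are all picked for convenience. It uses the change of variables from the proof, so the depth of the transformed density at Φ(x) is computed by integrating ρ(s)·Φ'(s) rather than by building the transformed density explicitly.

```python
import math


def depth_1d(integrand, h):
    """1D depth on a uniform grid: cheaper of the left and right trapezoid integrals."""
    n = len(integrand)
    left = [0.0] * n
    right = [0.0] * n
    for i in range(1, n):
        left[i] = left[i - 1] + 0.5 * h * (integrand[i - 1] + integrand[i])
    for i in range(n - 2, -1, -1):
        right[i] = right[i + 1] + 0.5 * h * (integrand[i] + integrand[i + 1])
    return [min(l, r) for l, r in zip(left, right)]


eps = 0.2
n = 801
h = 16.0 / (n - 1)
xs = [-8.0 + i * h for i in range(n)]

rho = [math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi) for x in xs]
# Phi(x) = x + eps * sin(x) has derivative 1 + eps * cos(x), so |Phi' - 1| <= eps.
d_phi = [1.0 + eps * math.cos(x) for x in xs]

depth = depth_1d(rho, h)
# With rho_tilde(Phi(s)) = rho(s), substituting y = Phi(s) turns the integral of
# rho_tilde(y) dy into the integral of rho(s) * Phi'(s) ds.
depth_tilde = depth_1d([r * j for r, j in zip(rho, d_phi)], h)

bounds_hold = all(
    (1.0 - eps) * d - 1e-12 <= dt <= (1.0 + eps) * d + 1e-12
    for d, dt in zip(depth, depth_tilde)
)
print(bounds_hold)
```

Here `depth_tilde[i]` is the depth of the transformed density at the point Φ(xs[i]), so the comparison is between the depth at x and the depth at its image, as in (2.4).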

The fact that the eikonal depth is isometrically robust is a consequence of the inherent metric nature of its definition. Our viewpoint is that the eikonal depth is a measure of centrality or outliers in metric geometry in the same sense that the Tukey depth is a notion of centrality for convex geometry. Indeed, the very definitions, which are, respectively, based upon path length versus supporting halfspaces, are direct consequences of the underlying geometric viewpoint.

The notion of isometric robustness that we propose is quite different from the notion of breakdown point used in classical robust statistics. However, it is, in spirit, much closer to the study of distributionally robust optimization problems [11], or more generally adversarial training problems in statistics. In those contexts, one typically solves an optimization problem of the form

\[ \min_{\theta\in\Theta}\ \sup_{d(\mu,\tilde\mu)\le\varepsilon} \mathbb{E}_{\tilde\mu} R(\theta). \]

Here μ is an underlying data distribution, for example associated with a classification or regression problem, and R represents a risk function which depends upon the measure of the data points. The metric d represents the geometry of permissible data perturbations by a hypothetical adversary, and ε, called the adversarial budget, represents the power that the adversary has to corrupt the data. In many common adversarial learning problems the metric is some type of Wasserstein metric (see e.g. [23]), which permits adversaries to move many data points small distances. These adversaries are in one sense much weaker than the adversary supposed in studying the breakdown point: they typically cannot move points very far. On the other hand, they may move many points short distances, which makes them in another sense stronger than the adversary associated with the breakdown point. Utilizing this type of adversary has been highly effective in improving generalization for statistical algorithms in the context of deep learning [25], and has sparked significant algorithmic and theoretical work [4, 23, 34, 38, 43].

One can view the isometric robustness in Proposition 2.15 as following in the spirit of DRO problems, in that we are imagining an adversary that is given a limited adversarial budget expressed in terms of how much they can skew distances between points. Such adversaries are neither strictly weaker nor strictly stronger than the one imagined in studying the breakdown point, but are probably closer in practice to the robustness studied in contemporary DRO and adversarial learning problems based upon Wasserstein distances. It is natural to consider whether classical depths, such as the Tukey depth, are robust in the sense of Proposition 2.15. The following example shows that this is not the case.

Example 2.16.

Suppose that is given by a uniform distribution on two balls in of radius at the points . At the origin it is straightforward to compute that the Tukey depth will be . However, consider a mapping which leaves the two balls invariant, but moves the origin to . Such a mapping can be constructed so that . However, the Tukey depth of the new point will be . This demonstrates that the Tukey depth is not necessarily isometrically robust.

The previous example clearly shows that the Tukey depth is not robust with respect to approximate isometries. This should not be surprising, as the Tukey depth is really built upon definitions from convex geometry, as opposed to metric geometry.
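The behaviour in Example 2.16 can be reproduced on sampled data. The sketch below is an illustration under assumptions: the ball centres (±1, 0), the radius 0.25, the displaced query point (0, 0.5), the sample size, and the finite set of scan directions are illustrative choices, not values taken from the example. It only evaluates an empirical halfspace depth before and after moving the query point, and does not construct the mapping Φ itself.

```python
import math
import random


def sample_two_balls(m, radius, rng):
    """m points uniform in the ball around (1, 0), plus their reflections through the origin."""
    pts = []
    while len(pts) < m:
        u = rng.uniform(-radius, radius)
        v = rng.uniform(-radius, radius)
        if u * u + v * v <= radius * radius:
            pts.append((1.0 + u, v))
    return pts + [(-px, -py) for px, py in pts]


def tukey_depth(x, points, n_directions=360):
    """Empirical halfspace depth of x, scanning a finite set of directions.

    Returns the smallest fraction of points lying in a closed halfspace whose
    boundary passes through x; a finite scan gives an upper bound in general.
    """
    best = 1.0
    for k in range(n_directions):
        theta = 2.0 * math.pi * k / n_directions
        ux, uy = math.cos(theta), math.sin(theta)
        count = sum(
            1 for px, py in points
            if (px - x[0]) * ux + (py - x[1]) * uy >= 0.0
        )
        best = min(best, count / len(points))
    return best


points = sample_two_balls(500, 0.25, random.Random(0))
depth_origin = tukey_depth((0.0, 0.0), points)
depth_moved = tukey_depth((0.0, 0.5), points)
print(depth_origin, depth_moved)
```

At the origin every halfspace contains at least half of the sample, since the sample is symmetric through the origin. Once the query point is moved above both balls, a horizontal halfspace separates it from all of the data, so the halfspace depth drops to zero even though no data point has moved.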

2.3 Summary of properties

Thus far we have proved quite a few different properties of the eikonal depth. Along the way we have also provided discussion and comparison with the halfspace/Tukey depth. For convenience, here we provide a brief summary of these properties.

• Scaling: The eikonal depth is invariant under rigid transformations (Proposition 2.3), and is weakly scale invariant when (Proposition 2.4). It is not invariant under general linear transformations.

• Bounds: The eikonal depth does not admit direct upper or lower bounds on its maximum (Proposition 2.5 and Example 2.6). The example for large maximal values required densities which are unbounded from above. The example for small maximal values required that the density have large gradient. The depth does admit a convenient comparison principle, which in some cases can aid in bounding it (Proposition 2.13).

• Multi-modality: The eikonal depth has local maxima near prominent modes of the density (Proposition 2.8). Its level sets are not necessarily convex.

• Robustness: The eikonal depth uniquely determines the underlying density (Proposition 2.11). It admits a quantifiable stability in terms of the underlying density (Proposition 2.10). It also possesses a robustness to nearly isometric distortions (Proposition 2.15), which other classical depths do not enjoy. The eikonal depth is not robust in the sense of breakdown point (Proposition 2.12).

• One-dimensional equivalence: The eikonal depth is equivalent to the quantile depth in one dimension (Section 1.2).

2.4 What properties should a statistical depth have?

In the one-dimensional setting, there are natural ways to define center-outward orderings of data, as described via medians and quantile depths in Section 1.2. However, in higher dimensions there is no single canonical center-outward ordering associated with data points. This has led to myriad definitions of depths in higher dimension, as discussed in Section 1.3. This multitude of definitions naturally leads to the question of what it even means to be a statistical depth.

Several classical works have attempted to address this question. A set of desirable properties for a statistical depth, satisfied by many commonly used depths such as the Mahalanobis depth, the Tukey depth, and the simplicial depth, is given by Liu-Zuo-Serfling [56]. Serfling and Zuo consider that the defining properties of a statistical depth are affine invariance, maximality at center, monotonicity with respect to the central point, and vanishing at infinity. In their definition there exists a unique central point with maximal depth, the center, and the depth is decreasing along any ray emanating from that point. These characteristics define a center-outward ordering using the nested level sets of the depth function. Indeed, some works treat their definition as the de facto definition of what it means to be a statistical depth.

As Serfling and Zuo point out [56], when defining a statistical depth one needs to make a choice between having a center-outward ordering and the ability to capture multimodality. This is a direct consequence of the monotonicity property, which precludes the existence of several central points. Liu proposed a model of depth that prioritizes multimodality in [32]. Fraiman and Meloche [19] proposed a depth in this same direction, the likelihood depth, that is based on estimations of multivariate kernel densities. More recently, Lok and Lee [33] have proposed a depth focused on multimodality that is based on quantiles of interpoint distances.

The ability to deal with multimodality in the notion of a depth is a necessity for some applications: for instance if the data is well-clustered and the ordering associated with a depth is meant to preserve those clusters [32]. Even more generally, most classical depth functions produce nested orderings of convex sets, which may not be desirable for distributions with non-convex support (see e.g. the discussion in [12]).

The depth that we have constructed has properties which are distinct from many classical depth functions. Indeed, it fails to satisfy many of the points in the definition given in [56]. However, it flexibly captures multi-modal and clustered data, still satisfies natural notions of stability and robustness, is easily computable, and is easy to interpret. Even in the Euclidean setting, we posit that this depth provides a competitive alternative to many of the classical notions of depth. We shall also see, in subsequent sections, that this depth extends in very natural ways to non-Euclidean settings, further highlighting its versatility and potential applicability.

3 Extension to non-Euclidean settings

In settings where data is non-Euclidean, the question of how to appropriately define depths and medians becomes even less clear. Although there have been some attempts to generalize the halfspace depth to this setting (see e.g. [10, 49]), in general it is not clear how to best approach this problem. We shall see in this section that the definition of the eikonal depth extends very naturally to a wide class of metric settings, while still remaining interpretable and computable.

To begin, we recall our original definition of the eikonal depth:

\[ D_{eik}(x,F) := \inf_{\gamma\in U_x}\int_\gamma \rho\,ds = \inf_{\gamma\in U_x}\int_0^\infty \rho(\gamma(t))\,|\dot\gamma(t)|\,dt. \]

This definition, as it turns out, only requires a few basic ingredients, which are present in most metric settings. In particular, this definition extends very naturally as long as we have a way of defining the following:

1. A notion of continuous curves, and of speed of travel along those curves

2. A consistent notion of probability density along those curves

3. An appropriate boundary condition, namely that our curves approach “spatial infinity” as t → ∞.

Each of these requirements is immediately fulfilled in many non-Euclidean settings. In particular, we now give informal examples of different settings where this definition may be directly extended. In doing so we focus on conceptual ideas, at the cost of perhaps some precision and rigor.

Example 3.1 (Manifolds with boundary).

Suppose that is a -dimensional Riemannian manifold with non-empty boundary. On such a manifold, by using our Riemannian metric we can define a notion of curves, namely a mapping which is differentiable so that . We then suppose that we have a continuous density function . Finally, we let the boundary of our manifold define the “boundary condition”, namely we require that . Let us define to be the set of such curves which satisfy this boundary condition, and begin at at time . Under such assumptions, we can define a depth function using

\[ D_{eik}(x,\rho) := \inf_{\gamma\in U_x}\int_0^\infty \rho(\gamma(t))\,|\dot\gamma(t)|\,dt. \]

We notice that such a definition is actually identical to the Euclidean one, except that we have had to modify the class , to encode the geometry of the manifold through the boundary condition. We also notice that this framework could be translated into computing the distance of the point to the boundary under a new metric , where is the original metric for .

Example 3.2 (Unions of Riemannian manifolds).

We consider a finite collection of Riemannian manifolds with boundary, of possibly unequal dimension. We suppose that these manifolds admit some points of intersection: a canonical example would be when these manifolds are smooth surfaces embedded in Euclidean space, with non-trivial intersection. We then call a mapping a curve if there exists a partition of into closed intervals , and so that for each sub-interval is a curve of on one of the manifolds . Again, we can define the boundary condition using the boundary of the manifolds, in this case we let it be . We then use exactly the same definition for the eikonal depth.

Example 3.3 (Geodesic metric spaces).

Let us consider a geodesic metric space, which heuristically allows us to define distances between points using the length of continuous paths connecting those points. More specifically, we may define the length of a curve between two times via

\[ \ell(\gamma,t_1,t_2) := \sup_{(s_i)_{i=1}^{k+1}\in T[t_1,t_2]}\ \sum_{i=1}^{k} d(\gamma(s_i),\gamma(s_{i+1})). \]

Here is the set of all finite partitions of , and the are the boundaries of those partitions. A geodesic metric space is simply a space so that, if is the set of all continuous paths starting at at time and ending at at time , then .
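The length functional above can be approximated by evaluating the sum on successively finer partitions. The sketch below is an illustration under assumptions: it uses the Euclidean plane as the metric space, a half circle as the curve, and uniform partitions. The helper names are hypothetical.

```python
import math


def polygonal_length(curve, dist, times):
    """Sum of distances between consecutive points of the curve over a partition."""
    return sum(dist(curve(s), curve(t)) for s, t in zip(times, times[1:]))


def uniform_partition(t1, t2, k):
    """Partition of [t1, t2] into k sub-intervals of equal size."""
    return [t1 + (t2 - t1) * i / k for i in range(k + 1)]


def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def half_circle(t):
    return (math.cos(t), math.sin(t))


coarse = polygonal_length(half_circle, euclidean, uniform_partition(0.0, math.pi, 10))
fine = polygonal_length(half_circle, euclidean, uniform_partition(0.0, math.pi, 1000))
print(coarse, fine, math.pi)
```

By the triangle inequality, refining a partition can only increase the sum, which is why the length is defined as a supremum. For this curve the sums increase toward the arc length π.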

Geodesic metric spaces generalize Riemannian manifolds in that they do not require local Euclidean structure for length. Classical examples include Banach, sub-Riemannian, and Finsler spaces. One introductory text on these types of spaces is [5]. This notion, which is very general, does require the existence of continuous curves connecting points, and hence precludes discrete metric structure: we address this direction in our next example.

We choose to simply let denote our “boundary”: other settings could also be considered. We define to be the set of continuous functions such that .

Given such a metric space, and an associated probability measure on , associated with a continuous density , we then define

 Deik(x,ρ)=