# Localization from Incomplete Euclidean Distance Matrix: Performance Analysis for the SVD-MDS Approach

Localizing a cloud of points from noisy measurements of a subset of pairwise distances has applications in various areas, such as sensor network localization and reconstruction of protein conformations from NMR measurements. In [1], Drineas et al. proposed a natural two-stage approach, named SVD-MDS, for this purpose. This approach consists of a low-rank matrix completion algorithm, named SVD-Reconstruct, to estimate random missing distances, and the classic multidimensional scaling (MDS) method to estimate the positions of nodes. In this paper, we present a detailed analysis for this method. More specifically, we first establish error bounds for Euclidean distance matrix (EDM) completion in both expectation and tail forms. Utilizing these results, we then derive the error bound for the recovered positions of nodes. In order to assess the performance of SVD-Reconstruct, we present the minimax lower bound of the zero-diagonal, symmetric, low-rank matrix completion problem by Fano's method. This result reveals that when the noise level is low, the SVD-Reconstruct approach for Euclidean distance matrix completion is suboptimal in the minimax sense; when the noise level is high, SVD-Reconstruct can achieve the optimal rate up to a constant factor.

11/10/2020


## I Introduction

In many signal processing applications we work with distances because they are easy to measure. In sensor network localization, for example, each sensor simultaneously acts as a transmitter and receiver. It receives the signal sent by other sensors while emitting a signal to its surroundings. The useful information we can extract is the time-of-arrival (TOA) or received-signal-strength (RSS) between pairs of sensors, either of which can be seen as a metric of Euclidean distance [2]. Another example is the protein conformation problem. It has been shown by the crystallography community that after sequence-specific nuclear magnetic resonance (NMR) assignments, we can extract the information about the intramolecular distances from two-dimensional nuclear Overhauser enhancement spectroscopy (NOESY) [3]. Other examples include geometry reconstruction of a room from echoes [4], manifold learning utilizing distances [5], and so on.

If all the distances between pairs of nodes are available, then we can use the classic MDS algorithm [6] to recover the coordinates of nodes. It has been proved that if all distances are measured without any error, MDS finds the configuration of nodes exactly. Moreover, MDS tolerates errors gracefully in practice, as a complete EDM overdetermines the true solution. Here, it is worth noting that we cannot recover the absolute coordinates, since rigid transformation (including rotations, translations, reflections, and their combination) does not change the EDM.
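To make the classic MDS step concrete, the following is a minimal numpy sketch (not the authors' code) that recovers coordinates from a complete, noiseless squared-distance matrix, up to a rigid transformation. The function name `classic_mds` and the eigendecomposition route are illustrative choices.

```python
import numpy as np

def classic_mds(D, d):
    """Classic MDS: recover d-dimensional coordinates (up to a rigid
    transformation) from a complete squared-distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # geometric centering matrix
    M = -0.5 * J @ D @ J                     # Gram matrix of centered points
    w, V = np.linalg.eigh(M)                 # eigendecomposition (M symmetric)
    idx = np.argsort(w)[::-1][:d]            # d largest eigenvalues
    return (V[:, idx] * np.sqrt(np.maximum(w[idx], 0))).T  # d x n coordinates

# Sanity check: the recovered points reproduce the input EDM exactly.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 10))             # 10 points in R^3 (as columns)
G = X.T @ X
D = np.diag(G)[:, None] + np.diag(G)[None, :] - 2 * G   # squared EDM
X_hat = classic_mds(D, 3)
G_hat = X_hat.T @ X_hat
D_hat = np.diag(G_hat)[:, None] + np.diag(G_hat)[None, :] - 2 * G_hat
assert np.allclose(D, D_hat, atol=1e-8)
```

The final assertion checks exactly the property stated above: with error-free distances, MDS recovers the configuration exactly, although the absolute coordinates (orientation and position) are not identifiable.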

However, in many practical applications, it is often impossible to know all the entries of the EDM. In sensor network localization, for instance, due to limited transmission power, a sensor can only receive signals emitted by sensors that are not too far from it. In addition, in most cases, the sensor has limited precision, which results in measurement errors in the distances. Thus, the measured EDM may be incomplete and noisy. In the protein conformation problem, the matter is worse, because NMR spectroscopy only gives inaccurate distances between nearby atoms. This leads to a highly incomplete EDM with noise. Therefore, it is desirable to develop methods which can localize a cloud of points from an incomplete and noisy EDM.

Generally speaking, it is a difficult task to infer missing entries of an arbitrary matrix. However, some fundamental properties of the EDM make the search for solutions feasible. It was shown in [7] that the rank of an EDM is at most d+2, where d denotes the dimension of the space in which the nodes live. In other words, in most applications, including sensor network localization and the protein conformation problem, the rank of the EDM is at most five, as both sensors and atoms live in a three-dimensional space, although we may have thousands of sensors or atoms. By this remarkable rank property, we can solve the EDM completion problem via low-rank matrix completion approaches. Another important property of the EDM, given in [8], states that a necessary condition for a matrix D to be an EDM is that −(1/2)JDJ is positive semidefinite (PSD), where

 J := I - \frac{1}{n}\mathbf{1}\mathbf{1}^T \qquad (1)

is the geometric centering matrix, n denotes the number of sensors, 1 stands for the column vector with all ones, and I is the identity matrix. The PSD property opens another way to solve the EDM completion problem. Most approaches in the literature utilize at least one of these two properties to localize the positions of nodes.

There have been a number of approaches proposed to determine the coordinates of nodes from incomplete EDMs in the past several decades. These methods can be roughly put into three groups based on their core ideas. The first group mainly exploits the rank property of the EDM. It consists of algorithms that first estimate the missing distances by utilizing the rank property of the EDM and then use the classic MDS to find the positions from the reconstructed distance matrix. SVD-MDS [1] and OptSpace-MDS [9] are two examples of this class, where SVD-Reconstruct [1] and OptSpace [10] are employed for EDM completion, respectively. The algorithms in the second group formulate the localization problem as a non-convex optimization problem and then employ different relaxation schemes to solve it; an example of this type is relaxation to a semidefinite program [11, 12, 13, 14]. The last group is known as metric MDS [15, 16, 17]. The algorithms in this group do not try to complete the observed EDM, but directly estimate the coordinates of nodes from the incomplete EDM; thus most effort is devoted to finding suitable cost functions and fast optimization algorithms.

Among the above mentioned methods, SVD-MDS is shown to be simple and (to some extent) effective [1]. However, theoretical understanding of this method is far from satisfactory. In this paper, we lay a solid theoretical foundation for this approach. More precisely, we establish error bounds for the recovered EDM by SVD-Reconstruct in both expectation and tail forms. Based on these results, the error bound for recovered coordinates is derived. To show the optimality of the SVD-Reconstruct approach for EDM completion, we deduce the minimax lower bound of the zero-diagonal symmetric low-rank matrix completion problem using Fano’s method [18], and it reveals that when the noise level is low, SVD-Reconstruct is minimax suboptimal; when the noise level is high, SVD-Reconstruct can achieve the optimal rate up to a constant factor.

The remainder of the paper is organized as follows. In Section II, we formulate the problem and introduce the SVD-MDS approach. Section III is devoted to giving a performance analysis for the SVD-MDS method. Section IV presents the minimax lower bound of the zero-diagonal symmetric low-rank matrix completion problem. In Section V, we conclude the paper.

## II Localization via SVD-MDS

In this section, we formulate the localization problem and introduce the SVD-MDS approach.

### II-A Problem Formulation

Given a set of nodes x_1, …, x_n in d-dimensional space, the EDM of the x_i's, denoted by D, is defined as

 D_{ij} = \|x_i - x_j\|_2^2, \quad i,j = 1,\dots,n.

To formalize the process of sampling the entries of D, we suppose that there is a non-zero probability p_{ij} that the distance between nodes i and j is measured. For simplicity, we let all p_{ij}'s equal a constant p. The observations are given as the matrix Y whose entries are

 Y_{ij} = \begin{cases} D_{ij} + E_{ij} & \text{with probability } p, \\ ? & \text{with probability } 1-p, \end{cases}

where the ? means that the element is unknown and the E_{ij}'s capture the effect of measurement errors, which are commonly assumed to be independent Gaussian random variables with mean zero and variance ν². Putting this in matrix form yields

 Y = \Omega \odot (D + E), \qquad (2)

where ⊙ denotes the Hadamard product (i.e., point-wise matrix multiplication) and Ω is a symmetric mask matrix whose entries on or above the diagonal are independent Bernoulli random variables with parameter p, i.e.,

 \Omega_{ij} = \begin{cases} 1 & \text{with probability } p, \\ 0 & \text{with probability } 1-p, \end{cases} \quad \text{when } i \le j,

and E is a symmetric noise matrix whose entries on or above the diagonal are independent Gaussian random variables, i.e., E_{ij} ∼ N(0, ν²) when i ≤ j. The goal is to localize the cloud of points from Y.
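The sampling model (2) can be simulated in a few lines of numpy. The sketch below is illustrative; it stores unobserved entries as zeros rather than as the placeholder "?", and the function name `observe_edm` is an assumption.

```python
import numpy as np

def observe_edm(D, p, nu, rng):
    """Sample the model Y = Omega ⊙ (D + E): each distance is observed
    independently with probability p and corrupted by N(0, nu^2) noise.
    Both the mask Omega and the noise E are symmetric, as in (2)."""
    n = D.shape[0]
    upper = np.triu(rng.random((n, n)) < p)          # Bernoulli(p) for i <= j
    Omega = (upper | upper.T).astype(float)          # symmetric 0/1 mask
    E_upper = np.triu(rng.normal(0.0, nu, (n, n)))
    E = E_upper + np.triu(E_upper, 1).T              # symmetric Gaussian noise
    return Omega * (D + E), Omega
```

Note that only the entries on or above the diagonal are drawn independently; the lower triangle is filled by symmetry, exactly as in the definitions of Ω and E above.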

### II-B SVD-MDS Approach

The SVD-MDS approach is a two-stage method for localization. In the first stage, it uses SVD-Reconstruct to complete the EDM. This is done as follows. SVD-Reconstruct first constructs an unbiased estimator S of D with entries

 S_{ij} = \begin{cases} \dfrac{D_{ij} + E_{ij} - \gamma_{ij}(1-p)}{p} & \text{with probability } p, \\[4pt] \gamma_{ij} & \text{with probability } 1-p, \end{cases}

where γ_{ij} stands for the “best guess” for the unknown squared distance D_{ij}. Here, we always assume that γ_{ij} = 0.

The next step of SVD-Reconstruct is to obtain the best rank-r approximation ~D to S (r is the rank of D and is at most d+2). This can be done by taking the singular value decomposition (SVD) of S and keeping the r largest singular values and the corresponding singular vectors. The original SVD-Reconstruct approach simply returns ~D as an approximation of the true D. In order to use the classic MDS for localization, we take a symmetrized version of ~D as the estimate of D, i.e.,

 \hat{D} = \frac{1}{2}(\tilde{D} + \tilde{D}^T).

In the second stage, the classic MDS is employed to localize the nodes. The process is as follows. We first compute M := −(1/2) J \hat{D} J, where J is defined in (1), and then take the SVD of M, M = UΣVᵀ. Note that both J and \hat{D} are symmetric, thus M is symmetric and U = V. The classic MDS simply returns \hat{X} = Σ_d^{1/2} U_dᵀ as the estimated coordinate matrix, where U_d contains the singular vectors corresponding to the d largest singular values and Σ_d is the diagonal matrix with the d largest singular values on its diagonal.

The SVD-MDS approach is summarized in Algorithm 1.
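The two stages above can be sketched compactly in numpy. This is an illustration under stated assumptions, not the authors' reference implementation: the best guess γ_{ij} is taken to be zero, and unobserved entries of Y are stored as zeros.

```python
import numpy as np

def svd_mds(Y, Omega, p, r, d):
    """Two-stage SVD-MDS sketch: SVD-Reconstruct (with best guess
    gamma_ij = 0) followed by classic MDS. Y holds unobserved entries
    as zeros; Omega is the symmetric 0/1 observation mask."""
    n = Y.shape[0]
    # Stage 1: SVD-Reconstruct. S is an unbiased estimator of D.
    S = np.where(Omega > 0, Y / p, 0.0)
    U, s, Vt = np.linalg.svd(S)
    D_tilde = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]      # best rank-r approximation
    D_hat = 0.5 * (D_tilde + D_tilde.T)               # symmetrization step
    # Stage 2: classic MDS on the completed EDM.
    J = np.eye(n) - np.ones((n, n)) / n
    M = -0.5 * J @ D_hat @ J
    U2, s2, _ = np.linalg.svd(M)
    return np.diag(np.sqrt(s2[:d])) @ U2[:, :d].T     # estimated d x n coordinates
```

As a sanity check, with p = 1, no noise, and r = d + 2, the pipeline reduces to exact completion followed by classic MDS, so the recovered points reproduce the true pairwise distances.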

## III Performance Analysis

In this section, we present a detailed analysis for the SVD-MDS approach. We first establish the error bounds for EDM completion by SVD-Reconstruct in both expectation and tail forms, and then derive the error bound for coordinate recovery by MDS.

### III-A Expectation Error Bound for EDM Completion

In this subsection, we state the expectation version of the EDM completion error via SVD-Reconstruct. Before introducing our main result, we require the following incoherence condition [19]:

 D_{ij} \le \zeta \quad \text{for any } i,j \le n, \qquad (3)

which means that each entry of D is bounded by ζ. Our result shows that if the expected number of observed entries is large enough and the incoherence condition (3) is satisfied, then the average error per entry achieved by SVD-Reconstruct can be made arbitrarily smaller than ζ + ν.

###### Theorem 1 (Expectation form).

Consider the model described in (2). Let m denote the expected number of observed entries, i.e., m = pn². If m is sufficiently large and the incoherence condition (3) is satisfied, then

 \frac{1}{n}\mathbb{E}\|\hat{D} - D\|_F \le C\sqrt{\frac{rn}{m}}\,(\zeta + \nu), \qquad (4)

where C is an absolute constant (we use C and c to denote generic absolute constants, whose values may change from line to line) and ‖·‖_F denotes the Frobenius norm.

###### Remark 1.

Note that the left side of (4) measures the average error per entry of \hat{D}:

 \frac{1}{n}\|\hat{D} - D\|_F = \left(\frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}|\hat{D}_{ij} - D_{ij}|^2\right)^{1/2}.

Thus, if m is large enough, then the average error per entry can be made arbitrarily smaller than ζ + ν.

###### Remark 2.

When the noise level is high, i.e., ν ≥ ζ, the error bound becomes

 \frac{1}{n}\mathbb{E}\|\hat{D} - D\|_F \le C\sqrt{\frac{rn}{m}}\,\nu.

This bound, as we will see later (Theorem 4), is minimax optimal up to a constant factor. However, when the noise level is low, i.e., ν < ζ, we have

 \frac{1}{n}\mathbb{E}\|\hat{D} - D\|_F \le C\sqrt{\frac{rn}{m}}\,\zeta,

which implies that the bound is minimax suboptimal in this case.

###### Remark 3.

Our proof of Theorem 1 is motivated by [19], where the authors presented a performance analysis for (2) under the assumption that both Ω and E have independent and identically distributed (i.i.d.) entries. However, in the localization problem (2), both Ω and E are symmetric random matrices, which makes the analysis more difficult. Although our result (Theorem 1) has a similar form to that in [19], the constants may be different.

The proof of Theorem 1 makes use of some results from random matrix theory. For convenience, we include them in Appendix A.

###### Proof of Theorem 1.

Note first that the error of \hat{D} can be bounded by the error of ~D. Indeed,

 \mathbb{E}\|\hat{D}-D\|_F = \mathbb{E}\left\|\tfrac{1}{2}(\tilde{D}+\tilde{D}^T)-D\right\|_F = \tfrac{1}{2}\,\mathbb{E}\|(\tilde{D}-D)+(\tilde{D}^T-D^T)\|_F \le \tfrac{1}{2}\left(\mathbb{E}\|\tilde{D}-D\|_F + \mathbb{E}\|\tilde{D}^T-D^T\|_F\right) = \mathbb{E}\|\tilde{D}-D\|_F.

The first equality comes from the definition of \hat{D}. The second equality is based on the fact that D is a symmetric matrix. The next inequality follows from the triangle inequality. The last equality uses the fact that the Frobenius norm of any matrix equals the Frobenius norm of its transpose. Since both ~D and D have rank r, ~D − D has rank at most 2r. Then we have

 \|\tilde{D}-D\|_F \le \sqrt{2r}\,\|\tilde{D}-D\|,

where ‖·‖ denotes the spectral norm, i.e., the largest singular value. It follows from the triangle inequality that

 \|\tilde{D}-D\| \le \|\tilde{D}-p^{-1}\Omega\odot(D+E)\| + \|p^{-1}\Omega\odot(D+E)-D\| \le 2p^{-1}\|\Omega\odot(D+E)-pD\| \le 2p^{-1}\|\Omega\odot D-pD\| + 2p^{-1}\|\Omega\odot E\|.

The second inequality holds because ~D is the best rank-r approximation to p^{-1}Ω⊙(D+E). Therefore, it suffices to bound E‖Ω⊙D − pD‖ and E‖Ω⊙E‖.

To bound E‖Ω⊙D − pD‖, let Ω = Ω_u + Ω_l, where Ω_u is the upper-triangular matrix containing the entries of Ω on or above the diagonal and Ω_l is the lower-triangular matrix containing the entries of Ω below the diagonal. Let Ω′ be a random matrix (independent of Ω and E) with i.i.d. entries satisfying the following distribution: Ω′_{ij} = 1 with probability p and Ω′_{ij} = 0 with probability 1−p. Similar to Ω_u and Ω_l, we also define Ω′_u, Ω′_l, D_u, and D_l. Then, E‖Ω⊙D − pD‖ can be bounded as follows:

 \mathbb{E}\|\Omega\odot D - pD\|
 = \mathbb{E}\|(\Omega_u\odot D - pD_u) + (\Omega_l\odot D - pD_l)\|
 \le \mathbb{E}\|\Omega_u\odot D - pD_u\| + \mathbb{E}\|\Omega_l\odot D - pD_l\|
 = \mathbb{E}\|\Omega'_u\odot D - pD_u\| + \mathbb{E}\|\Omega'_l\odot D - pD_l\|
 = \mathbb{E}\|(\Omega'_u\odot D - pD_u) + \mathbb{E}(\Omega'_l\odot D - pD_l)\| + \mathbb{E}\|(\Omega'_l\odot D - pD_l) + \mathbb{E}(\Omega'_u\odot D - pD_u)\|
 \le \mathbb{E}\|(\Omega'_u\odot D - pD_u) + (\Omega'_l\odot D - pD_l)\| + \mathbb{E}\|(\Omega'_l\odot D - pD_l) + (\Omega'_u\odot D - pD_u)\|
 = 2\,\mathbb{E}\|\Omega'\odot D - pD\|.

The first inequality is a consequence of the triangle inequality. The second equality holds because Ω′_u and Ω′_l share the same distribution with Ω_u and Ω_l, respectively, so the expectations are equal. The third equality holds because the inserted terms E(Ω′_u⊙D − pD_u) and E(Ω′_l⊙D − pD_l) are zero. The second inequality results from Jensen's inequality. Since now the mask matrix Ω′ has i.i.d. entries, we can bound E‖Ω′⊙D − pD‖ using standard tools from random matrix theory. We proceed by using the symmetrization technique (Lemma 1) first and then the contraction principle (Lemma 2):

 \mathbb{E}\|\Omega'\odot D - pD\| \le 2\,\mathbb{E}\|R\odot\Omega'\odot D\| \le 2\zeta\,\mathbb{E}\|R\odot\Omega'\|,

where R is the matrix whose entries are independent symmetric Bernoulli random variables (i.e., +1 or −1 with probability 1/2). It is clear that the random matrix R⊙Ω′ has i.i.d. entries. Therefore, E‖R⊙Ω′‖ can be bounded by Seginer's theorem (Lemma 3):

 \mathbb{E}\|R\odot\Omega'\| \le C_1\left(\mathbb{E}\max_i\|(R\odot\Omega')_{i\cdot}\|_2 + \mathbb{E}\max_j\|(R\odot\Omega')_{\cdot j}\|_2\right),

where (R⊙Ω′)_{i·} and (R⊙Ω′)_{·j} denote the i-th row and the j-th column of R⊙Ω′, respectively, and C₁ is an absolute constant.

It is not hard to see that ‖(R⊙Ω′)_{i·}‖₂² follows the binomial distribution with parameters (n, p). By Jensen's inequality and Lemma 4, we have

 \mathbb{E}\max_{1\le i\le n}\|(R\odot\Omega')_{i\cdot}\|_2 \le \left(\mathbb{E}\max_{1\le i\le n}\|(R\odot\Omega')_{i\cdot}\|_2^2\right)^{1/2} \le c\sqrt{np}.

Similarly,

 \mathbb{E}\max_{1\le j\le n}\|(R\odot\Omega')_{\cdot j}\|_2 \le c\sqrt{np}.

Therefore,

 \mathbb{E}\|R\odot\Omega'\| \le C\sqrt{np},

where C is an absolute constant. Thus, the first term can be bounded by

 \mathbb{E}\|\Omega\odot D - pD\| \le C\zeta\sqrt{np}.

The second term can be bounded similarly. Define E_u, E_l, E′, E′_u, and E′_l in the same way as Ω_u, Ω_l, Ω′, Ω′_u, and Ω′_l. Then we have

 \mathbb{E}\|\Omega\odot E\| \le \mathbb{E}\|\Omega'_u\odot E'_u\| + \mathbb{E}\|\Omega'_l\odot E'_l\| \le 2\,\mathbb{E}\|\Omega'\odot E'\|.

The first inequality comes from the triangle inequality, and the second from Jensen's inequality. Since Ω′⊙E′ has i.i.d. entries, Seginer's theorem gives

 \mathbb{E}\|\Omega'\odot E'\| \le C_1\left(\mathbb{E}\max_i\|(\Omega'\odot E')_{i\cdot}\|_2 + \mathbb{E}\max_j\|(\Omega'\odot E')_{\cdot j}\|_2\right).

Here, E max_i ‖(Ω′⊙E′)_{i·}‖₂ can be bounded by the second part of Lemma 4:

 \mathbb{E}\max_i\|(\Omega'\odot E')_{i\cdot}\|_2 \le C'\nu\sqrt{np}.

The same argument holds for the columns. Thus, we conclude that

 \mathbb{E}\|\Omega\odot E\| \le C'\nu\sqrt{np}.

Putting all these together, we obtain the desired result:

 \frac{1}{n}\mathbb{E}\|\hat{D}-D\|_F \le C\sqrt{\frac{rn}{m}}\,(\zeta+\nu). \qquad \blacksquare

### III-B Tail Bound for EDM Completion

In this subsection, we derive the tail bound for the EDM completion error, which shows that the error probability decreases fast as the error increases.

###### Theorem 2 (Tail Form).

Consider the model described in (2). If the incoherence condition (3) is satisfied, then for any t > 0,

 P\{\|\hat{D}-D\|_F \ge t\} \le n\exp\left[-c\cdot\min\left(\frac{mt^2}{n^3 r(\zeta+\nu)^2},\ \frac{mt}{n^2\sqrt{r}\,(\zeta+\nu)}\right)\right], \qquad (5)

where c is an absolute constant.

###### Proof.

We first proceed similarly as in the proof of Theorem 1:

 \|\hat{D}-D\|_F \le \sqrt{2r}\,\|\tilde{D}-D\| \le 2\sqrt{2r}\,p^{-1}\|\Omega\odot(D+E)-pD\|.

So it suffices to bound the tail of ‖Ω⊙(D+E) − pD‖. Let A := Ω⊙(D+E) − pD, and let A_{ij} denote the entry of A in the i-th row and j-th column. We define the following matrices using A: when i < j,

 Z_{ij} = A_{ij}(e_i e_j^T + e_j e_i^T),

and when i = j,

 Z_{ii} = A_{ii}\,e_i e_i^T,

where e_i is the standard basis vector with a one in position i and zeros elsewhere. Clearly, {Z_{ij}}_{i≤j} is a sequence of centered, independent, self-adjoint random matrices. Moreover, for k ≥ 2, when i < j, we have

 \mathbb{E}Z_{ij}^k
 = \mathbb{E}A_{ij}^k\,(e_i e_j^T + e_j e_i^T)^k
 \preceq |\mathbb{E}A_{ij}^k|\cdot(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T)
 \preceq \mathbb{E}\left[p\cdot|(1-p)D_{ij}+E_{ij}|^k + (1-p)\cdot|-pD_{ij}|^k\right]\cdot(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T)
 \preceq p\cdot\mathbb{E}\left[|(1-p)D_{ij}+E_{ij}|^k + |D_{ij}|^k\right]\cdot(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T),

where B ⪯ C means that C − B is positive semidefinite. The first inequality holds because the powers of e_i e_j^T + e_j e_i^T are periodic in k, and we can verify the inequality by a direct calculation. The second inequality results from Jensen's inequality and the fact that the matrix 2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T is positive semidefinite. The third inequality again uses the fact that this matrix is positive semidefinite. Note that E_{ij} is a normal variable N(0, ν²), and hence it is sub-exponential, with its ψ₁-norm bounded by c₁ν. Then the ψ₁-norm of (1−p)D_{ij} + E_{ij} can be bounded by the triangle inequality:

 \|(1-p)D_{ij}+E_{ij}\|_{\psi_1} \le \|(1-p)D_{ij}\|_{\psi_1} + \|E_{ij}\|_{\psi_1} \le c_2(\zeta+\nu).

Now, we can calculate the moments of (1−p)D_{ij} + E_{ij} using the integral identity (Lemma 5) and the tail property of sub-exponential random variables:

 \mathbb{E}|(1-p)D_{ij}+E_{ij}|^k
 = \int_0^\infty P\{|(1-p)D_{ij}+E_{ij}|^k \ge u\}\,du
 = \int_0^\infty P\{|(1-p)D_{ij}+E_{ij}| \ge t\}\,k t^{k-1}\,dt
 \le \int_0^\infty \exp(1-t/K_1)\,k t^{k-1}\,dt
 = \int_0^\infty \exp(1-s)\,k K_1^k s^{k-1}\,ds
 = e\,k\,K_1^k\,\Gamma(k) = e\,K_1^k\,k!,

where K₁ := ‖(1−p)D_{ij}+E_{ij}‖_{ψ₁} ≤ c₂(ζ+ν) and c₂ is an absolute constant. It follows that when i < j we have

 \mathbb{E}Z_{ij}^k \preceq p\,(e K_1^k k! + \zeta^k)\,(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T) \preceq e\,p\,[c_4(\zeta+\nu)]^k\,k!\,(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T),

where c₄ is an absolute constant. Similarly, when i = j, we have

 \mathbb{E}Z_{ii}^k = \mathbb{E}A_{ii}^k\,(e_i e_i^T)^k \preceq e\,p\,[c_4(\zeta+\nu)]^k\,k!\,e_i e_i^T \preceq e\,p\,[c_4(\zeta+\nu)]^k\,k!\,(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T).

Putting them together, we obtain that for all i ≤ j and k ≥ 2,

 \mathbb{E}Z_{ij}^k \preceq e\,p\,[c_4(\zeta+\nu)]^k\,k!\,(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T).

The above inequality implies that the sequence {Z_{ij}}_{i≤j} satisfies the condition of the matrix Bernstein inequality (Lemma 15). Moreover, we have

 R = c_4(\zeta+\nu),

and

 \sigma^2 = \left\|\sum_{i\le j} 2ep\,[c_4(\zeta+\nu)]^2\,(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T)\right\|
 = 2ep\,[c_4(\zeta+\nu)]^2\left\|\sum_{i\le j}(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T)\right\|
 \le 2ep\,[c_4(\zeta+\nu)]^2\left\|\sum_{1\le i,j\le n}(2e_i e_i^T + 2e_j e_j^T + e_i e_j^T + e_j e_i^T)\right\|
 \le 2ep\,[c_4(\zeta+\nu)]^2\cdot 6n = c_5\,np\,(\zeta+\nu)^2.

Thus, applying the matrix Bernstein inequality (Lemma 15) to \sum_{i\le j}Z_{ij}, we obtain that for any u > 0,

 P\left\{\left\|\sum_{i\le j}Z_{ij}\right\|\ge u\right\} \le 2n\exp\left[-c\cdot\min\left(\frac{u^2}{np(\zeta+\nu)^2},\ \frac{u}{\zeta+\nu}\right)\right].

Letting u = pt/(2\sqrt{2r}), we have

 P\left\{2\sqrt{2r}\,p^{-1}\left\|\sum_{i\le j}Z_{ij}\right\|\le t\right\} \ge 1 - 2n\exp\left[-c\cdot\min\left(\frac{pt^2}{rn(\zeta+\nu)^2},\ \frac{pt}{\sqrt{r}\,(\zeta+\nu)}\right)\right].

Recalling that \|\hat{D}-D\|_F \le 2\sqrt{2r}\,p^{-1}\|\sum_{i\le j}Z_{ij}\|, we have

 P\{\|\hat{D}-D\|_F \ge t\}
 \le P\left\{2\sqrt{2r}\,p^{-1}\left\|\sum_{i\le j}Z_{ij}\right\|\ge t\right\}
 \le 2n\exp\left[-c\cdot\min\left(\frac{pt^2}{rn(\zeta+\nu)^2},\ \frac{pt}{\sqrt{r}\,(\zeta+\nu)}\right)\right]
 = 2n\exp\left[-c\cdot\min\left(\frac{mt^2}{rn^3(\zeta+\nu)^2},\ \frac{mt}{n^2\sqrt{r}\,(\zeta+\nu)}\right)\right].

The last equality comes from the definition of m. This completes the proof. ∎

### III-C Error Bound for Gram Matrices after MDS

Once we have established the error bounds for EDM completion, it is convenient to utilize them to derive a bound for the coordinate recovery error. We adopt the following metric to measure the reconstruction error [20]:

 \mathrm{dist}(X,\hat{X}) = \frac{1}{n}\|JX^TXJ - J\hat{X}^T\hat{X}J\|_F, \qquad (6)

where X is the true coordinate matrix having each sensor coordinate as a column, \hat{X} is the estimated coordinate matrix having each estimated sensor coordinate as a column, and J is the geometric centering matrix defined in (1). Notice that the distances have lost some information such as orientation, since rigid transformations (rotations, reflections, translations, and their combinations) do not change the pairwise distances. We choose (6) as the metric of recovery error because it has the following properties: (a) it is invariant under rigid transformations; (b) \mathrm{dist}(X,\hat{X}) = 0 implies that X and \hat{X} are equivalent up to an unknown rigid transformation. Then we have the following result.
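For concreteness, the metric (6) is cheap to compute directly; the snippet below is a small sketch (the function name `dist_metric` is an assumption), and the check at the end illustrates property (a): applying an arbitrary rotation and translation leaves the metric at zero.

```python
import numpy as np

def dist_metric(X, X_hat):
    """Reconstruction error (6): (1/n)||J X^T X J - J Xh^T Xh J||_F.
    Invariant under rigid transformations of either point cloud."""
    n = X.shape[1]
    J = np.eye(n) - np.ones((n, n)) / n          # geometric centering matrix
    G, G_hat = X.T @ X, X_hat.T @ X_hat
    return np.linalg.norm(J @ G @ J - J @ G_hat @ J, "fro") / n

# Invariance check: rotate and translate a point cloud; (6) stays zero,
# because J annihilates the translation terms and Q^T Q = I.
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 12))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix
t = rng.standard_normal((3, 1))
assert dist_metric(X, Q @ X + t) < 1e-10
```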

###### Theorem 3 (Coordinate recovery error).

Let \hat{X} be the location matrix estimated by SVD-MDS. Then the reconstruction error has the following upper bound:

 \mathbb{E}\,\mathrm{dist}(X,\hat{X}) \le C\sqrt{\frac{dn}{m}}\,(\zeta+\nu), \qquad (7)

where d denotes the dimension of the space in which the nodes live and C is an absolute constant as in Theorem 1.

###### Remark 4.

We mention that our Theorems 1–3 also hold when the noise is sub-Gaussian, i.e., when the entries on or above the diagonal of E are independent, identically distributed sub-Gaussian random variables with sub-Gaussian norm ν. The proof techniques are essentially the same.

###### Proof.

The proof of Theorem 3 is motivated by [20]. By the definition of \mathrm{dist}(X,\hat{X}), we have

 \mathrm{dist}(X,\hat{X}) = \frac{1}{n}\|J(X^TX - \hat{X}^T\hat{X})J\|_F \le \frac{\sqrt{2d}}{n}\|J(X^TX - \hat{X}^T\hat{X})J\|,

where we have used the fact that the matrix J(X^TX − \hat{X}^T\hat{X})J has rank at most 2d. Let M := −(1/2)J\hat{D}J. The spectral norm can be bounded by the triangle inequality as follows:

 \|J(X^TX - \hat{X}^T\hat{X})J\| \le \|JX^TXJ - M\| + \|M - J\hat{X}^T\hat{X}J\|.

Recall that the EDM has the following property [20]:

 -\frac{1}{2}JDJ = JX^TXJ.

Thus, the first term can be written as

 \|JX^TXJ - M\| = \left\|-\tfrac{1}{2}JDJ + \tfrac{1}{2}J\hat{D}J\right\| = \tfrac{1}{2}\|J(\hat{D}-D)J\|.

By the submultiplicativity of the spectral norm and the fact that \|J\| = 1, the right-hand side is bounded by

 \tfrac{1}{2}\|J(\hat{D}-D)J\| \le \tfrac{1}{2}\|J\|\,\|\hat{D}-D\|\,\|J\| = \tfrac{1}{2}\|\hat{D}-D\|.

To bound the second term, note that \hat{X}^T\hat{X} is the best rank-d approximation to M. Thus, for any rank-d matrix A, we have

 \|M - \hat{X}^T\hat{X}\| \le \|M - A\|.

Now the second term can be bounded as follows:

 \|M - J\hat{X}^T\hat{X}J\| = \|M - \hat{X}^T\hat{X}\| \le \left\|M + \tfrac{1}{2}JDJ\right\| = \tfrac{1}{2}\|J(\hat{D}-D)J\| \le \tfrac{1}{2}\|D-\hat{D}\|,

where the first equality uses the fact that J\hat{X}^T\hat{X}J = \hat{X}^T\hat{X} (see [20, p. 3]), the first inequality comes from letting A = -\tfrac{1}{2}JDJ, and the last inequality uses again the submultiplicativity of the spectral norm and \|J\| = 1. Combining these two terms, we have

 \|JX^TXJ - J\hat{X}^T\hat{X}J\| \le \|D - \hat{D}\|.

The conclusion follows by a simple application of Theorem 1:

 \mathbb{E}\,\mathrm{dist}(X,\hat{X}) \le \frac{\sqrt{2d}}{n}\,\mathbb{E}\|D-\hat{D}\| \le C\sqrt{\frac{dn}{m}}\,(\zeta+\nu). \qquad \blacksquare

### III-D Comparison and Discussion

In this subsection, we compare our results with related works in the literature. First, we compare our results with those established in [1] for the SVD-Reconstruct approach, and show that our error bound has a faster decay rate than that in [1]. Next, we compare our Theorem 2 with the results in [14], where the authors established a high-probability error bound for a semidefinite programming approach, and conclude that the two results have the same order of error rate.

#### III-D1 Comparison with [1]

We begin by comparing our results (Theorem 2) with that in [1]. For convenience, we restate the results in [1] as the following proposition:

###### Proposition 1 (Theorem 2, [1]).

Let ~D be the distance matrix estimated by SVD-Reconstruct without the symmetrization step. Then, with probability at least 1 − 1/(2n),

 \|D-\tilde{D}\|_F \le 12\,\sigma_S\sqrt{2n} + 8\sqrt{\sigma_S\sqrt{2n}\,\|D\|_F}, \qquad (8)

where \sigma_S^2 denotes an upper bound for the variance of the entries of S, and is bounded by

 \sigma_S^2 \le \frac{2}{p}\max_{i,j}\left(D_{ij}^2 + \nu^2\right). \qquad (9)

Note that in (8) there is no failure probability parameter appearing because the authors fixed it to 1/(2n), and in (9) there is no γ_{ij} as it is assumed that γ_{ij} = 0 in this paper.

To compare with our bound, substituting (9) into (8) and noting that \|D\|_F \le n\zeta, we see that (8) implies

 \frac{1}{n}\|D-\tilde{D}\|_F \le C_1\frac{\zeta+\nu}{\sqrt{np}} + C_2\frac{\zeta+\nu}{\sqrt[4]{np}} \le C_3\frac{\zeta+\nu}{\sqrt[4]{np}}.

The last inequality holds because \sqrt{np} \ge \sqrt[4]{np} when np is large enough (e.g., np \ge 1). As a result, Proposition 1 implies

 P\left\{\frac{1}{n}\|\hat{D}-D\|_F \le C_3\frac{\zeta+\nu}{\sqrt[4]{np}}\right\} \ge 1 - \frac{1}{2n}. \qquad (10)

In our Theorem 2, when t \le n\sqrt{r}(\zeta+\nu), the tail bound becomes

 P\{\|\hat{D}-D\|_F \ge t\} \le n\exp\left[-\frac{cmt^2}{n^3 r(\zeta+\nu)^2}\right].

Assume that m \ge Cn\log n. Then, choosing

 t = (\zeta+\nu)\,n\sqrt{r}\cdot\sqrt{\frac{Cn\log n}{m}} \le (\zeta+\nu)\,n\sqrt{r},

we obtain

 P\left\{\frac{1}{n}\|\hat{D}-D\|_F \ge (\zeta+\nu)\sqrt{\frac{Crn\log n}{m}}\right\} \le n\exp(-cC\log n)