
# Model reduction for transport-dominated problems via online adaptive bases and adaptive sampling

This work presents a model reduction approach for problems with coherent structures that propagate over time such as convection-dominated flows and wave-type phenomena. Traditional model reduction methods have difficulties with these transport-dominated problems because propagating coherent structures typically introduce high-dimensional features that require high-dimensional approximation spaces. The approach proposed in this work exploits the locality in space and time of propagating coherent structures to derive efficient reduced models. First, full-model solutions are approximated locally in time via local reduced spaces that are adapted with basis updates during time stepping. The basis updates are derived from querying the full model at a few selected spatial coordinates. Second, the locality in space of the coherent structures is exploited via an adaptive sampling scheme that selects at which components to query the full model for computing the basis updates. Our analysis shows that, in probability, the more local the coherent structure is in space, the fewer full-model samples are required to adapt the reduced basis with the proposed adaptive sampling scheme. Numerical results on benchmark examples with interacting wave-type structures and time-varying transport speeds and on a model combustor of a single-element rocket engine demonstrate the wide applicability of our approach and the significant runtime speedups compared to full models and traditional reduced models.

07/26/2020


## 1 Introduction

Model reduction constructs reduced models of large-scale systems of equations in a one-time high-cost offline phase and then uses the reduced models in an online phase to repeatedly compute accurate approximations of the full-model solutions with significantly reduced costs. In projection-based model reduction [46, 2, 6], a low-dimensional space is constructed that approximates well the high-dimensional solution space of the full model. Then, the full-model equations are projected onto the low-dimensional space and reduced solutions are computed by solving the projected equations. However, solutions of full models that describe transport-dominated behavior, e.g., convection-dominated flows, wave-type phenomena, shock propagation, typically exhibit high-dimensional features, which means that no low-dimensional space exists in which the full-model solutions can be approximated well; the Kolmogorov n-width is high. Thus, traditional model reduction methods typically fail for transport-dominated problems [34, 14, 51]. In this work, we exploit that transport-dominated problems typically have a rich structure that is local in nature, which we leverage to derive efficient reduced models.

Model reduction projects the full-model equations onto a low-dimensional (reduced) space that is spanned by a set of basis vectors. There are many methods for constructing reduced spaces, including proper orthogonal decomposition (POD) [7, 49], balanced truncation [29, 30], the reduced basis method [41, 53, 47, 24, 46], and interpolatory model reduction methods [23, 2]. In the following, we are mostly concerned with full-model equations that are nonlinear in the states, where projecting the full-model equations onto a reduced space is typically insufficient to obtain a reduced model that is faster to solve than the full model, because the nonlinear terms entail computations that scale with the dimension of the full-model solution space. One remedy is empirical interpolation, which approximates nonlinear terms by evaluating them at a few carefully selected interpolation points and approximating all other components via interpolation in a low-dimensional space [5, 22, 16]. We build on the discrete empirical interpolation method (DEIM) [13, 15], which is the discrete counterpart of empirical interpolation. Note that there are other techniques for nonlinear model reduction, e.g., missing point estimation [3] and the Gauss-Newton with approximated tensors (GNAT) method [12]. All these methods build on the assumption that there is a low-dimensional space in which the full-model solutions can be approximated well, which is violated in the case of transport-dominated problems. However, the solutions of transport-dominated problems typically are low dimensional if considered locally in time, which we exploit by approximating the full-model solutions in local low-dimensional spaces that are adapted via low-rank basis updates over time [39, 62]. The basis updates are derived by querying the full model at selected points in the spatial domain. We derive an adaptive sampling scheme that selects where to query the full model to compute the basis updates. Our analysis shows that if the reduced-model residual is local in the spatial domain, which we observe for transport-dominated problems, then only a few sampling points are necessary to adapt the local spaces with our adaptive sampling scheme, which makes our model reduction approach computationally efficient.

For reviewing the literature on model reduction for transport-dominated problems, we broadly distinguish between two lines of research. Literature that we categorize into the first line of research aims to transform the full model to recover a low-rank structure that can then be exploited with a reduced space. The work on transformations in the model reduction community seems to have originated from [34], where the transport dynamics are separated from the rest of the dynamics via freezing. In [44, 43], transport maps are constructed that reverse the transport and so recover a low-rank structure. Approaches have been developed that aim to find transformations numerically via, e.g., optimization [8]. Shifted POD [42] recovers the shift due to the transport and applies POD after having reversed the shift. The shift operator introduced in [42] is time dependent and separates different transports to be applicable to, e.g., multiple waves traveling at different wave speeds. In [56, 57], snapshots are transformed to better align them before the reduced bases are constructed. The second line of research on model reduction for transport-dominated problems constructs low-dimensional spaces that explicitly target transport-dominated problems. The work [14] formulates the greedy basis construction methods, developed in the reduced basis community [41, 53], in special norms that are better suited for transport-dominated problems. The work [1] constructs the basis via optimization and the authors of [52, 59] consider time-space discretizations. There is related literature on closure modeling [55, 18, 35, 36], where methods have been developed that construct bases with respect to norms other than those of the classical $L^2$-optimal POD bases [25, 45]. Then, there are approaches that exploit local structure, such as the approach introduced in [51] that builds on the spatial locality of shock fronts.
The work by Carlberg [11] builds on a special type of adaptation that enriches the reduced space during the online phase if an error indicator signals a high error. The dimension of the reduced space grows with each adaptation step, which means that the reduced model becomes computationally more expensive to solve over time. In [21], reduced bases are adapted over time via an auxiliary equation that describes the dynamics of the bases. Under certain conditions, the approach can be seen as an approximation of the Lax pairs discussed in [26]. Dynamic low-rank approximation [27, 48, 33, 31, 32] adapts bases in time similar to our approach; however, the bases in dynamic low-rank approximation approaches are constructed so that they are valid for all parameters in a given parameter domain, which is problematic if the solution manifold contains high-dimensional features because the propagating coherent structure depends on parameters, as in, e.g., [51, Example 2.5]. In contrast, our approach is local in the parameter domain, in addition to being local in time and space. Furthermore, locality in the parameter domain allows our approach to derive basis updates from only a few samples of the full model.

This work is organized as follows. Section 2 reviews model reduction with empirical interpolation and gives the problem formulation. Section 3 demonstrates the local structure in transport-dominated problems and proposes an approach to adaptively sample the full model to construct basis updates. Numerical results in Section 4 on benchmark examples and a model combustor of a single-element rocket engine demonstrate that our approach achieves significant speedups and is applicable to a wide range of problems. In particular, the numerical results indicate that our approach faithfully approximates interactions between propagating coherent structures traveling at different speeds. Concluding remarks are provided in Section 5.

## 2 Preliminaries

We briefly discuss model reduction with empirical interpolation in Section 2.1 and then demonstrate on a toy example in Section 2.2 why traditional model reduction methods fail for problems exhibiting transport-dominated phenomena.

### 2.1 Model reduction with empirical interpolation

Consider the system of discretized equations that is obtained after discretizing a partial differential equation (PDE) in space and time,

$$q_k(\mu) = f(q_{k-1}(\mu); \mu)\,, \qquad k = 1, \dots, K, \tag{1}$$

where $q_k(\mu) \in \mathbb{R}^N$ is the $N$-dimensional state at time step $k$ and parameter $\mu \in \mathcal{D}$, with parameter domain $\mathcal{D}$. The number of time steps is $K \in \mathbb{N}$. The function $f$ describes the operators of the discretized PDE and typically is nonlinear in the state. The time discretization is implicit in time, which means that at each time step $k$ a potentially nonlinear, large-scale system of equations has to be solved, e.g., with Newton's method; the map $f$ in (1) represents the solution of one such implicit time step.

To derive a reduced model of the full model (1) with empirical interpolation [5, 13, 15], consider the trajectory at parameter $\mu$,

$$Q(\mu) = [q_1(\mu), \dots, q_K(\mu)] \in \mathbb{R}^{N \times K},$$

which is the matrix with the states as columns. Let the columns of $U \in \mathbb{R}^{N \times n}$ be the POD basis of dimension $n$ obtained from the snapshot matrix

$$Q = [Q(\mu_1), \dots, Q(\mu_M)] \in \mathbb{R}^{N \times MK},$$

with parameters $\mu_1, \dots, \mu_M \in \mathcal{D}$. The space spanned by the columns of $U$ is denoted as $\mathcal{U}$ and is a subspace of $\mathbb{R}^N$. The critical assumption of traditional model reduction is that the singular values of $Q$ decay fast so that only a few basis vectors are necessary to approximate well the columns of $Q$ in the corresponding space $\mathcal{U}$. Following QDEIM, introduced in [15], select the interpolation points $p_1, \dots, p_n \in \{1, \dots, N\}$ and define the corresponding interpolation points matrix $P = [e_{p_1}, \dots, e_{p_n}] \in \mathbb{R}^{N \times n}$, where $e_i$ is the $i$-th canonical unit vector with entry 1 at the $i$-th component and entry 0 at all other components. Note that all of the following discussion can be extended in a straightforward way to oversampled empirical interpolation (ODEIM) [38]. Define

$$\tilde f(q(\mu); \mu) = (P^T U)^{-1} P^T f(q(\mu); \mu),$$

so that $U \tilde f(q(\mu); \mu)$ is the DEIM approximation of $f(q(\mu); \mu)$ at state $q(\mu)$ and parameter $\mu$. Note that computing $\tilde f(q(\mu); \mu)$ typically requires evaluating $f$ at the interpolation points only, see [5, 13, 15]. We overload the notation in the following so that if we have a reduced state $\tilde q(\mu) \in \mathbb{R}^n$, then $\tilde f(\tilde q(\mu); \mu) = (P^T U)^{-1} P^T f(U \tilde q(\mu); \mu)$. The reduced model corresponding to (1) is

$$\tilde q_k(\mu) = \tilde f(\tilde q_{k-1}(\mu); \mu)\,, \qquad k = 1, \dots, K, \tag{2}$$

with the reduced trajectory $\tilde Q(\mu) = [\tilde q_1(\mu), \dots, \tilde q_K(\mu)] \in \mathbb{R}^{n \times K}$. Note that we approximate the state and the nonlinear function in the same space $\mathcal{U}$, which is in contrast to the original use of DEIM in [13] and similar to model reduction via missing point estimation [3]. Once a reduced model (2) is constructed in the offline phase, it is solved in the online phase. The one-time high costs of constructing the reduced model are compensated by approximating the full-model solutions with reduced-model solutions for a large number of parameters online, see, e.g., [46, 2, 6, 40] for details on the wide range of outer-loop and many-query applications where such an offline/online splitting is beneficial.
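
The POD basis construction and the interpolation-point-based approximation described above can be sketched compactly. The snippet below is illustrative only; for simplicity it uses the greedy point selection of the original DEIM [13] rather than QDEIM, and the function names are ours:

```python
import numpy as np

def pod_basis(Q, n):
    # POD basis: first n left singular vectors of the snapshot matrix
    U, _, _ = np.linalg.svd(Q, full_matrices=False)
    return U[:, :n]

def deim_points(U):
    # Greedy interpolation-point selection as in the original DEIM [13]:
    # each new point is where the current interpolation residual is largest.
    n = U.shape[1]
    pts = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, n):
        c = np.linalg.solve(U[pts, :j], U[pts, j])
        r = U[:, j] - U[:, :j] @ c          # residual of interpolating column j
        pts.append(int(np.argmax(np.abs(r))))
    return np.array(pts)

def deim_apply(U, pts, f_at_pts):
    # DEIM coefficients (P^T U)^{-1} P^T f from samples of f at the points;
    # the approximation of f in the full space is U @ deim_apply(...)
    return np.linalg.solve(U[pts, :], f_at_pts)
```

If $f$ lies exactly in the span of the basis, the DEIM approximation reproduces it exactly; in general, only the few sampled components of $f$ have to be evaluated.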

### 2.2 Problem formulation

It has been observed that states of problems with transport-dominated behavior can require DEIM (reduced) spaces of high dimension, see, e.g., [34, 14, 51]. The following toy example illustrates this by demonstrating the slow decay of the singular values of the snapshot matrix of solutions of the advection equation. Let $\Omega \subset \mathbb{R}$ be the spatial domain and consider the advection equation

$$\partial_t q(x, t; \mu) + \mu \partial_x q(x, t; \mu) = 0\,, \qquad x \in \Omega, \tag{3}$$

with periodic boundary conditions and time $t > 0$. Set the initial condition to

$$q(x, 0; \mu) = \frac{1}{\sqrt{0.0002\pi}} \exp\left(-\frac{x^2}{0.0002}\right),$$

which is the probability density function of a normal random variable with standard deviation 0.01 and mean 0, see Figure 1a. Discretize (3) with a second-order upwind scheme in the spatial domain. The singular values of the trajectory $Q(\mu)$ are plotted in Figure 1b. According to the decay of the singular values, a DEIM space of high dimension is necessary to approximate the trajectory with a low projection error in the Euclidean norm, which is a rather slow decay compared to the fast decay observed in many other problems [6]. This example demonstrates that efficient reduced models of transport-dominated problems will have to exploit structure beyond the classical low-rank structure that traditional model reduction methods rely on.
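
The slow decay can be reproduced with a small, self-contained experiment. The snippet below is illustrative only: it transports the Gaussian by exact periodic shifts rather than the upwind discretization of the toy example, and the grid size, snapshot count, and tolerance are our own choices:

```python
import numpy as np

# Snapshot matrix of a narrow Gaussian pulse transported across a periodic
# domain; exact shifts stand in for the advection dynamics of (3).
N, K, sigma = 512, 200, 0.01
x = np.linspace(0.0, 1.0, N, endpoint=False)

def pulse(center):
    d = (x - center + 0.5) % 1.0 - 0.5   # periodic distance to the pulse center
    return np.exp(-d**2 / (2 * sigma**2))

Q = np.stack([pulse(k / K) for k in range(K)], axis=1)
s = np.linalg.svd(Q, compute_uv=False)
# effective numerical rank at a relative tolerance of 1e-4
rank_eps = int(np.sum(s > 1e-4 * s[0]))
```

The effective rank is far larger than the handful of modes that suffice for many diffusion-dominated problems, mirroring the slow singular value decay in Figure 1b.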

## 3 Online adaptive reduced models with adaptive sampling

We propose AADEIM (adaptive bases and adaptive sampling) to exploit local structure to construct online adaptive reduced models of full models that exhibit transport-dominated behavior. We focus on two types of local structure: local low-rankness, which we exploit via online adaptive bases, and local coherence, which enables updating the bases computationally efficiently with only a few samples from the full model. Section 3.1 describes local low-rankness and local coherence in more detail. Section 3.2 discusses the adaptation of the DEIM basis to exploit local low-rank structure and Section 3.3 shows that only a few samples are necessary to derive basis updates if the adapted DEIM spaces are locally coherent. Algorithm 1 in Section 3.4 summarizes our approach and provides practical considerations.

In most of this section, we drop the dependence on the parameter $\mu$ of the function $f$, the states $q_k$, and the trajectory $Q$, as well as of their reduced counterparts. Parametrization of our approach AADEIM is discussed in Section 3.4.

### 3.1 Local structure in transport-dominated problems

Consider the toy example introduced in Section 2.2. Let $w \in \mathbb{N}$ be a window size and consider the local trajectory $Q_k = [q_{k-w+1}, \dots, q_k]$ at time step $k$, which consists of the states from time step $k - w + 1$ to time step $k$. Figure 2a compares the singular values of a local trajectory $Q_k$ to the singular values of the global trajectory $Q$. The results indicate that the singular values of the local trajectory decay orders of magnitude faster than the singular values of the global trajectory $Q$. We call this behavior, that the singular values of local trajectories decay fast while the singular values of the global trajectory decay slowly, local low-rank structure. By adapting the DEIM spaces, we will exploit this local low-rank structure in Section 3.2. Similar observations about local low-rank structure are exploited in [27, 48, 33, 31, 32, 21, 37], cf. Section 1. In Appendix A, we analyze analytically an example to demonstrate its local low-rank structure.
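
A quick numerical check of this local low-rank structure, in the same simplified transported-Gaussian setting as before (exact periodic shifts; all discretization choices are ours, not the paper's):

```python
import numpy as np

N, K, w, sigma = 512, 200, 10, 0.01
x = np.linspace(0.0, 1.0, N, endpoint=False)

def pulse(center):
    d = (x - center + 0.5) % 1.0 - 0.5   # periodic distance to the pulse center
    return np.exp(-d**2 / (2 * sigma**2))

Q = np.stack([pulse(k / K) for k in range(K)], axis=1)   # global trajectory
Qw = Q[:, :w]                                            # local trajectory, window w
s_global = np.linalg.svd(Q, compute_uv=False)
s_local = np.linalg.svd(Qw, compute_uv=False)
```

The trailing singular values of the local window fall orders of magnitude below its leading one, while the global spectrum stays nearly flat over the same number of modes.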

Let us now consider the local DEIM space $\mathcal{U}_k$ of dimension $n$ with basis matrix $U_k$ and the corresponding local interpolation points matrix $P_k$ obtained from the local trajectory $Q_k$. The residual of approximating the state $q_k$ at time step $k$ with DEIM in $\mathcal{U}_k$ is

$$r_k = q_k - U_k (P_k^T U_k)^{-1} P_k^T q_k.$$

Figure 2b shows the decay of the sorted squared components of $r_k$. The decay is fast, which means that the residual is high at a few components only, i.e., the residual at a few components dominates whereas the residual is low at most of the components. Intuitively, such a fast decay of the residual means that the basis matrix $U_k$ of the local DEIM space needs to be corrected at a few components only. We will show that such a fast decay of the residual can be the result of a local coherence structure of the local DEIM spaces, which we will make precise and exploit with adaptive sampling in Section 3.3.

### 3.2 Exploiting local low-rank structure: Basis updates

To exploit local low-rank structure as described in Section 3.1, we adapt the DEIM spaces and the DEIM interpolation points during the time steps in the online phase. The adaptation is initialized with the DEIM basis matrix $U_1$ and interpolation points matrix $P_1$ at time step $k = 1$. Then, at each time step $k$, the DEIM basis matrix $U_k$ is adapted via an additive low-rank update to $U_{k+1}$. The update is based on ADEIM [39]. The interpolation points matrix $P_{k+1}$ is derived with QDEIM from the adapted basis matrix $U_{k+1}$. In the following description, the DEIM interpolant is adapted at each time step for ease of exposition, even though all of the following directly applies to situations where the adaptation is performed at selected time steps only, e.g., every other time step or via a criterion that decides adaptively when to update the DEIM interpolant.

Let $k$ be the current time step and $U_k$ and $P_k$ the DEIM basis matrix and the DEIM interpolation points matrix, respectively. Consider a matrix $F_k \in \mathbb{R}^{N \times w}$ and the coefficient matrix $C_k = (P_k^T U_k)^{-1} P_k^T F_k$. The residual of the DEIM approximation of the columns of $F_k$ is $R_k = U_k C_k - F_k$. Let now $S_k = [e_{s_1}, \dots, e_{s_m}] \in \mathbb{R}^{N \times m}$ be the sampling points matrix corresponding to the sampling points $s_1, \dots, s_m \in \{1, \dots, N\}$ with $m \geq n$. ADEIM [39] adapts the DEIM basis matrix $U_k$ to $U_{k+1}$ with a low-rank update

$$U_{k+1} = U_k + \alpha_k \beta_k^T,$$

with $\alpha_k \in \mathbb{R}^{N \times r}$ and $\beta_k \in \mathbb{R}^{n \times r}$ with rank $r$. The ADEIM update $\alpha_k \beta_k^T$ is the rank-$r$ matrix that minimizes the residual at the sampling points in the Frobenius norm, which means that the ADEIM update minimizes

$$\left\| S_k^T \left( (U_k + \alpha_k \beta_k^T) C_k - F_k \right) \right\|_F^2,$$

see [39] for details on how to compute $\alpha_k$ and $\beta_k$ and the computational costs of computing the update.
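
A simplified sketch of such an update, restricted to the sampled rows, is given below. This is only the full-rank special case; the rank-$r$ ADEIM update and its efficient computation are given in [39], and the function name is ours:

```python
import numpy as np

def adeim_full_rank_update(U, C, F, samp):
    # Full-rank sketch of an ADEIM-style basis update: only the rows of U at
    # the sampling points `samp` are modified, chosen to minimize
    # || S^T ((U + dU) C - F) ||_F over updates dU supported on those rows.
    R_s = F[samp, :] - U[samp, :] @ C      # residual at the sampled rows
    dU_s = R_s @ np.linalg.pinv(C)         # row-wise least-squares correction
    U_new = U.copy()
    U_new[samp, :] += dU_s
    return U_new
```

Because only sampled rows change, the residual at all non-sampled components is untouched, which is exactly the property used in the analysis below.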

Critical for the adaptation is the matrix $F_k$, because ADEIM adapts the space such that the residual of approximating the columns of $F_k$ is minimized at the sampling points. A potentially good choice for the columns of $F_k$ would be the states of the full model at time steps $k - w + 1, \dots, k$; however, the states of the full model are unavailable because their availability would mean the full model has been solved. Instead, we set the columns of $F_k$ as follows. Let $\breve S_k \in \mathbb{R}^{N \times (N - m)}$ be the complementary sampling points matrix derived from the $N - m$ points that have not been selected as sampling points. Let further $\hat q_k \in \mathbb{R}^N$ be the vector with

$$S_k^T \hat q_k = S_k^T f(U_k \tilde q_{k-1})\,, \qquad \breve S_k^T \hat q_k = \breve S_k^T U_k (S_k^T U_k)^{+} S_k^T f(U_k \tilde q_{k-1}), \tag{4}$$

which means that the components of $\hat q_k$ corresponding to the sampling points in $S_k$ are equal to the components of $f(U_k \tilde q_{k-1})$ and all other components are approximated via DEIM given by $U_k$ and $S_k$. Note that $m \geq n$ and therefore the Moore-Penrose inverse $(S_k^T U_k)^{+}$ is used in (4), instead of the inverse that is used when $U_k$ and $P_k$ are given. Note further that $\tilde q_{k-1}$ is the reduced state at time step $k-1$ and that $f$ is the function that defines the full model (1). The matrix that we use in the following for adaptation is

$$F_k = [\hat q_{k-w+1}, \dots, \hat q_k].$$

The vectors $\hat q_{k-w+1}, \dots, \hat q_k$ serve as surrogates of the full-model states $q_{k-w+1}, \dots, q_k$ to which the DEIM space is adapted, because the full-model states satisfy $q_k = f(q_{k-1})$ for $k = 1, \dots, K$, see equation (1). Computing a vector $\hat q_k$ requires evaluating the full-model function $f$ at the $m$ sampling points $s_1, \dots, s_m$ only.
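
The construction (4) can be sketched as a gappy (least-squares) reconstruction from the sampled values; `surrogate_state` and its arguments are our own naming:

```python
import numpy as np

def surrogate_state(U, samp, f_samp):
    # Sketch of (4): the surrogate equals the sampled full-model values
    # f_samp at the sampling points `samp`, and a least-squares
    # reconstruction (S^T U)^+ S^T f in the span of U everywhere else.
    coeff = np.linalg.lstsq(U[samp, :], f_samp, rcond=None)[0]
    q_hat = U @ coeff          # reconstruction at all components
    q_hat[samp] = f_samp       # exact samples at the sampling points
    return q_hat
```

If the sampled values happen to lie in the span of the current basis, the surrogate reproduces them exactly at every component.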

#### 3.2.2 Analysis of the error of the adapted space

We now provide an analysis of the ADEIM adaptation. Let $\bar{\mathcal U}$ and $\mathcal U$ be two $n$-dimensional subspaces of $\mathbb{R}^N$. We measure the distance between $\bar{\mathcal U}$ and $\mathcal U$ as

$$d(\bar{\mathcal U}, \mathcal U) = \| \bar U - U U^T \bar U \|_F^2, \tag{5}$$

where $\bar U$ and $U$ are orthonormal basis matrices of $\bar{\mathcal U}$ and $\mathcal U$, respectively. The distance $d$ is symmetric and invariant under orthogonal basis transformations, which is true because $\|\bar U - U U^T \bar U\|_F^2 = n - \|U^T \bar U\|_F^2$ holds. The $n$-dimensional subspace of $\mathbb{R}^N$ to which we want to adapt at time step $k$ is denoted as $\bar{\mathcal U}_{k+1}$ and the adapted space is $\mathcal U_{k+1}$. The following lemma quantifies the reduction of the residual after an ADEIM update and establishes Proposition 7 that bounds the error of the adapted space $\mathcal U_{k+1}$ with respect to the space $\bar{\mathcal U}_{k+1}$.

Let $C_k = (P_k^T U_k)^{-1} P_k^T F_k$ be the coefficient matrix and $R_k = U_k C_k - F_k$ the residual matrix. Let further $S_k$ be the sampling points matrix and $U_{k+1}$ the adapted basis matrix of the adapted space $\mathcal U_{k+1}$. Let $\bar r$ be the rank of $S_k^T R_k$ and let $r$ with $r \leq \bar r$ be the rank of the update $\alpha_k \beta_k^T$. Then,

$$\left\| S_k^T (U_{k+1} C_k - F_k) \right\|_F^2 = \| S_k^T R_k \|_F^2 - \sum_{i=1}^{r} \sigma_i^2,$$

where $\sigma_1 \geq \dots \geq \sigma_{\bar r} > 0$ are the singular values of $S_k^T R_k$.

###### Proof.

Follows from [39, Lemma 3.5] by transforming the generalized symmetric positive definite eigenproblem into a symmetric eigenproblem, which is equivalent to computing the singular value decomposition of $S_k^T R_k$, so that the squared singular values of $S_k^T R_k$ are equal to the eigenvalues of the generalized eigenproblem. This shows the lemma by using the last identity of [39, proof of Lemma 3.5]. ∎

Consider the same setup as in Lemma 3.2.2 and additionally assume that $F_k$ has rank $n$ and its columns are in the $n$-dimensional space $\bar{\mathcal U}_{k+1}$. Let further $\mathcal U_{k+1}$ be the adapted space derived with the rank-$r$ ADEIM update and sampling points matrix $S_k$. Then, it holds

$$d(\mathcal U_{k+1}, \bar{\mathcal U}_{k+1}) \leq \frac{\rho_k^2}{\sigma_{\min}^2(F_k)} \tag{6}$$

with

$$\rho_k^2 = \| \breve S_k^T R_k \|_F^2 + \sum_{i=r+1}^{\bar r} \sigma_i^2, \tag{7}$$

where $\breve S_k$ denotes the complementary sampling points matrix of $S_k$ and $\sigma_{\min}(F_k)$ is the minimal singular value of $F_k$.

###### Proof.

First, note that the matrix $\breve S_k^T \alpha_k \beta_k^T$ has zero entries only, because the ADEIM update updates only the rows of $U_k$ that correspond to the sampling points of $S_k$, see [39, Lemma 3.5]. Thus, it holds

$$\breve S_k^T U_{k+1} = \breve S_k^T U_k. \tag{8}$$

Consider now the norm of the residual with respect to the adapted space $\mathcal U_{k+1}$,

$$\begin{aligned} \|U_{k+1} C_k - F_k\|_F^2 &= \|S_k^T (U_{k+1} C_k - F_k)\|_F^2 + \|\breve S_k^T (U_{k+1} C_k - F_k)\|_F^2 && (9)\\ &= \|S_k^T (U_k C_k - F_k)\|_F^2 - \sum_{i=1}^{r} \sigma_i^2 + \|\breve S_k^T (U_{k+1} C_k - F_k)\|_F^2 && (10)\\ &= \|S_k^T (U_k C_k - F_k)\|_F^2 - \sum_{i=1}^{r} \sigma_i^2 + \|\breve S_k^T (U_k C_k - F_k)\|_F^2 && (11)\\ &= \sum_{i=r+1}^{\bar r} \sigma_i^2 + \|\breve S_k^T R_k\|_F^2. && (12) \end{aligned}$$

Equation (9) follows because $\breve S_k$ is the complementary sampling points matrix of $S_k$. Equation (10) follows from Lemma 3.2.2. Equation (11) holds because of (8). Equation (12) uses $R_k = U_k C_k - F_k$ and that $\|S_k^T R_k\|_F^2 = \sum_{i=1}^{\bar r} \sigma_i^2$. Consider now the projection of the columns of $F_k$ onto the adapted space $\mathcal U_{k+1}$ and observe that

$$\|F_k - U_{k+1} U_{k+1}^T F_k\|_F^2 \leq \|U_{k+1} C_k - F_k\|_F^2. \tag{13}$$

Note that the basis matrix $U_{k+1}$ obtained with the ADEIM update is not necessarily orthonormal and therefore it might be necessary to consider the oblique projection onto the column span of $U_{k+1}$ in (13); however, the projection error in the Frobenius norm is the same for both projections. With (12) and the definition of $\rho_k^2$ in (7) follows

$$\|F_k - U_{k+1} U_{k+1}^T F_k\|_F^2 \leq \rho_k^2. \tag{14}$$

Let now $\bar U_{k+1}$ be an orthonormal basis matrix of $\bar{\mathcal U}_{k+1}$. Since $\bar{\mathcal U}_{k+1}$ is spanned by the columns of $F_k$, there exists a full-rank matrix $\tilde F_k \in \mathbb{R}^{n \times n}$ such that $F_k = \bar U_{k+1} \tilde F_k$. We obtain

$$\begin{aligned} \|F_k - U_{k+1} U_{k+1}^T F_k\|_F^2 &= \|(\bar U_{k+1} - U_{k+1} U_{k+1}^T \bar U_{k+1}) \tilde F_k\|_F^2 && (15)\\ &\geq \|\bar U_{k+1} - U_{k+1} U_{k+1}^T \bar U_{k+1}\|_F^2\, \sigma_{\min}^2(\tilde F_k)\\ &= \|\bar U_{k+1} - U_{k+1} U_{k+1}^T \bar U_{k+1}\|_F^2\, \sigma_{\min}^2(F_k), \end{aligned}$$

where we used $\sigma_{\min}(\tilde F_k) = \sigma_{\min}(F_k)$, which holds because $\bar U_{k+1}$ is orthonormal. The matrix $F_k$ has rank $n$ and therefore the smallest singular value $\sigma_{\min}(F_k)$ is positive. Combining (15) with (14) and the definition (5) of the distance $d$ shows (6). ∎
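
The two properties of the distance (5) used in this section, its symmetry and the identity $d(\bar{\mathcal U}, \mathcal U) = n - \|U^T \bar U\|_F^2$, can be checked numerically on random subspaces:

```python
import numpy as np

# Numerical check of the subspace distance (5) for two random n-dimensional
# subspaces of R^N with orthonormal basis matrices U and Ubar.
rng = np.random.default_rng(0)
N, n = 20, 3
U, _ = np.linalg.qr(rng.standard_normal((N, n)))
Ubar, _ = np.linalg.qr(rng.standard_normal((N, n)))

d1 = np.linalg.norm(Ubar - U @ (U.T @ Ubar), "fro") ** 2   # d(Ubar, U)
d2 = np.linalg.norm(U - Ubar @ (Ubar.T @ U), "fro") ** 2   # d(U, Ubar)
```

The identity makes the symmetry apparent, since $\|U^T \bar U\|_F = \|\bar U^T U\|_F$.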

### 3.3 Exploiting local coherence: Adaptive sampling

Proposition 7 shows that the choice of the sampling points influences the bound of the error of the adapted space $\mathcal U_{k+1}$. In this section, we derive an adaptive sampling strategy that minimizes the upper bound derived in Proposition 7 in the case of full-rank ADEIM updates. Then, we show that if our adaptive sampling strategy is used, the error bound decays at least as fast with the number of sampling points as the norm of the DEIM residual, which in turn means that only a few sampling points are necessary to adapt the basis if the residual decays fast.

#### 3.3.1 Adaptive sampling strategy based on residual

Following Proposition 7 and the decay factor $\rho_k^2$ defined in (7), we select the sampling points so that $\rho_k^2$ is minimized. Note that $\sigma_{\min}^2(F_k)$ in (6) is independent of the sampling points and therefore it is sufficient to derive sampling points that lead to a small decay factor $\rho_k^2$.

Consider the component-wise residual

$$r_i = \|R_k^T e_i\|_2^2\,, \qquad i = 1, \dots, N, \tag{16}$$

and let $j_1, \dots, j_N$ be an ordering of $1, \dots, N$ such that $r_{j_1} \geq r_{j_2} \geq \dots \geq r_{j_N}$. Then, select the first $m$ components $j_1, \dots, j_m$ as sampling points and form the corresponding sampling points matrix

$$S_k = [e_{j_1}, \dots, e_{j_m}]. \tag{17}$$

If a full-rank ADEIM update is applied, i.e., $r = \bar r$ in Proposition 7, then this choice of sampling points is optimal in the sense that the bound (6) is minimized.
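
The sampling rule (16)-(17) amounts to ranking components by their residual norms and keeping the largest ones; a minimal sketch (function and argument names are ours):

```python
import numpy as np

def adaptive_sampling_points(U, P_idx, F, m):
    # Residual-based sampling (16)-(17): rank components by the squared norm
    # of the corresponding row of the DEIM residual and keep the m largest.
    # `P_idx` holds the indices of the DEIM interpolation points.
    C = np.linalg.solve(U[P_idx, :], F[P_idx, :])   # DEIM coefficients C_k
    R = U @ C - F                                   # residual matrix R_k
    r = np.sum(R**2, axis=1)                        # component-wise residual r_i
    return np.argsort(r)[::-1][:m]                  # indices of the m largest
```

If the residual is concentrated at a few components, as observed for transport-dominated problems, those components are exactly the ones selected.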

We now show that a fast decay of the residual implies a fast decay of the bound of the error of the adapted space $\mathcal U_{k+1}$.

Consider the same setup as in Proposition 7. Let $j_1, \dots, j_N$ be an ordering of $1, \dots, N$ such that the component-wise residual defined in (16) decays as

$$r_{j_i} \leq c_1 e^{-c_2 i^a}\,, \qquad i = 1, \dots, N, \tag{18}$$

with rate $a \geq 1$ and constants $c_1, c_2 > 0$. The error of the adapted space is bounded as

$$d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq c_3 e^{-c_2 m^a}, \tag{19}$$

if a full-rank ADEIM update is applied and the $m$ sampling points are selected via the adaptive sampling (17). The constant $c_3 > 0$ is independent of $m$. In particular, setting the number of sampling points to

$$m \geq \min\left\{ N, \left( -\frac{1}{c_2} \log\left( \frac{\epsilon}{c_3} \right) \right)^{1/a} \right\} \tag{20}$$

guarantees $d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq \epsilon$ for a threshold $\epsilon > 0$.

###### Proof.

In the case of a full-rank update, the factor $\rho_k^2$ in (6) is

$$\rho_k^2 = \|\breve S_k^T R_k\|_F^2,$$

as shown in Proposition 7. Then, we obtain with (18) and the adaptive sampling (17)

$$\begin{aligned} \|\breve S_k^T R_k\|_F^2 &= \sum_{i=m+1}^{N} \|R_k^T e_{j_i}\|_2^2 \leq c_1 \sum_{i=m+1}^{N} e^{-c_2 i^a} \leq c_1 \sum_{i=m+1}^{\infty} e^{-c_2 i^a}\\ &\leq c_1 \sum_{i=0}^{\infty} e^{-c_2 (i + m + 1)^a} \leq c_1 e^{-c_2 (m+1)^a} \sum_{i=0}^{\infty} e^{-c_2 i^a}, \end{aligned}$$

where the last inequality holds because $a \geq 1$ and thus $(i + m + 1)^a \geq i^a + (m+1)^a$. Using that $a \geq 1$, we obtain $e^{-c_2 i^a} \leq e^{-c_2 i}$ and $e^{-c_2 (m+1)^a} \leq e^{-c_2 m^a}$ and thus

$$\|\breve S_k^T R_k\|_F^2 \leq c_1 e^{-c_2 (m+1)^a} \sum_{i=0}^{\infty} e^{-c_2 i^a} \leq c_1 e^{-c_2 m^a} \sum_{i=0}^{\infty} e^{-c_2 i} = c_1 e^{-c_2 m^a} \frac{1}{1 - e^{-c_2}}. \tag{21}$$

Set $c_3 = c_1 / ((1 - e^{-c_2})\, \sigma_{\min}^2(F_k))$ to obtain (19) with Proposition 7. Setting $m$ as in (20) shows $d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq \epsilon$. ∎
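
Evaluating the sufficient condition (20) is a one-line computation once the constants are known; below is a small helper with hypothetical naming:

```python
import math

def num_sampling_points(N, a, c2, c3, eps):
    # Evaluates the sufficient condition (20) on the number of sampling
    # points m that guarantees an adaptation error below eps under the
    # residual decay assumption (18) with rate a and constants c2, c3.
    if eps >= c3:
        return 0                     # the bound is met without any condition on m
    m = (-(1.0 / c2) * math.log(eps / c3)) ** (1.0 / a)
    return min(N, math.ceil(m))
```

Because the threshold enters only through a logarithm, tightening $\epsilon$ by orders of magnitude increases the required number of sampling points only mildly.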

#### 3.3.2 Local coherence

Following, e.g., [9], define the coherence of an $n$-dimensional subspace $\mathcal U$ of $\mathbb{R}^N$ as

$$\gamma(\mathcal U) = \frac{N}{n} \max_{i=1,\dots,N} \|U^T e_i\|_2^2,$$

where $U$ is an orthonormal basis matrix of $\mathcal U$ and $e_i$ is the $i$-th canonical unit vector. Define further the local coherence as

$$\gamma_i(\mathcal U) = \frac{N}{n} \|U^T e_i\|_2^2,$$

for $i = 1, \dots, N$, see, e.g., [28, 60]. We refer to [10, 9, 28, 60] for details.
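
The local coherence is directly computable from an orthonormal basis matrix; a minimal helper (our naming):

```python
import numpy as np

def local_coherence(U):
    # Local coherence gamma_i(U) = (N/n) * ||U^T e_i||_2^2 for every
    # component i, where the columns of U are an orthonormal basis.
    N, n = U.shape
    return (N / n) * np.sum(U**2, axis=1)
```

Since the squared row norms of an orthonormal $U$ sum to $n$, the local coherences average to 1 over the components, so the (global) coherence, their maximum, is always at least 1.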

We now show that the rate of the decay of the local coherence of the current DEIM space $\mathcal U_k$ and of the space $\bar{\mathcal U}_{k+1}$ to which we want to adapt is reflected in the decay of the bound of the error of the adapted space $\mathcal U_{k+1}$ with respect to the number of sampling points $m$. Thus, we relate the decay of the error of the adapted space to a property of the spaces $\mathcal U_k$ and $\bar{\mathcal U}_{k+1}$, namely the decay of their local coherence.

Let $j_1, \dots, j_N$ be an ordering of $1, \dots, N$ such that

$$\gamma_{j_i}(\mathcal U_k) \leq c_4 \exp(-c_5 i^a)\,, \qquad \gamma_{j_i}(\bar{\mathcal U}_{k+1}) \leq \bar c_4 \exp(-\bar c_5 i^{\bar a}) \tag{22}$$

holds for $i = 1, \dots, N$, with constants $c_4, c_5, \bar c_4, \bar c_5 > 0$ and rates $a, \bar a \geq 1$. Let the columns of $F_k$ be in $\bar{\mathcal U}_{k+1}$. Then, for the component-wise residual it holds

$$\|R_k^T e_{j_i}\|_2^2 \leq c_7 \|F_k\|_2^2\, \exp\left( -\min\{c_5, \bar c_5\}\, i^{\min\{a, \bar a\}} \right)\,, \qquad i = 1, \dots, N, \tag{23}$$

with a constant $c_7 > 0$ that is independent of $i$.

###### Proof.

Denote with $R_k^{(i)}$, $U_k^{(i)}$, and $F_k^{(i)}$ the $i$-th row of $R_k$, $U_k$, and $F_k$, respectively. The matrix $\bar U_{k+1}$ is an orthonormal basis matrix of $\bar{\mathcal U}_{k+1}$. Let further $\tilde F_k$ be a matrix such that $F_k = \bar U_{k+1} \tilde F_k$. Note that $\|\tilde F_k\|_2 = \|F_k\|_2$, because $\bar U_{k+1}$ is orthonormal. Then, with $R_k = U_k C_k - F_k$, follows

$$\begin{aligned} \|R_k^T e_{j_i}\|_2^2 &= \|F_k^{(j_i)} - U_k^{(j_i)} C_k\|_2^2\\ &= \|F_k^{(j_i)}\|_2^2 - 2 F_k^{(j_i)} (U_k^{(j_i)} C_k)^T + \|U_k^{(j_i)} C_k\|_2^2\\ &\leq \|F_k^{(j_i)}\|_2^2 + 2 \|F_k^{(j_i)}\|_2 \|U_k^{(j_i)}\|_2 \|C_k\|_2 + \|U_k^{(j_i)}\|_2^2 \|C_k\|_2^2. \end{aligned}$$

Now make the following estimate

$$\|F_k^{(j_i)}\|_2^2 \leq \|\bar U_{k+1}^{(j_i)}\|_2^2 \|\tilde F_k\|_2^2 = \|\bar U_{k+1}^{(j_i)}\|_2^2 \|F_k\|_2^2,$$

because $F_k^{(j_i)} = \bar U_{k+1}^{(j_i)} \tilde F_k$. Further, we have

$$\|C_k\|_2 \leq \|(P_k^T U_k)^{-1}\|_2 \|F_k\|_2.$$

Set now $c_6 = \frac{n}{N} c_4 \|(P_k^T U_k)^{-1}\|_2^2$ and $\bar c_6 = \frac{n}{N} \bar c_4$ to bound, with the local coherence bounds (22),

$$\|R_k^T e_{j_i}\|_2^2 \leq \|F_k\|_2^2 \left( \bar c_6 e^{-\bar c_5 i^{\bar a}} + 2 \sqrt{\bar c_6 c_6}\, e^{-\frac{\bar c_5}{2} i^{\bar a} - \frac{c_5}{2} i^{a}} + c_6 e^{-c_5 i^{a}} \right).$$

Now set $c_7 = \bar c_6 + 2 \sqrt{\bar c_6 c_6} + c_6$ to obtain

$$\|R_k^T e_{j_i}\|_2^2 \leq c_7 \|F_k\|_2^2\, e^{-\min\{c_5, \bar c_5\}\, i^{\min\{a, \bar a\}}},$$

which shows the lemma. ∎

Since we consider subspaces of finite-dimensional spaces only, i.e., $N$ is finite, the bounds in (22) hold for any subspace by increasing the constants $c_4, \bar c_4$ and choosing the rates $a, \bar a$ close to 1. However, Lemma 23 still is meaningful because it shows that the constants and rates that appear in (22) are obtained in the bound of the residual in (23) as well. Thus, the local coherence structure is directly reflected in the decay of the DEIM residual.

The following proposition combines the local coherence structure exploited in Lemma 23 and the adaptive sampling of Proposition 20 to derive the number of sampling points $m$ that is required to achieve $d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq \epsilon$ in probability.

Assume that Lemma 23 applies and consider the same setting as in Proposition 7 except that $F_k = \bar U_{k+1} \Theta$ with $\Theta$ being an $n \times w$ matrix with independent and identically distributed standard Gaussian entries. Then, $d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq \epsilon$ holds with probability at least $1 - \delta$ if a full-rank update is applied and

$$m \geq \min\left\{ N, \left( -\frac{1}{\min\{c_5, \bar c_5\}} \log\left( \frac{\epsilon \delta^2}{c_8} \right) \right)^{1 / \min\{a, \bar a\}} \right\}, \tag{24}$$

where $c_8 > 0$ is a constant independent of $m$, $\epsilon$, and $\delta$.

###### Proof.

Since Lemma 23 applies, we have with (21) that

$$\|\breve S_k^T R_k\|_F^2 \leq \frac{c_7 \|F_k\|_2^2}{1 - e^{-\min\{c_5, \bar c_5\}}}\, e^{-\min\{c_5, \bar c_5\}\, m^{\min\{a, \bar a\}}}