1 Introduction
Model reduction constructs reduced models of large-scale systems of equations in a one-time, high-cost offline phase and then uses the reduced models in an online phase to repeatedly compute accurate approximations of the full-model solutions at significantly reduced costs. In projection-based model reduction [46, 2, 6], a low-dimensional space is constructed that approximates well the high-dimensional solution space of the full model. Then, the full-model equations are projected onto the low-dimensional space and reduced solutions are computed by solving the projected equations. However, solutions of full models that describe transport-dominated behavior, e.g., convection-dominated flows, wave-type phenomena, and shock propagation, typically exhibit high-dimensional features, which means that no low-dimensional space exists in which the full-model solutions can be approximated well; the Kolmogorov n-width decays slowly. Thus, traditional model reduction methods typically fail for transport-dominated problems [34, 14, 51]. In this work, we exploit the fact that transport-dominated problems typically have a rich structure that is local in nature, which we leverage to derive efficient reduced models.
Model reduction projects the full-model equations onto a low-dimensional (reduced) space that is spanned by a set of basis vectors. There are many methods for constructing reduced spaces, including proper orthogonal decomposition (POD) [7, 49], balanced truncation [29, 30], the reduced basis method [41, 53, 47, 24, 46], and interpolatory model reduction methods [23, 2]. In the following, we will be mostly concerned with full-model equations that are nonlinear in the states, where projecting the full-model equations onto a reduced space is typically insufficient to obtain a reduced model that is faster to solve than the full model, because the nonlinear terms entail computations that scale with the dimension of the full-model solution space. One remedy is empirical interpolation, which approximates nonlinear terms by evaluating them at a few carefully selected interpolation points and approximating all other components via interpolation in a low-dimensional space [5, 22, 16]. We will build on the discrete empirical interpolation method (DEIM) [13, 15], which is the discrete counterpart of empirical interpolation. Note that there are other techniques for nonlinear model reduction, e.g., missing point estimation
[3] and the Gauss–Newton with approximated tensors (GNAT) method
[12]. All these methods build on the assumption that there is a low-dimensional space in which the full-model solutions can be approximated well, which is violated in the case of transport-dominated problems. However, the solutions of transport-dominated problems typically are low-dimensional if considered locally in time, which we exploit by approximating the full-model solutions in local low-dimensional spaces that are adapted via low-rank basis updates over time [39, 62]. The basis updates are derived by querying the full model at selected points in the spatial domain. We derive an adaptive sampling scheme that selects where to query the full model to compute the basis updates. Our analysis shows that if the reduced-model residual is local in the spatial domain, as we observe for transport-dominated problems, then only a few sampling points are necessary to adapt the local spaces with our adaptive sampling scheme, which makes our model reduction approach computationally efficient.

For reviewing the literature on model reduction for transport-dominated problems, we broadly distinguish between two lines of research. Literature that we categorize into the first line of research aims to transform the full model to recover a low-rank structure that can then be exploited with a reduced space. The work on transformations in the model reduction community seems to have originated from [34], where the transport dynamics are separated from the rest of the dynamics via freezing. In [44, 43], transport maps are constructed that reverse the transport and so recover a low-rank structure. Approaches have been developed that aim to find transformations numerically via, e.g., optimization [8]. Shifted POD [42] recovers the shift due to the transport and applies POD after having reversed the shift. The shift operator introduced in [42] is time dependent and separates different transports so as to be applicable to, e.g., multiple waves traveling at different wave speeds.
In [56, 57], snapshots are transformed to better align them before the reduced bases are constructed. The second line of research on model reduction for transport-dominated problems constructs low-dimensional spaces that explicitly target transport-dominated problems. The work [14] formulates the greedy basis construction methods, developed in the reduced basis community [41, 53], in special norms that are better suited for transport-dominated problems. The work [1] constructs the basis via optimization, and the authors of [52, 59] consider time–space discretizations. There is related literature on closure modeling [55, 18, 35, 36], where methods have been developed that construct bases with respect to different norms than the classical optimal POD bases [25, 45]. Then, there are approaches that exploit local structure, such as the approach introduced in [51] that builds on the spatial locality of shock fronts. The work by Carlberg [11] builds on a special type of adaptation that enriches the reduced space during the online phase if an error indicator signals a high error. The dimension of the reduced space grows with each adaptation step, which means that the reduced model becomes computationally more expensive to solve over time. In [21], reduced bases are adapted over time via an auxiliary equation that describes the dynamics of the bases. Under certain conditions, the approach can be seen as an approximation of the Lax pairs discussed in [26]. Dynamic low-rank approximation [27, 48, 33, 31, 32] adapts bases in time similarly to our approach; however, the bases in dynamic low-rank approximation approaches are constructed so that they are valid for all parameters in a given parameter domain, which is problematic if the solution manifold contains high-dimensional features because the propagating coherent structure depends on parameters, as in, e.g., [51, Example 2.5]. In contrast, our approach is local in the parameter domain, in addition to being local in time and space. Furthermore, locality in the parameter domain allows our approach to derive basis updates from only a few samples of the full model.
This work is organized as follows. Section 2 sets up the problem and gives the problem formulation. Section 3 demonstrates the local structure in transport-dominated problems and proposes an approach to adaptively sample the full model to construct basis updates. Numerical results in Section 4 on benchmark examples and a model combustor of a single-element rocket engine demonstrate that our approach achieves significant speedups and is applicable to a wide range of problems. In particular, the numerical results indicate that our approach faithfully approximates interactions between propagating coherent structures traveling at different speeds. Concluding remarks are provided in Section 5.
2 Preliminaries
We briefly discuss model reduction with empirical interpolation in Section 2.1 and then demonstrate on a toy example in Section 2.2 why these traditional model reduction methods fail for problems exhibiting transport-dominated phenomena.
2.1 Model reduction with empirical interpolation
Consider the system of discretized equations that is obtained after discretizing a partial differential equation (PDE) in space and time,

(1) $\boldsymbol{x}_k(\boldsymbol{\mu}) = \boldsymbol{f}(\boldsymbol{x}_{k-1}(\boldsymbol{\mu}); \boldsymbol{\mu}), \qquad k = 1, \dots, K,$

where $\boldsymbol{x}_k(\boldsymbol{\mu}) \in \mathbb{R}^N$ is the $N$-dimensional state at time step $k$ and parameter $\boldsymbol{\mu} \in \mathcal{D}$, with parameter domain $\mathcal{D}$. The number of time steps is $K$. The function $\boldsymbol{f}: \mathbb{R}^N \times \mathcal{D} \to \mathbb{R}^N$ describes the operators of the discretized PDE and typically is nonlinear in the state $\boldsymbol{x}_{k-1}(\boldsymbol{\mu})$. The time discretization is implicit in time, which means that at each time step $k$, a potentially nonlinear, large-scale system of equations has to be solved, e.g., with Newton's method.
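To make the time-stepping concrete, the following minimal sketch advances one implicit Euler step with Newton's method. The right-hand side `g`, its Jacobian `dg`, and the scalar logistic example are illustrative assumptions and not the full model considered in this paper.

```python
import numpy as np

def newton_step(x_prev, g, dg, dt, tol=1e-12, max_iter=50):
    """One implicit Euler step: solve x = x_prev + dt * g(x) with Newton's method.

    g  : nonlinear right-hand side, maps R^N -> R^N
    dg : Jacobian of g, maps R^N -> R^{N x N}
    """
    x = x_prev.copy()  # initial guess: previous state
    for _ in range(max_iter):
        residual = x - x_prev - dt * g(x)
        if np.linalg.norm(residual) < tol:
            break
        jacobian = np.eye(x.size) - dt * dg(x)
        x = x - np.linalg.solve(jacobian, residual)
    return x

# illustrative scalar example: logistic growth x' = x (1 - x)
g = lambda x: x * (1.0 - x)
dg = lambda x: np.diag(1.0 - 2.0 * x)
x1 = newton_step(np.array([0.5]), g, dg, dt=0.1)
```

At convergence, `x1` satisfies the implicit Euler fixed-point equation up to the Newton tolerance.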
To derive a reduced model of the full model (1) with empirical interpolation [5, 13, 15], consider the trajectory at parameter $\boldsymbol{\mu} \in \mathcal{D}$,

$\boldsymbol{X}(\boldsymbol{\mu}) = [\boldsymbol{x}_1(\boldsymbol{\mu}), \dots, \boldsymbol{x}_K(\boldsymbol{\mu})] \in \mathbb{R}^{N \times K},$

which is the matrix with the states as columns. Let the columns of $\boldsymbol{U} \in \mathbb{R}^{N \times n}$ be the POD basis of dimension $n$ obtained from the snapshot matrix

$[\boldsymbol{X}(\boldsymbol{\mu}_1), \dots, \boldsymbol{X}(\boldsymbol{\mu}_M)]$

with parameters $\boldsymbol{\mu}_1, \dots, \boldsymbol{\mu}_M \in \mathcal{D}$. The space spanned by the columns of $\boldsymbol{U}$ is denoted as $\mathcal{U}$ and is a subspace of $\mathbb{R}^N$.
The critical assumption of traditional model reduction is that the singular values of the snapshot matrix decay fast, so that only a few basis vectors are necessary to approximate well the columns of the snapshot matrix in the corresponding space $\mathcal{U}$. Following QDEIM, introduced in [15], select the interpolation points $p_1, \dots, p_n \in \{1, \dots, N\}$ and define the corresponding interpolation points matrix $\boldsymbol{P} = [\boldsymbol{e}_{p_1}, \dots, \boldsymbol{e}_{p_n}] \in \mathbb{R}^{N \times n}$, where $\boldsymbol{e}_{p_i} \in \mathbb{R}^N$ is the $p_i$-th canonical unit vector with entry 1 at the $p_i$-th component and entry 0 at all other components. Note that all of the following discussion can be extended in a straightforward way to oversampled empirical interpolation (ODEIM) [38]. Define

$\boldsymbol{g}(\boldsymbol{x}; \boldsymbol{\mu}) = \boldsymbol{U}(\boldsymbol{P}^T\boldsymbol{U})^{-1}\boldsymbol{P}^T\boldsymbol{f}(\boldsymbol{x}; \boldsymbol{\mu}),$

so that $\boldsymbol{g}(\boldsymbol{x}; \boldsymbol{\mu})$ is the DEIM approximation of $\boldsymbol{f}$ at state $\boldsymbol{x}$ and parameter $\boldsymbol{\mu}$. Note that computing $\boldsymbol{g}(\boldsymbol{x}; \boldsymbol{\mu})$ typically requires evaluating $\boldsymbol{f}$ at the interpolation points only; see [5, 13, 15]. We overload the notation in the following so that if we have a reduced state $\tilde{\boldsymbol{x}} \in \mathbb{R}^n$, then $\boldsymbol{g}(\tilde{\boldsymbol{x}}; \boldsymbol{\mu}) = \boldsymbol{g}(\boldsymbol{U}\tilde{\boldsymbol{x}}; \boldsymbol{\mu})$. The reduced model corresponding to the full model (1) is
(2) $\tilde{\boldsymbol{x}}_k(\boldsymbol{\mu}) = (\boldsymbol{P}^T\boldsymbol{U})^{-1}\boldsymbol{P}^T\boldsymbol{f}(\boldsymbol{U}\tilde{\boldsymbol{x}}_{k-1}(\boldsymbol{\mu}); \boldsymbol{\mu}), \qquad k = 1, \dots, K,$
with the reduced trajectory $\tilde{\boldsymbol{X}}(\boldsymbol{\mu}) = [\tilde{\boldsymbol{x}}_1(\boldsymbol{\mu}), \dots, \tilde{\boldsymbol{x}}_K(\boldsymbol{\mu})] \in \mathbb{R}^{n \times K}$. Note that we approximate the state and the nonlinear function in the same space $\mathcal{U}$, which is in contrast to the original use of DEIM in [13] and similar to model reduction via missing point estimation [3]. Once a reduced model (2) is constructed in the offline phase, it is solved in the online phase. The one-time high costs of constructing the reduced model are compensated by approximating the full-model solutions with reduced-model solutions for a large number of parameters online; see, e.g., [46, 2, 6, 40] for details on the wide range of outer-loop and many-query applications where such an offline/online splitting is beneficial.
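The offline construction described above can be sketched in a few lines. The snapshot matrix built from weighted sine modes is an illustrative assumption; QDEIM is realized here via a column-pivoted QR decomposition of the transposed POD basis, following [15].

```python
import numpy as np
from scipy.linalg import qr

def pod_basis(snapshots, n):
    """POD basis: the n leading left singular vectors of the snapshot matrix."""
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n]

def qdeim_points(U):
    """QDEIM: interpolation points from a column-pivoted QR of U^T."""
    _, _, piv = qr(U.T, pivoting=True, mode='economic')
    return np.sort(piv[:U.shape[1]])

def deim_approx(U, pts, f_at_pts):
    """DEIM approximation U (P^T U)^{-1} P^T f; needs f at the points only."""
    return U @ np.linalg.solve(U[pts, :], f_at_pts)

# toy snapshot matrix of exact rank 5 (weighted sine modes; illustrative)
xi = np.linspace(0.0, 1.0, 200)
S = np.column_stack([0.5 ** j * np.sin((j + 1) * np.pi * xi) for j in range(5)])
U = pod_basis(S, 5)
pts = qdeim_points(U)

f = np.sin(3.0 * np.pi * xi)          # lies in the span of the snapshots
f_deim = deim_approx(U, pts, f[pts])  # exact up to round-off on the span
```

Because `f` lies in the span of the POD basis, the DEIM approximation reproduces it up to round-off while only reading `f` at the five interpolation points.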
Figure 1. (a) initial condition; (b) singular values.
2.2 Problem formulation
It has been observed that states of problems with transport-dominated behavior can require DEIM (reduced) spaces of high dimension; see, e.g., [34, 14, 51]. The following toy example illustrates this by demonstrating the slow decay of the singular values of the snapshot matrix of solutions of the advection equation. Let $\Omega \subset \mathbb{R}$ be the spatial domain and consider the advection equation
(3) $\partial_t x(\xi, t) + \partial_\xi x(\xi, t) = 0$
with periodic boundary conditions and time $t \in [0, T]$. Set the initial condition to

$x(\xi, 0) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{\xi^2}{2\sigma^2}\right), \qquad \sigma = 0.01,$

which is the probability density function of a normal random variable with standard deviation 0.01 and mean 0; see Figure 1a. Discretize (3) with a second-order upwind scheme on inner grid points in the spatial domain, with time step size $\delta t$ and end time $T$. The singular values of the trajectory are plotted in Figure 1b. According to the decay of the singular values, a DEIM space of high dimension is necessary to approximate the trajectory with a projection error below a given tolerance in the Euclidean norm, which is a rather slow decay compared to the fast decay observed in many other problems [6]. This example demonstrates that efficient reduced models of transport-dominated problems will have to exploit structure beyond the classical low-rank structure that traditional model reduction methods rely on.
Figure 2. (a) local low-rank structure; (b) decay of residual (local coherence).
3 Exploiting local structure via online adaptive basis updates
We propose AADEIM (adaptive bases and adaptive sampling) to exploit local structure to construct online adaptive reduced models of full models that exhibit transport-dominated behavior. We focus on two types of local structure: local low-rankness, which we exploit via online adaptive bases, and local coherence, which enables updating the bases computationally efficiently via a few samples from the full model. Section 3.1 describes local low-rankness and local coherence in more detail. Section 3.2 discusses the adaptation of the DEIM basis to exploit local low-rank structure, and Section 3.3 shows that only a few samples are necessary to derive basis updates if the adapted DEIM spaces are locally coherent. Algorithm 1 in Section 3.4 summarizes our approach and provides practical considerations.
In most of this section, we drop the dependence on the parameter $\boldsymbol{\mu}$ of the function $\boldsymbol{f}$, the state $\boldsymbol{x}_k$ at time step $k$, and the trajectory $\boldsymbol{X}$, as well as of their reduced counterparts. Parametrization of our approach AADEIM is discussed in Section 3.4.
3.1 Local structure in transportdominated problems
Consider the toy example introduced in Section 2.2. Let $w \in \mathbb{N}$ be a window size and consider the local trajectory $[\boldsymbol{x}_{k-w}, \dots, \boldsymbol{x}_{k-1}]$ at time step $k$, which consists of the states from time step $k - w$ to time step $k - 1$. Figure 2a compares the singular values of the local trajectory to the singular values of the global trajectory $\boldsymbol{X}$. The results indicate that the singular values of the local trajectory decay orders of magnitude faster than the singular values of the global trajectory. We call this behavior, where the singular values of local trajectories decay fast while the singular values of the global trajectory decay slowly, local low-rank structure. By adapting the DEIM spaces, we will exploit this local low-rank structure in Section 3.2. Similar observations about local low-rank structure are exploited in [27, 48, 33, 31, 32, 21, 37]; cf. Section 1. In Appendix A, we analytically analyze an example to demonstrate its local low-rank structure.
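A short experiment illustrates local low-rankness on the translating-Gaussian example from Section 2.2; the window position and all sizes are illustrative choices.

```python
import numpy as np

N, K, w, sigma = 512, 512, 10, 0.01
xi = np.linspace(0.0, 1.0, N, endpoint=False)
X = np.column_stack([
    np.exp(-0.5 * ((np.mod(xi - k / K, 1.0) - 0.5) / sigma) ** 2)
    for k in range(K)
])

s_global = np.linalg.svd(X, compute_uv=False)
s_local = np.linalg.svd(X[:, 100:100 + w], compute_uv=False)  # window of w steps

# relative size of the 6th singular value: much smaller locally than globally
decay_local = s_local[5] / s_local[0]
decay_global = s_global[5] / s_global[0]
```

Over a window of a few time steps the wave barely moves, so the windowed snapshots are nearly low rank even though the global trajectory is not.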
Let us now consider the local DEIM space $\mathcal{U}_k$ of dimension $n$ and the corresponding local interpolation points matrix $\boldsymbol{P}_k$ obtained from the local trajectory. The residual of approximating the state $\boldsymbol{x}_k$ at time step $k$ with DEIM in $\mathcal{U}_k$ is

$\boldsymbol{r}_k = \boldsymbol{x}_k - \boldsymbol{U}_k(\boldsymbol{P}_k^T\boldsymbol{U}_k)^{-1}\boldsymbol{P}_k^T\boldsymbol{x}_k.$

Figure 2b shows the decay of the sorted squared components of $\boldsymbol{r}_k$. The decay is fast, which means that the residual is high at a few components only, i.e., the residual at a few components dominates whereas the residual is low at most components. Intuitively, such a fast decay of the residual means that the basis matrix of the local DEIM space needs to be corrected at a few components only. We will show that such a fast decay of the residual can be the result of a local coherence structure of the local DEIM spaces, which we will make precise and exploit with adaptive sampling in Section 3.3.
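The locality of the residual can be checked numerically as follows. The local DEIM basis is built from a window of snapshots of the translating Gaussian, and all sizes are illustrative.

```python
import numpy as np
from scipy.linalg import qr

N, K, w, n, k = 512, 512, 10, 5, 110
xi = np.linspace(0.0, 1.0, N, endpoint=False)
snap = lambda j: np.exp(-0.5 * ((np.mod(xi - j / K, 1.0) - 0.5) / 0.01) ** 2)

# local DEIM basis and interpolation points from the w snapshots before step k
W = np.column_stack([snap(j) for j in range(k - w, k)])
U = np.linalg.svd(W, full_matrices=False)[0][:, :n]
pts = qr(U.T, pivoting=True, mode='economic')[2][:n]

x = snap(k)
r = x - U @ np.linalg.solve(U[pts, :], x[pts])  # DEIM residual at step k
r2 = np.sort(r ** 2)[::-1]
share = r2[:50].sum() / r2.sum()  # energy carried by the 50 largest components
```

A small number of components carries almost all of the residual energy, mirroring the fast decay shown in Figure 2b.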
3.2 Exploiting local lowrank structure: Basis updates
To exploit the local low-rank structure described in Section 3.1, we adapt the DEIM spaces and the DEIM interpolation points during the time steps of the online phase. The adaptation is initialized with the DEIM basis matrix $\boldsymbol{U}_1$ and interpolation points matrix $\boldsymbol{P}_1$ at the first time step. Then, at each time step $k$, the DEIM basis matrix $\boldsymbol{U}_k$ is adapted via an additive low-rank update to $\boldsymbol{U}_{k+1}$. The update is based on ADEIM [39]. The interpolation points matrix $\boldsymbol{P}_{k+1}$ is derived with QDEIM from the adapted basis matrix $\boldsymbol{U}_{k+1}$. In the following description, the DEIM interpolant is adapted at each time step for ease of exposition, even though all of the following directly applies to situations where the adaptation is performed at selected time steps only, e.g., every other time step or via a criterion that decides adaptively when to update the DEIM interpolant.
3.2.1 Adaptation with ADEIM
Let $k$ be the current time step and $\boldsymbol{U}_k$ and $\boldsymbol{P}_k$ the DEIM basis matrix and the DEIM interpolation points matrix, respectively. Consider the matrix $\boldsymbol{F}_k$ (defined below) and the coefficient matrix $\boldsymbol{C}_k$ of the DEIM approximations of the columns of $\boldsymbol{F}_k$. The residual of the DEIM approximation of the columns of $\boldsymbol{F}_k$ is $\boldsymbol{R}_k = \boldsymbol{F}_k - \boldsymbol{U}_k\boldsymbol{C}_k$. Let now $\boldsymbol{S}_k = [\boldsymbol{e}_{s_1}, \dots, \boldsymbol{e}_{s_m}]$ be the sampling points matrix corresponding to the sampling points $s_1, \dots, s_m \in \{1, \dots, N\}$ with $m \in \mathbb{N}$. ADEIM [39] adapts the DEIM basis matrix $\boldsymbol{U}_k$ to $\boldsymbol{U}_{k+1}$ with a low-rank update

$\boldsymbol{U}_{k+1} = \boldsymbol{U}_k + \delta\boldsymbol{U}_k,$

where $\delta\boldsymbol{U}_k$ has rank $r \in \mathbb{N}$. The ADEIM update $\delta\boldsymbol{U}_k$ is the rank-$r$ matrix that minimizes the residual at the sampling points in the Frobenius norm, which means that $\delta\boldsymbol{U}_k$ minimizes

$\|\boldsymbol{S}_k^T(\boldsymbol{U}_k + \delta\boldsymbol{U}_k)\boldsymbol{C}_k - \boldsymbol{S}_k^T\boldsymbol{F}_k\|_F;$

see [39] for details on how to compute $\delta\boldsymbol{U}_k$ and the computational costs of computing the update.
Critical for the adaptation is the matrix $\boldsymbol{F}_k$, because ADEIM adapts the space such that the residual of approximating the columns of $\boldsymbol{F}_k$ is minimized at the sampling points. A potentially good choice for the columns of $\boldsymbol{F}_k$ would be the states of the full model at the corresponding time steps; however, the states of the full model are unavailable because their availability would mean the full model has been solved. Instead, we set the columns of $\boldsymbol{F}_k$ as follows. Let $\breve{\boldsymbol{S}}_k$ be the complementary sampling points matrix derived from the $N - m$ points that have not been selected as sampling points. Let further $\boldsymbol{f}_k \in \mathbb{R}^N$ be the vector with

(4) $\boldsymbol{S}_k^T\boldsymbol{f}_k = \boldsymbol{S}_k^T\boldsymbol{f}(\boldsymbol{U}_k\tilde{\boldsymbol{x}}_{k-1}), \qquad \breve{\boldsymbol{S}}_k^T\boldsymbol{f}_k = \breve{\boldsymbol{S}}_k^T\boldsymbol{U}_k(\boldsymbol{S}_k^T\boldsymbol{U}_k)^+\boldsymbol{S}_k^T\boldsymbol{f}(\boldsymbol{U}_k\tilde{\boldsymbol{x}}_{k-1}),$

which means that the components of $\boldsymbol{f}_k$ corresponding to the sampling points in $\boldsymbol{S}_k$ are equal to the components of $\boldsymbol{f}(\boldsymbol{U}_k\tilde{\boldsymbol{x}}_{k-1})$ and all other components are approximated via DEIM given by $\boldsymbol{U}_k$ and $\boldsymbol{S}_k$. Note that $m > n$ and therefore the Moore–Penrose inverse $(\boldsymbol{S}_k^T\boldsymbol{U}_k)^+$ is used in (4), instead of the inverse when the $n$ interpolation points given by $\boldsymbol{P}_k$ are used. Note further that $\tilde{\boldsymbol{x}}_{k-1}$ is the reduced state at time step $k - 1$ and that $\boldsymbol{f}$ is the function that defines the full model (1). The matrix that we use in the following for adaptation is

$\boldsymbol{F}_k = [\boldsymbol{f}_{k-w+1}, \dots, \boldsymbol{f}_k].$

The vectors $\boldsymbol{f}_{k-w+1}, \dots, \boldsymbol{f}_k$ serve as surrogates of the full-model states to which the DEIM space is adapted, because the full-model states satisfy $\boldsymbol{x}_k = \boldsymbol{f}(\boldsymbol{x}_{k-1})$ for $k = 1, \dots, K$; see equation (1). Computing a vector $\boldsymbol{f}_k$ requires evaluating the full-model function $\boldsymbol{f}$ at the $m$ sampling points only.
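The following sketch implements a full-rank variant of the update described above; the rank-r case requires the generalized eigenproblem of [39] and is not reproduced here. The final re-orthonormalization is an implementation choice, not part of the original formulation.

```python
import numpy as np

def adeim_fullrank_update(U, C, F_s, s):
    """Full-rank ADEIM-style basis update (sketch).

    U   : current basis matrix, N x n
    C   : coefficient matrix of the DEIM approximations, n x w
    F_s : samples of the surrogate states at the sampling points, m x w
    s   : indices of the m sampling points
    """
    R = F_s - U[s, :] @ C          # residual at the sampling points
    delta = R @ np.linalg.pinv(C)  # least-squares minimizer of ||delta C - R||_F
    U_new = U.copy()
    U_new[s, :] += delta           # additive update on the sampled rows only
    Q, _ = np.linalg.qr(U_new)     # re-orthonormalize for numerical stability
    return Q

# toy check: with all rows sampled, the update recovers the target space
rng = np.random.default_rng(0)
U = np.linalg.qr(rng.standard_normal((6, 2)))[0]
V = np.linalg.qr(rng.standard_normal((6, 2)))[0]
C = rng.standard_normal((2, 3))
F = V @ C  # surrogate matrix whose columns lie in the target space
Q = adeim_fullrank_update(U, C, F, np.arange(6))
```

With all rows sampled and a full row-rank coefficient matrix, the updated basis spans the space containing the columns of `F` exactly.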
3.2.2 Analysis of the error of the adapted space
We now provide an analysis of the ADEIM adaptation. Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be two $n$-dimensional subspaces of $\mathbb{R}^N$. We measure the distance between $\mathcal{V}_1$ and $\mathcal{V}_2$ as

(5) $d(\mathcal{V}_1, \mathcal{V}_2) = \|\boldsymbol{V}_1\boldsymbol{V}_1^T - \boldsymbol{V}_2\boldsymbol{V}_2^T\|_F,$

where $\boldsymbol{V}_1$ and $\boldsymbol{V}_2$ are orthonormal basis matrices of $\mathcal{V}_1$ and $\mathcal{V}_2$, respectively. The distance is symmetric and invariant under orthogonal basis transformations, which is true because $d(\mathcal{V}_1, \mathcal{V}_2)^2 = 2n - 2\|\boldsymbol{V}_1^T\boldsymbol{V}_2\|_F^2$ holds. The $n$-dimensional subspace of $\mathbb{R}^N$ to which we want to adapt at time step $k$ is denoted as $\mathcal{F}_{k+1}$, and the adapted space is $\mathcal{U}_{k+1}$. The following lemma quantifies the reduction of the residual after an ADEIM update and establishes Proposition 7, which bounds the error of the adapted space with respect to the space $\mathcal{F}_{k+1}$.
Let be the coefficient matrix and the residual matrix. Let further be the sampling points matrix and the adapted basis matrix of the adapted space . Let be the rank of and let with be the rank of the update . Then,
where are the singular values of .
Proof.
This follows from [39, Lemma 3.5] by transforming the generalized symmetric positive definite eigenproblem into a symmetric eigenproblem, which is equivalent to computing a singular value decomposition, so that the squared singular values are equal to the eigenvalues of the generalized eigenproblem. This shows the lemma by using the last identity of
[39, proof of Lemma 3.5]. ∎

Consider the same setup as in Lemma 3.2.2 and additionally assume that $\boldsymbol{F}_k$ has rank $n$ and its columns are in the $n$-dimensional space $\mathcal{F}_{k+1}$. Let further $\mathcal{U}_{k+1}$ be the adapted space derived with the rank-$r$ ADEIM update and sampling points matrix $\boldsymbol{S}_k$. Then, it holds
(6) 
with
(7) 
where denotes the complementary sampling matrix of and is the minimal singular value of .
Proof.
First, note that the matrix $\breve{\boldsymbol{S}}_k^T\delta\boldsymbol{U}_k$ has zero entries only, because the ADEIM update changes only the rows of $\boldsymbol{U}_k$ corresponding to the sampling points; see [39, Lemma 3.5]. Thus, it holds
(8) 
Consider now the norm of the residual with respect to the adapted space
(9)  
(10)  
(11)  
(12) 
Equation (9) follows because is the complementary sampling points matrix of . Equation (10) follows from Lemma 3.2.2. Equation (11) holds because of (8). Equation (12) uses and that . Consider now the projection of the columns of onto and observe that
(13) 
Note that the basis matrix obtained with the ADEIM update is not necessarily orthonormal and therefore it might be necessary to consider the oblique projection onto the column span of in (13); however, the projection error in the Frobenius norm is the same for both projections. With (12) and the definition of follows
(14) 
Let now be an orthonormal basis matrix of . Since is spanned by the columns of , there exists a fullrank matrix such that . We obtain
(15)  
where we used , which holds because is orthonormal. The matrix has rank and therefore the smallest singular value is positive. Combining (15) with (14) and the definition of shows (6).
∎
3.3 Exploiting local coherence: Adaptive sampling
Proposition 7 shows that the choice of the sampling points influences the bound of the error of the adapted space. In this section, we derive an adaptive sampling strategy that minimizes the upper bound derived in Proposition 7 in the case of full-rank ADEIM updates. Then, we show that if our adaptive sampling strategy is used, the error decays at least as fast with the number of sampling points as the norm of the DEIM residual, which in turn means that only a few sampling points are necessary to adapt the basis if the residual decays fast.
3.3.1 Adaptive sampling strategy based on residual
Following Proposition 7 and the decay factor defined in (7), we select the sampling points so that the decay factor is minimized. Note that the remaining quantity in (6) is independent of the sampling points, and therefore it is sufficient to derive sampling points that lead to a small decay factor.
Consider the componentwise residual

(16) $r_i = \sum_{j=1}^{w} (\boldsymbol{R}_k)_{ij}^2, \qquad i = 1, \dots, N,$

and let $i_1, \dots, i_N$ be an ordering of $\{1, \dots, N\}$ such that $r_{i_1} \geq r_{i_2} \geq \dots \geq r_{i_N}$. Then, select the first $m$ components as sampling points and form the corresponding sampling points matrix

(17) $\boldsymbol{S}_k = [\boldsymbol{e}_{i_1}, \dots, \boldsymbol{e}_{i_m}].$
If a full-rank ADEIM update is applied, i.e., in Proposition 7, then this choice of sampling points is optimal in the sense that the bound is minimized.
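The adaptive sampling rule amounts to sorting the componentwise residual and keeping the largest components. Assuming the componentwise residual is the row-wise squared norm of the residual matrix, a sketch reads:

```python
import numpy as np

def adaptive_sampling_points(U, C, F, m):
    """Select the m components with the largest time-summed squared DEIM
    residual as sampling points (cf. (16)-(17); a sketch)."""
    R = F - U @ C                   # residual of the DEIM approximations
    r2 = np.sum(R ** 2, axis=1)     # componentwise (row-wise) squared residual
    return np.argsort(r2)[::-1][:m]

# toy example: the residual is concentrated at components 3 and 4
U = np.eye(5)[:, :2]
C = np.zeros((2, 2))
F = np.zeros((5, 2))
F[3, :] = 5.0
F[4, :] = 3.0
pts = adaptive_sampling_points(U, C, F, 2)
```

The sampling points concentrate exactly where the residual is large, which is what makes the strategy efficient when the residual is spatially local.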
We now show that a fast decay of the residual implies a fast decay of the error bound of the adapted space.
Consider the same setup as in Proposition 7. Let $i_1, \dots, i_N$ be an ordering of $\{1, \dots, N\}$ such that the componentwise residual defined in (16) decays as
(18) 
with rate and constants . The error of the adapted space is bounded as
(19) 
if a full-rank ADEIM update is applied and the sampling points are selected via the adaptive sampling (17). The constant is independent of the number of sampling points $m$. In particular, setting the number of sampling points to
(20) 
guarantees for a threshold .
Proof.
3.3.2 Local coherence
Following, e.g., [9], define the coherence of a space $\mathcal{V}$ with orthonormal basis matrix $\boldsymbol{V} \in \mathbb{R}^{N \times n}$ as

$\mu(\mathcal{V}) = \max_{i = 1, \dots, N} \|\boldsymbol{e}_i^T\boldsymbol{V}\|_2^2,$

where $\boldsymbol{e}_i$ is the $i$-th canonical unit vector. Define further the local coherence as

$\mu_i(\mathcal{V}) = \|\boldsymbol{e}_i^T\boldsymbol{V}\|_2^2$

for $i = 1, \dots, N$; see, e.g., [28, 60]. We refer to [10, 9, 28, 60] for details.
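Local coherence can be computed directly from the squared row norms of an orthonormal basis matrix; scaling factors (such as N/n) used by some authors are omitted in this sketch.

```python
import numpy as np

def local_coherence(V):
    """Squared row norms ||e_i^T V||_2^2 of an orthonormal basis matrix V;
    the (global) coherence is their maximum (unscaled convention)."""
    return np.sum(V ** 2, axis=1)

# a space aligned with a few canonical directions has very uneven local coherence
V = np.zeros((8, 2))
V[0, 0] = 1.0
V[3, 1] = 1.0
lc = local_coherence(V)
```

For an orthonormal basis matrix the local coherences sum to the subspace dimension, so a fast decay of the sorted local coherences means the space is concentrated on a few components.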
We now show that the rate of the decay of the local coherence of the current DEIM space and of the space to which we want to adapt is reflected in the decay of the bound of the error of the adapted space with respect to the number of sampling points . Thus, we now relate the decay of the error of the adapted space to a property of the spaces and , namely their decay of the local coherence.
Let be an ordering of such that
(22) 
holds for , with and . Let the columns of be in . Define , then for the residual holds
(23) 
with a constant that is independent of .
Proof.
Denote with , , and the th row of , , and , respectively. The matrix is an orthonormal basis matrix of . Let further be a matrix such that . Note that , because is orthonormal. Then, with , follows
Now make the following estimate
because . Further, we have
Set now and to bound
Now set to obtain
which shows the proposition. ∎
Since we consider subspaces of finite-dimensional spaces only, i.e., $N$ is finite, the bounds in (22) hold for any subspace by increasing the constants and choosing the rate close to 1. However, Lemma 23 is still meaningful because it shows that the constants and rates that appear in (22) are obtained in the bound of the residual in (23) as well. Thus, the local coherence structure is directly reflected in the decay of the DEIM residual.
The following proposition combines the local coherence structure exploited in Lemma 23 and the adaptive sampling of Proposition 20 to derive the number of sampling points that are required to achieve in probability.
Assume that Lemma 23 applies and consider the same setting as in Proposition 7, except that with being an matrix with independent and identically distributed standard Gaussian entries. Then, with probability at least if a full-rank update is applied and
(24) 
where is a constant independent of , and .