
# Model reduction for transport-dominated problems via online adaptive bases and adaptive sampling

This work presents a model reduction approach for problems with coherent structures that propagate over time such as convection-dominated flows and wave-type phenomena. Traditional model reduction methods have difficulties with these transport-dominated problems because propagating coherent structures typically introduce high-dimensional features that require high-dimensional approximation spaces. The approach proposed in this work exploits the locality in space and time of propagating coherent structures to derive efficient reduced models. First, full-model solutions are approximated locally in time via local reduced spaces that are adapted with basis updates during time stepping. The basis updates are derived from querying the full model at a few selected spatial coordinates. Second, the locality in space of the coherent structures is exploited via an adaptive sampling scheme that selects at which components to query the full model for computing the basis updates. Our analysis shows that, in probability, the more local the coherent structure is in space, the fewer full-model samples are required to adapt the reduced basis with the proposed adaptive sampling scheme. Numerical results on benchmark examples with interacting wave-type structures and time-varying transport speeds and on a model combustor of a single-element rocket engine demonstrate the wide applicability of our approach and the significant runtime speedups compared to full models and traditional reduced models.

07/26/2020


## 1 Introduction

Model reduction constructs reduced models of large-scale systems of equations in a one-time high-cost offline phase and then uses the reduced models in an online phase to repeatedly compute accurate approximations of the full-model solutions with significantly reduced costs. In projection-based model reduction [46, 2, 6], a low-dimensional space is constructed that approximates well the high-dimensional solution space of the full model. Then, the full-model equations are projected onto the low-dimensional space and reduced solutions are computed by solving the projected equations. However, solutions of full models that describe transport-dominated behavior, e.g., convection-dominated flows, wave-type phenomena, shock propagation, typically exhibit high-dimensional features, which means that no low-dimensional space exists in which the full-model solutions can be approximated well; the Kolmogorov n-width is high. Thus, traditional model reduction methods typically fail for transport-dominated problems [34, 14, 51]. In this work, we exploit that transport-dominated problems typically have a rich structure that is local in nature, which we leverage to derive efficient reduced models.

Model reduction projects the full-model equations onto a low-dimensional (reduced) space that is spanned by a set of basis vectors. There are many methods for constructing reduced spaces, including proper orthogonal decomposition (POD) [7, 49], balanced truncation [29, 30], the reduced basis method [41, 53, 47, 24, 46], and interpolatory model reduction methods [23, 2]. In the following, we are mostly concerned with full-model equations that are nonlinear in the states, where projecting the full-model equations onto a reduced space is typically insufficient to obtain a reduced model that is faster to solve than the full model, because the nonlinear terms entail computations that scale with the dimension of the full-model solution space. One remedy is empirical interpolation, which approximates nonlinear terms by evaluating them at a few carefully selected interpolation points and approximating all other components via interpolation in a low-dimensional space [5, 22, 16]. We build on the discrete empirical interpolation method (DEIM) [13, 15], which is the discrete counterpart of empirical interpolation. Note that there are other techniques for nonlinear model reduction, e.g., missing point estimation [3] and the Gauss-Newton with approximated tensors (GNAT) method [12]. All these methods build on the assumption that there is a low-dimensional space in which the full-model solutions can be approximated well, which is violated in the case of transport-dominated problems. However, the solutions of transport-dominated problems typically are low dimensional if considered locally in time, which we exploit by approximating the full-model solutions in local low-dimensional spaces that are adapted via low-rank basis updates over time [39, 62]. The basis updates are derived by querying the full model at selected points in the spatial domain. We derive an adaptive sampling scheme that selects where to query the full model to compute the basis updates. Our analysis shows that if the reduced-model residual is local in the spatial domain, which we observe for transport-dominated problems, then only a few sampling points are necessary to adapt the local spaces with our adaptive sampling scheme, which makes our model reduction approach computationally efficient.

For reviewing the literature on model reduction for transport-dominated problems, we broadly distinguish between two lines of research. Literature that we categorize into the first line of research aims to transform the full model to recover a low-rank structure that can then be exploited with a reduced space. The work on transformations in the model reduction community seems to have originated from [34], where the transport dynamics are separated from the rest of the dynamics via freezing. In [44, 43], transport maps are constructed that reverse the transport and so recover a low-rank structure. Approaches have been developed that aim to find transformations numerically via, e.g., optimization [8]. Shifted POD [42] recovers the shift due to the transport and applies POD after having reversed the shift. The shift operator introduced in [42] is time dependent and separates different transports to be applicable to, e.g., multiple waves traveling at different wave speeds. In [56, 57], snapshots are transformed to better align them before the reduced bases are constructed. The second line of research on model reduction for transport-dominated problems constructs low-dimensional spaces that explicitly target transport-dominated problems. The work [14] formulates the greedy basis construction methods, developed in the reduced basis community [41, 53], in special norms that are better suited for transport-dominated problems. The work [1] constructs the basis via optimization and the authors of [52, 59] consider time-space discretizations. There is related literature on closure modeling [55, 18, 35, 36], where methods have been developed that construct bases with respect to norms other than those of the classical $L^2$-optimal POD bases [25, 45]. Then, there are approaches that exploit local structure, such as the approach introduced in [51] that builds on the spatial locality of shock fronts.
The work by Carlberg [11] builds on a special type of adaptation that enriches the reduced space during the online phase if an error indicator signals a high error. The dimension of the reduced space grows with each adaptation step, which means that the reduced model becomes computationally more expensive to solve over time. In [21], reduced bases are adapted over time via an auxiliary equation that describes the dynamics of the bases. Under certain conditions, the approach can be seen as an approximation of the Lax pairs discussed in [26]. Dynamic low-rank approximation [27, 48, 33, 31, 32] adapts bases in time similar to our approach; however, the bases in dynamic low-rank approximation approaches are constructed so that they are valid for all parameters in a given parameter domain, which is problematic if the solution manifold contains high-dimensional features because the propagating coherent structure depends on parameters, as in, e.g., [51, Example 2.5]. In contrast, our approach is local in the parameter domain, in addition to being local in time and space. Furthermore, locality in the parameter domain allows our approach to derive basis updates from only a few samples of the full model.

This work is organized as follows. Section 2 reviews model reduction with empirical interpolation and gives the problem formulation. Section 3 demonstrates the local structure in transport-dominated problems and proposes an approach to adaptively sample the full model to construct basis updates. Numerical results in Section 4 on benchmark examples and a model combustor of a single-element rocket engine demonstrate that our approach achieves significant speedups and is applicable to a wide range of problems. In particular, the numerical results indicate that our approach faithfully approximates interactions between propagating coherent structures traveling at different speeds. Concluding remarks are provided in Section 5.

## 2 Preliminaries

We briefly discuss model reduction with empirical interpolation in Section 2.1 and then demonstrate on a toy example in Section 2.2 why traditional model reduction methods fail for problems exhibiting transport-dominated phenomena.

### 2.1 Model reduction with empirical interpolation

Consider the system of discretized equations that is obtained after discretizing a partial differential equation (PDE) in space and time,

$$q_k(\mu) = f(q_{k-1}(\mu); \mu)\,, \qquad k = 1, \dots, K, \tag{1}$$

where $q_k(\mu) \in \mathbb{R}^N$ is the $N$-dimensional state at time step $k$ and parameter $\mu \in \mathcal{D}$, with parameter domain $\mathcal{D}$. The number of time steps is $K \in \mathbb{N}$. The function $f$ describes the operators of the discretized PDE and typically is nonlinear in the state. The time discretization is implicit in time, which means that at each time step $k$ a potentially nonlinear, large-scale system of equations has to be solved, e.g., with Newton's method; the map $f$ in (1) represents the solution of one such implicit time step.

To derive a reduced model of the full model (1) with empirical interpolation [5, 13, 15], consider the trajectory at parameter $\mu$,

$$Q(\mu) = [q_1(\mu), \dots, q_K(\mu)] \in \mathbb{R}^{N \times K},$$

which is the matrix with the states as columns. Let the columns of $U \in \mathbb{R}^{N \times n}$ be the POD basis of dimension $n$ obtained from the snapshot matrix

$$Q = [Q(\mu_1), \dots, Q(\mu_M)] \in \mathbb{R}^{N \times MK},$$

with parameters $\mu_1, \dots, \mu_M \in \mathcal{D}$. The space spanned by the columns of $U$ is denoted as $\mathcal{U}$ and is a subspace of $\mathbb{R}^N$. The critical assumption of traditional model reduction is that the singular values of $Q$ decay fast so that only a few basis vectors are necessary to approximate well the columns of $Q$ in the corresponding space $\mathcal{U}$. Following QDEIM, introduced in [15], select the interpolation points $p_1, \dots, p_n \in \{1, \dots, N\}$ and define the corresponding interpolation points matrix $P = [e_{p_1}, \dots, e_{p_n}] \in \mathbb{R}^{N \times n}$, where $e_i$ is the $i$-th canonical unit vector with entry 1 at the $i$-th component and entry 0 at all other components. Note that all of the following discussion can be extended in a straightforward way to oversampled empirical interpolation (ODEIM) [38]. Define

$$\tilde f(q(\mu); \mu) = (P^T U)^{-1} P^T f(q(\mu); \mu),$$

so that $U \tilde f(q(\mu); \mu)$ is the DEIM approximation of $f(q(\mu); \mu)$ at state $q(\mu)$ and parameter $\mu$. Note that computing $\tilde f(q(\mu); \mu)$ typically requires evaluating $f$ at the interpolation points only, see [5, 13, 15]. We overload the notation in the following so that if we have a reduced state $\tilde q(\mu) \in \mathbb{R}^n$, then $\tilde f(\tilde q(\mu); \mu) = (P^T U)^{-1} P^T f(U \tilde q(\mu); \mu)$. The reduced model corresponding to (1) is

$$\tilde q_k(\mu) = \tilde f(\tilde q_{k-1}(\mu); \mu)\,, \qquad k = 1, \dots, K, \tag{2}$$

with the reduced trajectory $\tilde Q(\mu) = [\tilde q_1(\mu), \dots, \tilde q_K(\mu)] \in \mathbb{R}^{n \times K}$. Note that we approximate the state and the nonlinear function in the same space $\mathcal{U}$, which is in contrast to the original use of DEIM in [13] and similar to model reduction via missing point estimation [3]. Once a reduced model (2) is constructed in the offline phase, it is solved in the online phase. The one-time high costs of constructing the reduced model are compensated by approximating the full-model solutions with reduced-model solutions for a large number of parameters online, see, e.g., [46, 2, 6, 40] for details on the wide range of outer-loop and many-query applications where such an offline/online splitting is beneficial.
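
The POD basis construction and the interpolation-point-based approximation described above can be sketched compactly. The snippet below is illustrative only; for simplicity it uses the greedy point selection of the original DEIM [13] rather than QDEIM, and the function names are ours:

```python
import numpy as np

def pod_basis(Q, n):
    # POD basis: first n left singular vectors of the snapshot matrix
    U, _, _ = np.linalg.svd(Q, full_matrices=False)
    return U[:, :n]

def deim_points(U):
    # Greedy interpolation-point selection as in the original DEIM [13]:
    # each new point is where the current interpolation residual is largest.
    n = U.shape[1]
    pts = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, n):
        c = np.linalg.solve(U[pts, :j], U[pts, j])
        r = U[:, j] - U[:, :j] @ c          # residual of interpolating column j
        pts.append(int(np.argmax(np.abs(r))))
    return np.array(pts)

def deim_apply(U, pts, f_at_pts):
    # DEIM coefficients (P^T U)^{-1} P^T f from samples of f at the points;
    # the approximation of f in the full space is U @ deim_apply(...)
    return np.linalg.solve(U[pts, :], f_at_pts)
```

If $f$ lies exactly in the span of the basis, the DEIM approximation reproduces it exactly; in general, only the few sampled components of $f$ have to be evaluated.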

### 2.2 Problem formulation

It has been observed that states of problems with transport-dominated behavior can require DEIM (reduced) spaces of high dimension, see, e.g., [34, 14, 51]. The following toy example illustrates this by demonstrating the slow decay of the singular values of the snapshot matrix of solutions of the advection equation. Let $\Omega \subset \mathbb{R}$ be the spatial domain and consider the advection equation

$$\partial_t q(x, t; \mu) + \mu \partial_x q(x, t; \mu) = 0\,, \qquad x \in \Omega, \tag{3}$$

with periodic boundary conditions and time $t > 0$. Set the initial condition to

$$q(x, 0; \mu) = \frac{1}{\sqrt{0.0002\pi}} \exp\left(-\frac{x^2}{0.0002}\right),$$

which is the probability density function of a normal random variable with standard deviation 0.01 and mean 0, see Figure 1a. Discretize (3) with a second-order upwind scheme in the spatial domain. The singular values of the trajectory $Q(\mu)$ are plotted in Figure 1b. According to the decay of the singular values, a DEIM space of high dimension is necessary to approximate the trajectory with a low projection error in the Euclidean norm, which is a rather slow decay compared to the fast decay observed in many other problems [6]. This example demonstrates that efficient reduced models of transport-dominated problems will have to exploit structure beyond the classical low-rank structure that traditional model reduction methods rely on.
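
The slow decay can be reproduced with a small, self-contained experiment. The snippet below is illustrative only: it transports the Gaussian by exact periodic shifts rather than the upwind discretization of the toy example, and the grid size, snapshot count, and tolerance are our own choices:

```python
import numpy as np

# Snapshot matrix of a narrow Gaussian pulse transported across a periodic
# domain; exact shifts stand in for the advection dynamics of (3).
N, K, sigma = 512, 200, 0.01
x = np.linspace(0.0, 1.0, N, endpoint=False)

def pulse(center):
    d = (x - center + 0.5) % 1.0 - 0.5   # periodic distance to the pulse center
    return np.exp(-d**2 / (2 * sigma**2))

Q = np.stack([pulse(k / K) for k in range(K)], axis=1)
s = np.linalg.svd(Q, compute_uv=False)
# effective numerical rank at a relative tolerance of 1e-4
rank_eps = int(np.sum(s > 1e-4 * s[0]))
```

The effective rank is far larger than the handful of modes that suffice for many diffusion-dominated problems, mirroring the slow singular value decay in Figure 1b.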

## 3 Online adaptive reduced models with adaptive sampling

We propose AADEIM (adaptive bases and adaptive sampling) to exploit local structure to construct online adaptive reduced models of full models that exhibit transport-dominated behavior. We focus on two types of local structure: local low-rankness, which we exploit via online adaptive bases, and local coherence, which enables updating the bases computationally efficiently with only a few samples from the full model. Section 3.1 describes local low-rankness and local coherence in more detail. Section 3.2 discusses the adaptation of the DEIM basis to exploit local low-rank structure and Section 3.3 shows that only a few samples are necessary to derive basis updates if the adapted DEIM spaces are locally coherent. Algorithm 1 in Section 3.4 summarizes our approach and provides practical considerations.

In most of this section, we drop the dependence on the parameter $\mu$ of the function $f$, the states $q_k$, and the trajectory $Q$, as well as of their reduced counterparts. Parametrization of our approach AADEIM is discussed in Section 3.4.

### 3.1 Local structure in transport-dominated problems

Consider the toy example introduced in Section 2.2. Let $w \in \mathbb{N}$ be a window size and consider the local trajectory $Q_k = [q_{k-w+1}, \dots, q_k]$ at time step $k$, which consists of the states from time step $k - w + 1$ to time step $k$. Figure 2a compares the singular values of a local trajectory $Q_k$ to the singular values of the global trajectory $Q$. The results indicate that the singular values of the local trajectory decay orders of magnitude faster than the singular values of the global trajectory $Q$. We call this behavior, that the singular values of local trajectories decay fast while the singular values of the global trajectory decay slowly, local low-rank structure. By adapting the DEIM spaces, we will exploit this local low-rank structure in Section 3.2. Similar observations about local low-rank structure are exploited in [27, 48, 33, 31, 32, 21, 37], cf. Section 1. In Appendix A, we analyze analytically an example to demonstrate its local low-rank structure.
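
A quick numerical check of this local low-rank structure, in the same simplified transported-Gaussian setting as before (exact periodic shifts; all discretization choices are ours, not the paper's):

```python
import numpy as np

N, K, w, sigma = 512, 200, 10, 0.01
x = np.linspace(0.0, 1.0, N, endpoint=False)

def pulse(center):
    d = (x - center + 0.5) % 1.0 - 0.5   # periodic distance to the pulse center
    return np.exp(-d**2 / (2 * sigma**2))

Q = np.stack([pulse(k / K) for k in range(K)], axis=1)   # global trajectory
Qw = Q[:, :w]                                            # local trajectory, window w
s_global = np.linalg.svd(Q, compute_uv=False)
s_local = np.linalg.svd(Qw, compute_uv=False)
```

The trailing singular values of the local window fall orders of magnitude below its leading one, while the global spectrum stays nearly flat over the same number of modes.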

Let us now consider the local DEIM space $\mathcal{U}_k$ of dimension $n$ with basis matrix $U_k$ and the corresponding local interpolation points matrix $P_k$ obtained from the local trajectory $Q_k$. The residual of approximating the state $q_k$ at time step $k$ with DEIM in $\mathcal{U}_k$ is

$$r_k = q_k - U_k (P_k^T U_k)^{-1} P_k^T q_k.$$

Figure 2b shows the decay of the sorted squared components of $r_k$. The decay is fast, which means that the residual is high at a few components only, i.e., the residual at a few components dominates whereas the residual is low at most of the components. Intuitively, such a fast decay of the residual means that the basis matrix $U_k$ of the local DEIM space needs to be corrected at a few components only. We will show that such a fast decay of the residual can be the result of a local coherence structure of the local DEIM spaces, which we will make precise and exploit with adaptive sampling in Section 3.3.

### 3.2 Exploiting local low-rank structure: Basis updates

To exploit local low-rank structure as described in Section 3.1, we adapt the DEIM spaces and the DEIM interpolation points during the time steps in the online phase. The adaptation is initialized with the DEIM basis matrix $U_1$ and interpolation points matrix $P_1$ at time step $k = 1$. Then, at each time step $k$, the DEIM basis matrix $U_k$ is adapted via an additive low-rank update to $U_{k+1}$. The update is based on ADEIM [39]. The interpolation points matrix $P_{k+1}$ is derived with QDEIM from the adapted basis matrix $U_{k+1}$. In the following description, the DEIM interpolant is adapted at each time step for ease of exposition, even though all of the following directly applies to situations where the adaptation is performed at selected time steps only, e.g., every other time step or via a criterion that decides adaptively when to update the DEIM interpolant.

Let $k$ be the current time step and $U_k$ and $P_k$ the DEIM basis matrix and the DEIM interpolation points matrix, respectively. Consider a matrix $F_k \in \mathbb{R}^{N \times w}$ and the coefficient matrix $C_k = (P_k^T U_k)^{-1} P_k^T F_k$. The residual of the DEIM approximation of the columns of $F_k$ is $R_k = U_k C_k - F_k$. Let now $S_k = [e_{s_1}, \dots, e_{s_m}] \in \mathbb{R}^{N \times m}$ be the sampling points matrix corresponding to the sampling points $s_1, \dots, s_m \in \{1, \dots, N\}$ with $m \geq n$. ADEIM [39] adapts the DEIM basis matrix $U_k$ to $U_{k+1}$ with a low-rank update

$$U_{k+1} = U_k + \alpha_k \beta_k^T,$$

with $\alpha_k \in \mathbb{R}^{N \times r}$ and $\beta_k \in \mathbb{R}^{n \times r}$ with rank $r$. The ADEIM update $\alpha_k \beta_k^T$ is the rank-$r$ matrix that minimizes the residual at the sampling points in the Frobenius norm, which means that the ADEIM update minimizes

$$\left\| S_k^T \left( (U_k + \alpha_k \beta_k^T) C_k - F_k \right) \right\|_F^2,$$

see [39] for details on how to compute $\alpha_k$ and $\beta_k$ and the computational costs of computing the update.
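
A simplified sketch of such an update, restricted to the sampled rows, is given below. This is only the full-rank special case; the rank-$r$ ADEIM update and its efficient computation are given in [39], and the function name is ours:

```python
import numpy as np

def adeim_full_rank_update(U, C, F, samp):
    # Full-rank sketch of an ADEIM-style basis update: only the rows of U at
    # the sampling points `samp` are modified, chosen to minimize
    # || S^T ((U + dU) C - F) ||_F over updates dU supported on those rows.
    R_s = F[samp, :] - U[samp, :] @ C      # residual at the sampled rows
    dU_s = R_s @ np.linalg.pinv(C)         # row-wise least-squares correction
    U_new = U.copy()
    U_new[samp, :] += dU_s
    return U_new
```

Because only sampled rows change, the residual at all non-sampled components is untouched, which is exactly the property used in the analysis below.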

Critical for the adaptation is the matrix $F_k$, because ADEIM adapts the space such that the residual of approximating the columns of $F_k$ is minimized at the sampling points. A potentially good choice for the columns of $F_k$ would be the states of the full model at time steps $k - w + 1, \dots, k$; however, the states of the full model are unavailable because their availability would mean the full model has been solved. Instead, we set the columns of $F_k$ as follows. Let $\breve S_k \in \mathbb{R}^{N \times (N - m)}$ be the complementary sampling points matrix derived from the $N - m$ points that have not been selected as sampling points. Let further $\hat q_k \in \mathbb{R}^N$ be the vector with

$$S_k^T \hat q_k = S_k^T f(U_k \tilde q_{k-1})\,, \qquad \breve S_k^T \hat q_k = \breve S_k^T U_k (S_k^T U_k)^{+} S_k^T f(U_k \tilde q_{k-1}), \tag{4}$$

which means that the components of $\hat q_k$ corresponding to the sampling points in $S_k$ are equal to the components of $f(U_k \tilde q_{k-1})$ and all other components are approximated via DEIM given by $U_k$ and $S_k$. Note that $m \geq n$ and therefore the Moore-Penrose inverse $(S_k^T U_k)^{+}$ is used in (4), instead of the inverse that is used when $U_k$ and $P_k$ are given. Note further that $\tilde q_{k-1}$ is the reduced state at time step $k-1$ and that $f$ is the function that defines the full model (1). The matrix that we use in the following for adaptation is

$$F_k = [\hat q_{k-w+1}, \dots, \hat q_k].$$

The vectors $\hat q_{k-w+1}, \dots, \hat q_k$ serve as surrogates of the full-model states $q_{k-w+1}, \dots, q_k$ to which the DEIM space is adapted, because the full-model states satisfy $q_k = f(q_{k-1})$ for $k = 1, \dots, K$, see equation (1). Computing a vector $\hat q_k$ requires evaluating the full-model function $f$ at the $m$ sampling points $s_1, \dots, s_m$ only.
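
The construction (4) can be sketched as a gappy (least-squares) reconstruction from the sampled values; `surrogate_state` and its arguments are our own naming:

```python
import numpy as np

def surrogate_state(U, samp, f_samp):
    # Sketch of (4): the surrogate equals the sampled full-model values
    # f_samp at the sampling points `samp`, and a least-squares
    # reconstruction (S^T U)^+ S^T f in the span of U everywhere else.
    coeff = np.linalg.lstsq(U[samp, :], f_samp, rcond=None)[0]
    q_hat = U @ coeff          # reconstruction at all components
    q_hat[samp] = f_samp       # exact samples at the sampling points
    return q_hat
```

If the sampled values happen to lie in the span of the current basis, the surrogate reproduces them exactly at every component.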

#### 3.2.2 Analysis of the error of the adapted space

We now provide an analysis of the ADEIM adaptation. Let $\bar{\mathcal U}$ and $\mathcal U$ be two $n$-dimensional subspaces of $\mathbb{R}^N$. We measure the distance between $\bar{\mathcal U}$ and $\mathcal U$ as

$$d(\bar{\mathcal U}, \mathcal U) = \| \bar U - U U^T \bar U \|_F^2, \tag{5}$$

where $\bar U$ and $U$ are orthonormal basis matrices of $\bar{\mathcal U}$ and $\mathcal U$, respectively. The distance $d$ is symmetric and invariant under orthogonal basis transformations, which is true because $\|\bar U - U U^T \bar U\|_F^2 = n - \|U^T \bar U\|_F^2$ holds. The $n$-dimensional subspace of $\mathbb{R}^N$ to which we want to adapt at time step $k$ is denoted as $\bar{\mathcal U}_{k+1}$ and the adapted space is $\mathcal U_{k+1}$. The following lemma quantifies the reduction of the residual after an ADEIM update and establishes Proposition 7 that bounds the error of the adapted space $\mathcal U_{k+1}$ with respect to the space $\bar{\mathcal U}_{k+1}$.

Let $C_k = (P_k^T U_k)^{-1} P_k^T F_k$ be the coefficient matrix and $R_k = U_k C_k - F_k$ the residual matrix. Let further $S_k$ be the sampling points matrix and $U_{k+1}$ the adapted basis matrix of the adapted space $\mathcal U_{k+1}$. Let $\bar r$ be the rank of $S_k^T R_k$ and let $r$ with $r \leq \bar r$ be the rank of the update $\alpha_k \beta_k^T$. Then,

$$\left\| S_k^T (U_{k+1} C_k - F_k) \right\|_F^2 = \| S_k^T R_k \|_F^2 - \sum_{i=1}^{r} \sigma_i^2,$$

where $\sigma_1 \geq \dots \geq \sigma_{\bar r} > 0$ are the singular values of $S_k^T R_k$.

###### Proof.

Follows from [39, Lemma 3.5] by transforming the generalized symmetric positive definite eigenproblem into a symmetric eigenproblem, which is equivalent to computing the singular value decomposition of $S_k^T R_k$, so that the squared singular values of $S_k^T R_k$ are equal to the eigenvalues of the generalized eigenproblem. This shows the lemma by using the last identity of [39, proof of Lemma 3.5]. ∎

Consider the same setup as in Lemma 3.2.2 and additionally assume that $F_k$ has rank $n$ and its columns are in the $n$-dimensional space $\bar{\mathcal U}_{k+1}$. Let further $\mathcal U_{k+1}$ be the adapted space derived with the rank-$r$ ADEIM update and sampling points matrix $S_k$. Then, it holds

$$d(\mathcal U_{k+1}, \bar{\mathcal U}_{k+1}) \leq \frac{\rho_k^2}{\sigma_{\min}^2(F_k)} \tag{6}$$

with

$$\rho_k^2 = \| \breve S_k^T R_k \|_F^2 + \sum_{i=r+1}^{\bar r} \sigma_i^2, \tag{7}$$

where $\breve S_k$ denotes the complementary sampling points matrix of $S_k$ and $\sigma_{\min}(F_k)$ is the minimal singular value of $F_k$.

###### Proof.

First, note that the matrix $\breve S_k^T \alpha_k \beta_k^T$ has zero entries only, because the ADEIM update updates only the rows of $U_k$ that correspond to the sampling points of $S_k$, see [39, Lemma 3.5]. Thus, it holds

$$\breve S_k^T U_{k+1} = \breve S_k^T U_k. \tag{8}$$

Consider now the norm of the residual with respect to the adapted space $\mathcal U_{k+1}$,

$$\begin{aligned} \|U_{k+1} C_k - F_k\|_F^2 &= \|S_k^T (U_{k+1} C_k - F_k)\|_F^2 + \|\breve S_k^T (U_{k+1} C_k - F_k)\|_F^2 && (9)\\ &= \|S_k^T (U_k C_k - F_k)\|_F^2 - \sum_{i=1}^{r} \sigma_i^2 + \|\breve S_k^T (U_{k+1} C_k - F_k)\|_F^2 && (10)\\ &= \|S_k^T (U_k C_k - F_k)\|_F^2 - \sum_{i=1}^{r} \sigma_i^2 + \|\breve S_k^T (U_k C_k - F_k)\|_F^2 && (11)\\ &= \sum_{i=r+1}^{\bar r} \sigma_i^2 + \|\breve S_k^T R_k\|_F^2. && (12) \end{aligned}$$

Equation (9) follows because $\breve S_k$ is the complementary sampling points matrix of $S_k$. Equation (10) follows from Lemma 3.2.2. Equation (11) holds because of (8). Equation (12) uses $R_k = U_k C_k - F_k$ and that $\|S_k^T R_k\|_F^2 = \sum_{i=1}^{\bar r} \sigma_i^2$. Consider now the projection of the columns of $F_k$ onto the adapted space $\mathcal U_{k+1}$ and observe that

$$\|F_k - U_{k+1} U_{k+1}^T F_k\|_F^2 \leq \|U_{k+1} C_k - F_k\|_F^2. \tag{13}$$

Note that the basis matrix $U_{k+1}$ obtained with the ADEIM update is not necessarily orthonormal and therefore it might be necessary to consider the oblique projection onto the column span of $U_{k+1}$ in (13); however, the projection error in the Frobenius norm is the same for both projections. With (12) and the definition of $\rho_k^2$ in (7) follows

$$\|F_k - U_{k+1} U_{k+1}^T F_k\|_F^2 \leq \rho_k^2. \tag{14}$$

Let now $\bar U_{k+1}$ be an orthonormal basis matrix of $\bar{\mathcal U}_{k+1}$. Since $\bar{\mathcal U}_{k+1}$ is spanned by the columns of $F_k$, there exists a full-rank matrix $\tilde F_k \in \mathbb{R}^{n \times n}$ such that $F_k = \bar U_{k+1} \tilde F_k$. We obtain

$$\begin{aligned} \|F_k - U_{k+1} U_{k+1}^T F_k\|_F^2 &= \|(\bar U_{k+1} - U_{k+1} U_{k+1}^T \bar U_{k+1}) \tilde F_k\|_F^2 && (15)\\ &\geq \|\bar U_{k+1} - U_{k+1} U_{k+1}^T \bar U_{k+1}\|_F^2\, \sigma_{\min}^2(\tilde F_k)\\ &= \|\bar U_{k+1} - U_{k+1} U_{k+1}^T \bar U_{k+1}\|_F^2\, \sigma_{\min}^2(F_k), \end{aligned}$$

where we used $\sigma_{\min}(\tilde F_k) = \sigma_{\min}(F_k)$, which holds because $\bar U_{k+1}$ is orthonormal. The matrix $F_k$ has rank $n$ and therefore the smallest singular value $\sigma_{\min}(F_k)$ is positive. Combining (15) with (14) and the definition (5) of the distance $d$ shows (6). ∎
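
The two properties of the distance (5) used in this section, its symmetry and the identity $d(\bar{\mathcal U}, \mathcal U) = n - \|U^T \bar U\|_F^2$, can be checked numerically on random subspaces:

```python
import numpy as np

# Numerical check of the subspace distance (5) for two random n-dimensional
# subspaces of R^N with orthonormal basis matrices U and Ubar.
rng = np.random.default_rng(0)
N, n = 20, 3
U, _ = np.linalg.qr(rng.standard_normal((N, n)))
Ubar, _ = np.linalg.qr(rng.standard_normal((N, n)))

d1 = np.linalg.norm(Ubar - U @ (U.T @ Ubar), "fro") ** 2   # d(Ubar, U)
d2 = np.linalg.norm(U - Ubar @ (Ubar.T @ U), "fro") ** 2   # d(U, Ubar)
```

The identity makes the symmetry apparent, since $\|U^T \bar U\|_F = \|\bar U^T U\|_F$.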

### 3.3 Exploiting local coherence: Adaptive sampling

Proposition 7 shows that the choice of the sampling points influences the bound of the error of the adapted space $\mathcal U_{k+1}$. In this section, we derive an adaptive sampling strategy that minimizes the upper bound derived in Proposition 7 in the case of full-rank ADEIM updates. Then, we show that if our adaptive sampling strategy is used, the error bound decays at least as fast with the number of sampling points as the norm of the DEIM residual, which in turn means that only a few sampling points are necessary to adapt the basis if the residual decays fast.

#### 3.3.1 Adaptive sampling strategy based on residual

Following Proposition 7 and the decay factor $\rho_k^2$ defined in (7), we select the sampling points so that $\rho_k^2$ is minimized. Note that $\sigma_{\min}^2(F_k)$ in (6) is independent of the sampling points and therefore it is sufficient to derive sampling points that lead to a small decay factor $\rho_k^2$.

Consider the component-wise residual

$$r_i = \|R_k^T e_i\|_2^2\,, \qquad i = 1, \dots, N, \tag{16}$$

and let $j_1, \dots, j_N$ be an ordering of $1, \dots, N$ such that $r_{j_1} \geq r_{j_2} \geq \dots \geq r_{j_N}$. Then, select the first $m$ components $j_1, \dots, j_m$ as sampling points and form the corresponding sampling points matrix

$$S_k = [e_{j_1}, \dots, e_{j_m}]. \tag{17}$$

If a full-rank ADEIM update is applied, i.e., $r = \bar r$ in Proposition 7, then this choice of sampling points is optimal in the sense that the bound (6) is minimized.
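
The sampling rule (16)-(17) amounts to ranking components by their residual norms and keeping the largest ones; a minimal sketch (function and argument names are ours):

```python
import numpy as np

def adaptive_sampling_points(U, P_idx, F, m):
    # Residual-based sampling (16)-(17): rank components by the squared norm
    # of the corresponding row of the DEIM residual and keep the m largest.
    # `P_idx` holds the indices of the DEIM interpolation points.
    C = np.linalg.solve(U[P_idx, :], F[P_idx, :])   # DEIM coefficients C_k
    R = U @ C - F                                   # residual matrix R_k
    r = np.sum(R**2, axis=1)                        # component-wise residual r_i
    return np.argsort(r)[::-1][:m]                  # indices of the m largest
```

If the residual is concentrated at a few components, as observed for transport-dominated problems, those components are exactly the ones selected.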

We now show that a fast decay of the residual implies a fast decay of the bound of the error of the adapted space $\mathcal U_{k+1}$.

Consider the same setup as in Proposition 7. Let $j_1, \dots, j_N$ be an ordering of $1, \dots, N$ such that the component-wise residual defined in (16) decays as

$$r_{j_i} \leq c_1 e^{-c_2 i^a}\,, \qquad i = 1, \dots, N, \tag{18}$$

with rate $a \geq 1$ and constants $c_1, c_2 > 0$. The error of the adapted space is bounded as

$$d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq c_3 e^{-c_2 m^a}, \tag{19}$$

if a full-rank ADEIM update is applied and the $m$ sampling points are selected via the adaptive sampling (17). The constant $c_3 > 0$ is independent of $m$. In particular, setting the number of sampling points to

$$m \geq \min\left\{ N, \left( -\frac{1}{c_2} \log\left( \frac{\epsilon}{c_3} \right) \right)^{1/a} \right\} \tag{20}$$

guarantees $d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq \epsilon$ for a threshold $\epsilon > 0$.

###### Proof.

In the case of a full-rank update, the factor $\rho_k^2$ in (6) is

$$\rho_k^2 = \|\breve S_k^T R_k\|_F^2,$$

as shown in Proposition 7. Then, we obtain with (18) and the adaptive sampling (17)

$$\begin{aligned} \|\breve S_k^T R_k\|_F^2 &= \sum_{i=m+1}^{N} \|R_k^T e_{j_i}\|_2^2 \leq c_1 \sum_{i=m+1}^{N} e^{-c_2 i^a} \leq c_1 \sum_{i=m+1}^{\infty} e^{-c_2 i^a}\\ &\leq c_1 \sum_{i=0}^{\infty} e^{-c_2 (i + m + 1)^a} \leq c_1 e^{-c_2 (m+1)^a} \sum_{i=0}^{\infty} e^{-c_2 i^a}, \end{aligned}$$

where the last inequality holds because $a \geq 1$ and thus $(i + m + 1)^a \geq i^a + (m+1)^a$. Using that $a \geq 1$, we obtain $e^{-c_2 i^a} \leq e^{-c_2 i}$ and $e^{-c_2 (m+1)^a} \leq e^{-c_2 m^a}$ and thus

$$\|\breve S_k^T R_k\|_F^2 \leq c_1 e^{-c_2 (m+1)^a} \sum_{i=0}^{\infty} e^{-c_2 i^a} \leq c_1 e^{-c_2 m^a} \sum_{i=0}^{\infty} e^{-c_2 i} = c_1 e^{-c_2 m^a} \frac{1}{1 - e^{-c_2}}. \tag{21}$$

Set $c_3 = c_1 / ((1 - e^{-c_2})\, \sigma_{\min}^2(F_k))$ to obtain (19) with Proposition 7. Setting $m$ as in (20) shows $d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq \epsilon$. ∎
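
Evaluating the sufficient condition (20) is a one-line computation once the constants are known; below is a small helper with hypothetical naming:

```python
import math

def num_sampling_points(N, a, c2, c3, eps):
    # Evaluates the sufficient condition (20) on the number of sampling
    # points m that guarantees an adaptation error below eps under the
    # residual decay assumption (18) with rate a and constants c2, c3.
    if eps >= c3:
        return 0                     # the bound is met without any condition on m
    m = (-(1.0 / c2) * math.log(eps / c3)) ** (1.0 / a)
    return min(N, math.ceil(m))
```

Because the threshold enters only through a logarithm, tightening $\epsilon$ by orders of magnitude increases the required number of sampling points only mildly.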

#### 3.3.2 Local coherence

Following, e.g., [9], define the coherence of an $n$-dimensional subspace $\mathcal U$ of $\mathbb{R}^N$ as

$$\gamma(\mathcal U) = \frac{N}{n} \max_{i=1,\dots,N} \|U^T e_i\|_2^2,$$

where $U$ is an orthonormal basis matrix of $\mathcal U$ and $e_i$ is the $i$-th canonical unit vector. Define further the local coherence as

$$\gamma_i(\mathcal U) = \frac{N}{n} \|U^T e_i\|_2^2,$$

for $i = 1, \dots, N$, see, e.g., [28, 60]. We refer to [10, 9, 28, 60] for details.
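
The local coherence is directly computable from an orthonormal basis matrix; a minimal helper (our naming):

```python
import numpy as np

def local_coherence(U):
    # Local coherence gamma_i(U) = (N/n) * ||U^T e_i||_2^2 for every
    # component i, where the columns of U are an orthonormal basis.
    N, n = U.shape
    return (N / n) * np.sum(U**2, axis=1)
```

Since the squared row norms of an orthonormal $U$ sum to $n$, the local coherences average to 1 over the components, so the (global) coherence, their maximum, is always at least 1.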

We now show that the rate of the decay of the local coherence of the current DEIM space $\mathcal U_k$ and of the space $\bar{\mathcal U}_{k+1}$ to which we want to adapt is reflected in the decay of the bound of the error of the adapted space $\mathcal U_{k+1}$ with respect to the number of sampling points $m$. Thus, we relate the decay of the error of the adapted space to a property of the spaces $\mathcal U_k$ and $\bar{\mathcal U}_{k+1}$, namely the decay of their local coherence.

Let $j_1, \dots, j_N$ be an ordering of $1, \dots, N$ such that

$$\gamma_{j_i}(\mathcal U_k) \leq c_4 \exp(-c_5 i^a)\,, \qquad \gamma_{j_i}(\bar{\mathcal U}_{k+1}) \leq \bar c_4 \exp(-\bar c_5 i^{\bar a}) \tag{22}$$

holds for $i = 1, \dots, N$, with constants $c_4, c_5, \bar c_4, \bar c_5 > 0$ and rates $a, \bar a \geq 1$. Let the columns of $F_k$ be in $\bar{\mathcal U}_{k+1}$. Then, for the component-wise residual it holds

$$\|R_k^T e_{j_i}\|_2^2 \leq c_7 \|F_k\|_2^2\, \exp\left( -\min\{c_5, \bar c_5\}\, i^{\min\{a, \bar a\}} \right)\,, \qquad i = 1, \dots, N, \tag{23}$$

with a constant $c_7 > 0$ that is independent of $i$.

###### Proof.

Denote with $R_k^{(i)}$, $U_k^{(i)}$, and $F_k^{(i)}$ the $i$-th row of $R_k$, $U_k$, and $F_k$, respectively. The matrix $\bar U_{k+1}$ is an orthonormal basis matrix of $\bar{\mathcal U}_{k+1}$. Let further $\tilde F_k$ be a matrix such that $F_k = \bar U_{k+1} \tilde F_k$. Note that $\|\tilde F_k\|_2 = \|F_k\|_2$, because $\bar U_{k+1}$ is orthonormal. Then, with $R_k = U_k C_k - F_k$, follows

$$\begin{aligned} \|R_k^T e_{j_i}\|_2^2 &= \|F_k^{(j_i)} - U_k^{(j_i)} C_k\|_2^2\\ &= \|F_k^{(j_i)}\|_2^2 - 2 F_k^{(j_i)} (U_k^{(j_i)} C_k)^T + \|U_k^{(j_i)} C_k\|_2^2\\ &\leq \|F_k^{(j_i)}\|_2^2 + 2 \|F_k^{(j_i)}\|_2 \|U_k^{(j_i)}\|_2 \|C_k\|_2 + \|U_k^{(j_i)}\|_2^2 \|C_k\|_2^2. \end{aligned}$$

Now make the following estimate

$$\|F_k^{(j_i)}\|_2^2 \leq \|\bar U_{k+1}^{(j_i)}\|_2^2 \|\tilde F_k\|_2^2 = \|\bar U_{k+1}^{(j_i)}\|_2^2 \|F_k\|_2^2,$$

because $F_k^{(j_i)} = \bar U_{k+1}^{(j_i)} \tilde F_k$. Further, we have

$$\|C_k\|_2 \leq \|(P_k^T U_k)^{-1}\|_2 \|F_k\|_2.$$

Set now $c_6 = \frac{n}{N} c_4 \|(P_k^T U_k)^{-1}\|_2^2$ and $\bar c_6 = \frac{n}{N} \bar c_4$ to bound, with the local coherence bounds (22),

$$\|R_k^T e_{j_i}\|_2^2 \leq \|F_k\|_2^2 \left( \bar c_6 e^{-\bar c_5 i^{\bar a}} + 2 \sqrt{\bar c_6 c_6}\, e^{-\frac{\bar c_5}{2} i^{\bar a} - \frac{c_5}{2} i^{a}} + c_6 e^{-c_5 i^{a}} \right).$$

Now set $c_7 = \bar c_6 + 2 \sqrt{\bar c_6 c_6} + c_6$ to obtain

$$\|R_k^T e_{j_i}\|_2^2 \leq c_7 \|F_k\|_2^2\, e^{-\min\{c_5, \bar c_5\}\, i^{\min\{a, \bar a\}}},$$

which shows the lemma. ∎

Since we consider subspaces of finite-dimensional spaces only, i.e., $N$ is finite, the bounds in (22) hold for any subspace by increasing the constants $c_4, \bar c_4$ and choosing the rates $a, \bar a$ close to 1. However, Lemma 23 still is meaningful because it shows that the constants and rates that appear in (22) are obtained in the bound of the residual in (23) as well. Thus, the local coherence structure is directly reflected in the decay of the DEIM residual.

The following proposition combines the local coherence structure exploited in Lemma 23 and the adaptive sampling of Proposition 20 to derive the number of sampling points $m$ that is required to achieve $d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq \epsilon$ in probability.

Assume that Lemma 23 applies and consider the same setting as in Proposition 7 except that $F_k = \bar U_{k+1} \Theta$ with $\Theta$ being an $n \times w$ matrix with independent and identically distributed standard Gaussian entries. Then, $d(\bar{\mathcal U}_{k+1}, \mathcal U_{k+1}) \leq \epsilon$ holds with probability at least $1 - \delta$ if a full-rank update is applied and

$$m \geq \min\left\{ N, \left( -\frac{1}{\min\{c_5, \bar c_5\}} \log\left( \frac{\epsilon \delta^2}{c_8} \right) \right)^{1 / \min\{a, \bar a\}} \right\}, \tag{24}$$

where $c_8 > 0$ is a constant independent of $m$, $\epsilon$, and $\delta$.

###### Proof.

Since Lemma 23 applies, we have with (21) that

$$\|\breve S_k^T R_k\|_F^2 \leq \frac{c_7 \|F_k\|_2^2}{1 - e^{-\min\{c_5, \bar c_5\}}}\, e^{-\min\{c_5, \bar c_5\}\, m^{\min\{a, \bar a\}}}