## 1 Introduction

Physical simulation is the key to the developments of science, engineering and technology. Various physical processes are mathematically modeled by time-dependent nonlinear partial differential equations (PDEs). Since analytical solutions to such problems are not available in general, one has to resort to numerical methods to effectively approximate the solution. State-of-the-art numerical methods have been proven successful in obtaining accurate approximations of the groundtruth observations in various application problems. However, subject to the complexity and the scale of the problem domain, the computational cost of such numerical methods could be prohibitively high. Even with high-performance computing, a single forward simulation could take a very long time. Yet, multiple forward simulations are typically required in some real world decision-making applications such as design optimization

[wang2007large, de2020three, de2018adaptive, white2020dual], optimal control [choi2015practical, choi2012simultaneous], uncertainty quantification [smith2013uncertainty, biegler2011large], and inverse problems [galbally2010non, biegler2011large], which make such problems computationally intractable.Constructing a reduced order model (ROM) is a popular and powerful computational technique to obtain sufficiently accurate numerical solutions with considerable speed-up compared to the corresponding full order model (FOM). Various model reduction schemes have been proposed. Many of them seek to extract an intrinsic solution manifold using a condensed solution representation. Depending on how these representations are constructed, two major approaches exist– linear subspace reduced order models (LS-ROM) and nonlinear manifold reduced order models (NM-ROM). In either case, the governing equations are projected onto the solution manifold as part of the reduction strategy, and therefore these approaches are referred to as projection-based reduced order models (PROMs). This differs from other ROM techniques such as interpolation or data fitting.

In LS-ROM, the reduced basis vectors are obtained through, for example, proper orthogonal decomposition (POD). The number of degrees of freedom is then reduced by substituting the ROM solution representation into the (semi-)discretized governing equation. This approach takes advantage of both the known governing equations and the solution data generated from the corresponding FOM simulations to form LS-ROM. Example applications include, but are not limited to, nonlinear diffusion equations

[hoang2020domain, fritzen2018algorithmic], Burgers equation and the Euler equations in small-scale [choi2019space, choi2020sns, carlberg2018conservative] and large-scale, convection–diffusion equations [mojgani2017lagrangian, kim2021efficient], the Navier–Stokes equations [xiao2014non, burkardt2006pod], the compressible Euler equations in a moving Lagrangian frame [copeland2022reduced, cheung2022local], rocket nozzle shape design [amsallem2015design], flutter avoidance wing shape optimization [choi2020gradient], topology optimization of wind turbine blades [choi2019accelerating], lattice structure design [mcbane2021component], porous media flow/reservoir simulations [ghasemi2015localized, jiang2019implementation, yang2016fast, wang2020generalized], computational electro-cardiology [yang2017efficient], inverse problems [fu2018pod], shallow water equations [zhao2014pod, cstefuanescu2013pod], Boltzmann transport problems [choi2021space], computing electromyography [mordhorst2017pod], spatio-temporal dynamics of a predator–prey system [dimitriu2013application], acoustic wave-driven microfluidic biochips [antil2012reduced], and the Schrödinger equation [cheng2016reduced]. However, in advection-dominated problems, the intrinsic solution space cannot be approximated by subspaces with a small dimension, i.e., the solution space with slowly decaying Kolmogorov-width. As an alternative to LS-ROM, we can replace the linear subspace solution representation with a nonlinear manifold. This type of ROM is known as NM-ROM. A neural network-based reduced order model is developed in

[lee2020model] and extended to preserve the conserved quantities in the physical conservation laws [lee2019deep]. Recently, Kim, et al., [kim2022fast, kim2020efficient]have achieved a considerable speed-up with NM-ROMs via autoencoder.

The main advantage of a ROM (either LS-ROM or NM-ROM) is to reduce the computational cost by using a low-dimensional structure for representation of state variables. However, in nonlinear systems of PDEs, the performance of ROM is degraded due to the bottleneck issue of lifting to FOM size. That is, the nonlinear terms need to be evaluated in every time step as the state variables evolve in the time marching process. Since such evaluation scales with the FOM size, we cannot expect any speed-up without special treatment even if reduced representation is used to approximate the state variables. To overcome this issue, a hyper-reduction technique [ryckelynck2005priori] is used to efficiently evaluate the nonlinear source terms by approximation. We note that a majority of the aforementioned literature achieved a true speed-up by applying a hyper-reduction technique. The key idea is to approximate the nonlinear terms using a small number of basis vectors, while keeping the number of evaluations of nonlinear terms as small as possible. To this end, the hyper-reduction technique requires one to strategically select a set of indices that leads to accurate approximations to the nonlinear terms.

One of the most well-known selection algorithms is the Discrete Empirical Interpolation Method (DEIM) [chaturantabut2010nonlinear]

, which aims at minimizing the operator norm from the error estimate (Lemma 3.2 of

[chaturantabut2010nonlinear]). DEIM is implemented through a greedy algorithm that sequentially selects one index at a time with respect to a certain criterion. Carlberg, et al., [carlberg2013gnat] and [carlberg2011efficient] extend this idea to allow oversampling. Q-DEIM is introduced in [drmac2016new] as a new framework for constructing the DEIM-related operator via the QR factorization with column pivoting. The stability and oversampling of DEIM is also investigated in [peherstorfer2020stability].The goal of this paper is to introduce the S-OPT sampling method [shin16] as a hyper-reduction technique in ROMs, and compare its performance with those by the DEIM family algorithms. S-OPT was first developed by Shin and Xiu in [shin16] as a points selection algorithm for least-squares based stochastic-collocation methods in Uncertainty Quantification (UQ) [shin2016near, narayan2017christoffel, hadigol2018least, guo2020constructing]. In the context of UQ, the goal is to find the best subset of points that yields the most accurate least-squares solution, which is closely aligned with the goal of the hyper-reduction. S-OPT aims to find a set of indices (rows or points) that maximizes the quantity [OptimalDesign] (to be introduced in Section 3.1.2). The quantity measures both the mutual column orthogonality and the determinant. The S-OPT algorithm is fundamentally different from the DEIM algorithm. The core principle of DEIM lies at maximizing

the smallest singular value

(spectral norm) of the underlying projection matrix, while S-OPT seeks to maximize both the product of all the singular values (determinant) and the column orthogonality of the underlying projection matrix. We employ the S-OPT algorithm as an index selection operator for hyper-reduction.### 1.1 Paper organization

The FOM is described in Section 2.1 to introduce some background information and notation. We then describe the PROM formulation in Section 2.2, which leads to the description of the hyper-reduction procedure in Section 3. The sampling algorithms are part of the hyper-reduction procedure, so they are described in Section 3.1. Specifically, a DEIM algorithm with oversampling is outlined in Section 3.1.1, and the S-OPT algorithm is described in Section 3.1.2. The performance and comparison of the two algorithms are presented in Section 4 using four examples: 1D Burgers problem, 2D laminar viscous flow around airfoil, and two hydrodynamics examples, i.e., a 2D Gresho vortex problem and a 3D Sedov blast problem. The paper is concluded with summary and discussion in Section 5.

## 2 Problem Formulation

We start by defining the FOM and some notation used throughout the paper. We use two different PROMs in the example problems. First, we outline the LS-ROM formulation, followed by NM-ROM formulation for a general ordinary differential equation (ODE). This section precedes the main contribution of this paper, which is a sampling algorithm used with both of the PROM formulations.

For the rest of the paper, is understood as either the standard Euclidean norm or the spectral matrix norm.

### 2.1 Full Order Model

Consider a system of nonlinear ODEs resulting from the semidiscretization of a system of PDEs in the space domain

(1) |

where denotes time, denotes the state vector of dimension N, denotes the initial condition, denotes a vector of parameters defining the operating point of interest within the parameter domain , denotes a nonsingular matrix, and : is a nonlinear function and boundary conditions. The parameter and time dependence of and variables are dropped in the rest of the paper for notational simplicity, and are implied. Furthermore, the dot notation denotes the derivative with respect to time. The FOM system in Eq. (1) can be written in a residual form as follows:

(2) |

The time derivative term above can be approximated by various time integration schemes. Suppose the temporal domain is partitioned by , where is the number of subintervals,

denotes a discrete moment in time with

, , and for . Throughout the paper, we use a superscript to denote the time-discrete counterpart of a function evaluated at . For the numerical experiments, we use the implicit Backward Euler (BE), method and the second order explicit Runge–Kutta average (RK2-average) method to numerically solve Eq. (1), but other numerical time integration schemes are also applicable. For example, the BE method solves for at the -th time step in Eq. (3):(3) |

where and is the -th time. The residual function with the BE time integrator is then defined by

(4) |

Although we continue the discussion using the BE time integrator and its residual as an example of the demonstration, the residuals of other types of time integrators can replace in a similar fashion. For example, we refer to [copeland2022reduced] for the residual of the RK2-average method.

### 2.2 Projection-Based Reduced Order Model

A PROM formulation relies on the concept that full state solutions can be represented in lower-dimensional manifold. As such, a PROM projects the governing equations to a manifold, resulting in lower-dimensional equations. If the manifolds for the solution field and the equations are the same, then the projection is called Galerkin. On the other hand, if the manifolds for the solution field and the equations are different, then the projection is called Petrov–Galerkin. Both the Galerkin and Petrov–Galerkin projection methods are considered. We also consider a PROM with a linear subspace solution representation (LS-ROM), as well as a model with nonlinear manifold solution representation (NM-ROM).

#### 2.2.1 Linear Subspace Reduced Order Model

A LS-ROM reduces the spatial dimension by approximating the full solution using a subspace with , also called a trial subspace. The approximation of the full solution is

(5) |

where is a reference state, denotes a basis matrix whose -th column is , and is a vector of unknown generalized coordinates.

The reduced subspace, , is commonly found using POD [pod]

, which is related to principal component analysis (PCA) in statistical analysis and the Karhunen-Loève expansion in stochastic analysis

[hotelling1933analysis, loeve1955]. While we use POD to define the basis in this paper, the basis can be built using other options, such as Fourier modes. In POD, a set of basis functions are built by performing a singular value decomposition (SVD) over a solution snapshot matrix. These snapshots are based on solutions of the FOM, either steady-state solutions for multiple parameters, or time-varying solutions. Additionally, the reference state,

, can be found by taking the average of the collected snapshots.Substituting of Eq. (5) for the full solution in Eq. (1) results in a system of equations with fewer unknowns,

(6) |

where denotes the time derivative of the generalized coordinate . Since and are fixed, let . Note that is linear in . By projecting the system of equations onto a test subspace with basis matrix (which can be different than ) whose -th column is , that is,

we solve for , which gives the governing equation for

(7) |

Equation (7) corresponds to Galerkin projection when the test subspace is the same as the trial subspace, i.e. . When the test subspace differs, we have Petrov–Galerkin projection. However, Petrov–Galerkin projection is generally applied after discretizing in time, which leads us to describe the nonlinear Least-Squares Petrov–Galerkin (LSPG) projection procedure. A LSPG ROM substitutes for into Eq. (4) and minimizes the residual at each time instance. Using the BE time discretization as an example, we have

(8) |

where the LS-ROM backward Euler reduced residual is defined as

(9) |

The necessary first-order optimality condition for Eq. (8) is

Here is the Jacobian of . This shows that Eq. (8) corresponds to a Petrov–Galerkin projection of the FOM equations with a trial subspace and a test subspace , hence the name LSPG projection.

A note on notation: throughout this paper, the hat () is used for reduced variables that lie in a smaller subspace such as , and the tilde () is used for variables in that approximate their non-accented counterparts.

#### 2.2.2 Nonlinear Manifold Reduced Order Model

The LS-ROM relies on a linear subspace () for the solution manifold. In this section, we outline a ROM with nonlinear solution representation. As a generalization of the linear subspace representation in (5), NM-ROM seeks approximation of the full state solutions in a trial manifold by

(10) |

where denotes an approximation of the full solution, denotes a reference state, denotes a nonlinear function, and denotes a vector of unknown latent variables. The nonlinear function is found by neural network training as a decoder in an autoencoder architecture. In this work, we adopt the shallow masked autoencoder as in [kim2022fast], where the nonlinear function is a scaled decoder with one single hidden layer and a sparsity mask in the output layer. The trainable parameters, i.e. the weight and bias in the decoder and the encoder networks, are optimized against the mismatch between the original training data and the corresponding autoencoder output. The trained decoder and its Jacobian, , is used to formulate the system of equations. More precisely, the counterpart of (6) is

The NM-ROM Galerkin projection is then given by

Similarly, the NM-ROM LSPG projection is given by

where the NM-ROM backward Euler reduced residual is defined as

A note on time integrators: although the BE method is used for the discussion of LS-ROM and NM-ROM, other numerical time integrators can be used in a similar fashion.

## 3 Hyper-Reduction and Sampling

Due to the term still being nonlinear in the reduced subspace, the requirement to compute the Jacobian, and to use the Jacobian in a matrix-matrix multiplication for LSPG, the reduced, low-dimensional equations may still not be computationally more efficient than the FOM. However, the ROM framework allows for a further approximation of the equations using a hyper-reduction strategy.

Specifically, when Galerkin projection is utilized, the part of Eq. (7) is time-independent, and can be computed offline for each parameter . This leaves the nonlinear function to be calculated at every iteration. However, still scales with the full dimension. Furthermore, for Petrov–Galerkin projection, the projection matrix must be calculated at every iteration because the Jacobian relies on the current state. In the case of NM-ROM, both projection methods rely on the Jacobian. The full dimension dependency in the Jacobian and the nonlinear function limits the speed-up performance of the PROM.

There are two ways hyper-reduction can be applied. Firstly, the nonlinear term is approximated so that it scales with the reduced dimension. This is applicable when the projection matrix can be computed offline, so approximating only the nonlinear term results in sufficient computational savings. Secondly, the residual is approximated instead of independently working with the nonlinear term. This is especially favorable in cases where the nonlinear term may be difficult to access in the FOM solver, such as for residual minimization solvers. Either way, the aforementioned terms are approximated by using only a carefully constructed index subset and omitting gaps for ignored entries, so this procedure is called gappy tensor approximation, as first introduced in

[everson1995].Hyper-reduction on . The nonlinear term can be approximated in a least-squares sense from a gappy form using a reduced basis built for the nonlinear term and the set of sampled indices, . Given a gappy tensor , the approximation of is:

(11) |

where is a sampling matrix which contains columns of the identity matrix, and is the standard basis in . In other words, samples at only indices. Choosing which indices to select is the subject of Section 3.1. The (minimum norm) solution to Eq. (11) is where the superscript denotes the Moore-Penrose pseudoinverse. A hyper-reduction of the nonlinear term is then given as

(12) |

which is termed as “gappy POD” hyper-reduction. We remark that, while is an orthogonal projection onto the column space of , the sampling procedure introduces an oblique projection .

The procedure that combines Galerkin projection and gappy POD for the nonlinear term is known as the discrete empirical interpolation method (DEIM). In the original DEIM paper [chaturantabut2010nonlinear], was computed by collecting snapshots of the nonlinear term and computing the POD. However, it has also been shown that solution snapshots can be used through the subspace relation in [choi2020sns], i.e., .

One can consider a simple approximation to , which only requires . This is termed “collocation.” Even in this case, however, may still be used to build during the offline phase, yet is not used during the online phase of the ROM. This is justified by the assumption that is well represented by the subspace spanned by .

Hyper-reduction on the residual term. Given , , and , the nonlinear term (defined in Eq. (9)) is similarly approximated by a gappy tensor using a reduced basis built for the residual and a set of sampled indices, :

(13) |

where is the sampling matrix constructed from . The (minimum norm) solution to Eq. (13) is and the corresponding hyper-reduction of the residual term is

(14) |

which is the gappy-POD approach to hyper-reduction. Similarly as before, one can also use the collocation approximation . The procedure that combines LSPG projection and gappy POD hyper-reduction for the residual term is called the Gauss-Newton with approximated tensors (GNAT) procedure [carlberg2011efficient].

Error analysis of hyper-reduction approximation. In practice, the sampling matrix is not built; rather, the selected indices are maintained along with the corresponding rows of and . Regardless of which approach is used, the goal is to approximate a vector of size (either or ) using a predefined basis (either , or , ). Hence, we are faced with the following optimization problem: Find the optimal sampling matrix such that

(15) |

where and . In Theorem 1, we quantify and estimate the oblique projection error , which becomes the theoretical basis of many existing sampling methods.

###### Theorem 1

Let be a sampling matrix and be a basis matrix of full rank with . Let and let . Suppose is of full rank. Then,

(16) |

where is a QR factorization of , is the projection of onto the orthogonal complement of the column space of , and is the solution to satisfying Furthermore,

(17) |

###### Proof

Let . Observe that the quantities stated in Theorem 3.1 of [shin16] are , respectively, in the notation of the current paper. It then follows from Theorem 3.1 of [shin16] that . Since , we have

which gives (16). Furthermore, it follows from and the above equality that

Since is a projection operator, the proof is completed by observing .

###### Remark 1

###### Remark 2

Theorem 1 requires neither the column orthonormality of nor the number of samples being the same as the number of columns of . Hence, it can be viewed as a generalization of Lemma 3.2 of [chaturantabut2010nonlinear]. For a particular case of , since , the error bound of (17) becomes Equation 3.8 of [chaturantabut2010nonlinear].

### 3.1 Sampling Algorithms

One of the most popular algorithms for the construction of the sampling matrix is based on the error bound of (17). These algorithms aim at constructing that makes as small as possible. This may be viewed as E-optimality [pukelsheim2006optimal]

in the optimal design community, which maximizes the smallest nonzero eigenvalue of

. The commonly used algorithms in the PROM community, such as DEIM [chaturantabut2010nonlinear], Q-DEIM [drmac2016new], follow this principle.In the UQ community, a similar problem has been addressed in the context of least squares based stochastic collocation [shin16, shin2016near, narayan2017christoffel, hadigol2018least, guo2020constructing]. In particular, [shin16] proposed a method based on the error equality of (16). Since the true optimum is not available (as also mentioned in Remark 1), [shin16] developed the so-called S-optimality [OptimalDesign] that maximizes both the column orthogonality of and the determinant . Maximizing the column orthogonality of can increase the numerical stability, which has been an issue for E-optimality based sampling methods, such as DEIM. In this paper, we refer to this method as “S-OPT”.

The goal of this paper is to introduce the S-OPT sampling method to the PROM community as a hyper-reduction method, and compare its performance with those by the DEIM family algorithms.

In what follows, the following notations are used. Let

be an orthogonal matrix obtained from a QR factorization of

if the columns of are not orthonormal, and let otherwise. Given a set of indices and a vector , the th component of is denoted by .#### 3.1.1 Oversampled DEIM Algorithm

The oversampled DEIM algorithm used in this paper differs from the original DEIM algorithm [chaturantabut2010nonlinear] and more closely follows the sampling method from the LSPG paper (Algorithm 5 in [carlberg2011efficient]). Unlike the original DEIM, this allows for oversampling, i.e., selecting more samples than the number of POD modes, i.e., . We call the algorithm “oversampled DEIM” in this paper, while it could be fairly named a greedy or gappy-POD algorithm, among other options. This algorithm has also been modified for space-time LSPG in [choi2019space].

By closely following notation of [washabaugh], we present the pseudo-code for the oversampled DEIM in Algorithm 1. Starting with an appropriate basis as an input, as well as the desired number of samples , the algorithm selects indices based on a greedy method. The first selection is simply the index of the largest absolute value entry of . The algorithm then loops over the rest of the columns of , retaining all prior columns within . Each column of is approximated using a gappy POD reconstruction (line 8), and the index at which there is the largest error is selected and included in . This loop continues until the sampled set contains the desired number of unique indices. If necessary, the set of total indices (sampled and required neighboring nodes) is also populated according to the numerical scheme.

#### 3.1.2 S-OPT: A Quasi-Optimal Points Selection Algorithm

This subsection introduces the sampling method (S-OPT) proposed in [shin16]. The underlying principle of S-OPT is to find a sampling matrix that makes as small as possible. According to Theorem 1, should satisfy

(19) |

If the sampling matrix is constructed to make preserve the same column orthogonality as , the right-hand side of (19) becomes zero, leading to . However, this will only be the case when is the identity matrix with and in general, will be nonzero. With the goal of having a small , the S-OPT [shin16] seeks to (a) maximize the column orthogonality of so that the right-hand side of (19) is minimized, and (b) maximize the determinant of so that the nonzero solution of (19) is small. A quantity denoted by (the precise definition is given in (20)) that measures the mutual column orthogonality and the determinant was developed in [shin16]. The S-OPT aims at finding a sampling matrix that maximizes the quantity .

Since the true optimality of (18) is not achievable in practice, the selection method based on the S-optimality (20) was termed “quasi-optimal” in [shin16] and “near-optimal/S-optimality” in [shin2016near]. Here, we simply refer to it as “S-OPT.”

Let be an matrix and be its th column. Assuming for all , is defined to be

(20) |

It was shown in [shin16] that if and only if the columns of are mutually orthonormal. Hence, maximizing enforces both mutual column orthogonality and a larger determinant.

In the context of hyper-reduction, the S-OPT seeks the optimal sampling matrix that maximizes , i.e.,

Solving the above optimization problem requires one to compute multiple evaluations of . The evaluation of , however, can be expensive as it requires the computation of determinants. However, [shin16] presented an efficient way of evaluating without computing determinants, based on a greedy algorithm and the Sherman–Morrison formula. The pseudo-code for the S-OPT index selection is presented in Algorithm 2 for the reader’s convenience. We refer to the original paper [shin16] for full algorithmic details.

## 4 Main results

In this section we show four example problems using both sampling algorithms. They are a 1D Burgers equation problem, an aerodynamic laminar flow around airfoil, and two Lagrangian hydrodynamic problems.

The 1D Burgers equation compares the use of the sampling methods for reduced order models built with linear or nonlinear subspaces; that is, LS-ROM and NM-ROM. Meanwhile, the laminar airfoil problem builds a parametric LS-ROM using shape parameters. The airfoil problem also explores how truncating the POD basis may affect the results when using either sampling algorithm. The final two problems are the 2D Gresho vortex and the 3D Sedov blast Lagrangian hydrodynamics problems. In these examples, we show that the sampling methods result in different error bounds, and we further examine the performance of the two sampling algorithms for varying the number of sampled indices.

Furthermore, the examples in this paper use different combinations of projection and hyper-reduction methods. Before presenting the problem descriptions and results, we present the specific methods used for each of the four examples here, for comparison and completeness purposes.

While the examples in Sections 4.1 and 4.2 both apply LSPG projection, the example in Section 4.1 applies gappy POD hyper-reduction (or the GNAT procedure), and the airfoil example in Section 4.2 applies collocation hyper-reduction for the residual term. Finally, the two examples in Section 4.3 use the DEIM approach; i.e., a Galerkin projection with a gappy POD approximation of the nonlinear term.

Applying LSPG with gappy POD hyper-reduction, as done in Section 4.1, gives

(21) |

which aims at minimizing the approximation (14) of .

Applying LSPG with collocation hyper-reduction, as done in Section 4.2, gives

(22) |

In Equations (21) and (22), the fully discretized residual is used as an example, and a residual defined using a different discretization can be used in a similar fashion.

And lastly, applying Galerkin projection with gappy POD hyper-reduction of the nonlinear term, as done in Section 4.3, gives

(23) |

In the case of Galerkin projection, remember that the projection matrix is taken to be the solution basis . The front matter can be pre-computed once, and the nonlinear term can be evaluated at only the selected indices. Equations (21, 22, 23) no longer scale with the full dimension N, but rather with the number of selected indices and the reduced dimension k.

Finally, it is important to note that computing the residual at the selected nodes may depend on the solution at neighboring nodes. For example, for a finite volume discretization involving inviscid fluid dynamics (Euler) equations (e.g., Section 4.2) as well as the finite element method (e.g., Section 4.3), the solution at the neighboring nodes is required to compute the flux, and then the residual at the selected node can be computed using the flux. For this reason it is necessary to maintain the indices of all required neighboring nodes, in addition to the selected nodes.

### 4.1 1D Burgers Equation

As the first example, we consider the 1D inviscid Burgers equation

(24) |

with the periodic boundary condition for and the initial condition . We decompose the spatial domain into a uniform mesh with mesh size , which consists of grid points , at which the discrete solution is defined by . The semi-discretization of the Burgers equation is given by:

(25) |

where is the coefficient vector, denotes the entry-wise product, and

(26) |

To obtain a fully discrete scheme, we similarly decompose the temporal domain into subintervals to form a uniform mesh with mesh size , and use a backward difference to numerically approximate the temporal derivative. Figure 1 shows the initial condition and the final-time solution for 1D Burgers equation. It can be seen that a shock is eventually developed at .

In [kim2022fast], a linear-subspace reduced order model (LS-ROM) and a nonlinear-manifold reduced order model (NM-ROM) are developed for (25), where DEIM is used for hyper-reduction in both approaches. In what follows, we compare the numerical results of reduced order models using the sampling algorithms in Section 3.1, i.e. the oversampled DEIM and S-OPT. The maximum-in-time relative error for the ROM solution is measured against the corresponding FOM solution , which is defined as:

(27) |

All the 1D Burgers equation simulations in this
subsection use Lassen in Livermore Computing Center^{1}^{1}1High
performance computing at LLNL, https://hpc.llnl.gov/hardware/platforms/lassen,
on Intel Power9 CPUs with 256 GB memory and NVIDIA V100 GPUs,
peak TFLOPS of 23,047.20, and peak single
CPU memory bandwidth of 170 GB/s.

In Figure 2, we depict the maximum-in-time relative discrete error of the solution against the number of sampling indices for both sampling algorithms. The number of sampling indices takes value within , where in LS-ROM and in NM-ROM. In LS-ROM, despite only a minor difference in the error, the DEIM algorithm yields more oscillation in the error with respect to the number of sampling indices. In NM-ROM, the oscillation is even more severe in DEIM. In other words, the S-OPT algorithm yields more stable results in the Burgers equation with increasing number of sampling indices. Figure 3 and Figure 4 illustrate the selected nodes by the sampling algorithms in LS-ROM and NM-ROM respectively. In both cases, both S-OPT and DEIM select nodes around the expended shock wave, where the nonlinearity occurs, in priority. As the oversampling number increases, S-OPT tends to sample nodes in a more widespread manner, while DEIM still tends to densely select nodes close to the shock, as shown in Figure 1.

### 4.2 2D Laminar Airfoil

The next example considers a 2D laminar airfoil using the steady compressible Navier-Stokes equations with a low Reynolds number (i.e., no turbulence). For the steady problem, we define the following residual definition, which is used in the construction of a LS-ROM:

(28) |

where is the convective flux, and is the viscous flux. For this 2D case, the conservative variables are given by with the velocity vector defined as , denoting the fluid density, and E denoting the total energy per unit mass. Furthermore, and are defined as:

(29) |

where is the pressure field, is the heat flux, and is the viscous stress tensor defined as:

(30) |

with as viscosity. The airfoil surface is set as an adiabatic wall (zero heat flux) and the far field is set to the free stream conditions.

The FOM is solved using a pseudo-time stepping method, specifically an implicit, local time stepping method, to march the solution forward from the initial condition (freestream fluid state) to the steady state solution. The local time stepping technique allows for quicker convergence by allowing the time step to vary between elements. This approach is typical for steady aerodynamics. We use converged steady solutions as simulation data to construct the reduced basis.

The full order model domain contains 8741 mesh points with a NACA0012 airfoil surface in the center. We use shape parameters for a parameterized ROM study. In this case, the airfoil shape is slightly altered using three Hicks-Henne bumps [hhbumps]

on the upper surface and three on the bottom surface. Hicks-Henne bumps are smooth bump formulations, and each shape parameter defines the amplitude of the Hicks-Henne bump. The six-dimensional parameter space is sampled using 73 Latin hypercube samples, resulting in 73 total snapshots. Both the FOM and ROM equations are solved using the open-source CFD code SU2

[su2], aided by the libROM code [librom] from Lawrence Livermore National Lab for the snapshot collection and POD computation.The laminar airfoil problem is a good academic problem for quick testing, because it does not require a large amount of computational resources. The 73 snapshots are collected on the Sherlock cluster operated by the Stanford Research Computing Center. Then the reduced order model simulations are run serially on a macOS laptop with 2.4 GHz Intel Core i9 processor and 16 GB memory.

The POD basis has a dimension , but is then truncated to 20, 10, and 5 POD modes. Figure 5 shows the first four modes, or singular vectors, for the density field from the POD computation. The variations of the flow field due to the airfoil shape differences are captured by the POD modes, so the flow solution for any set of shape parameters can be approximated using a combination of these modes.

Both sampling methods are tested and the results are shown in Figure 6. In all cases, the goal of the ROM is to predict the baseline NACA0012 airfoil shape, which is not included in the 73 snapshots.

It is also important to note that for finite-volume CFD models there are multiple equations at each node, so the input basis for the sampling algorithms needs to be condensed so that the size of the first dimension and the number of nodes are the same. We do this by taking the norm of the basis values at each node, resulting in one basis value per node. This is in contrast to Section 4.3, where we also show a different scenario of building a separate basis for a different field, and therefore, merging the sampling indices from different fields.

The measurements of error shown in Figure 6 are defined as follows for the error and the maximum relative error:

(31) |

where the truth values are obtained from the full order model solution of the prediction case (the NACA0012 baseline shape) using SU2.

In Figure 6 we see that as the number of sampled indices decreases, the S-OPT algorithm outperforms the DEIM algorithm as shown by the error measures. At the most extreme level of hyper-reduction, using only 437 samples, the ROM built with S-OPT algorithm and 5 POD modes performs better than the ROM built with DEIM and 20 POD modes.

However, there is a cost associated with the performance improvement. The ROMs built with the S-OPT algorithm generally take longer to converge. Figure 7 shows one reason the S-OPT algorithm ROMs take longer. Since this is a viscous problem, two levels of node-neighbors are required to compute the residual at each selected node. The S-OPT algorithm tends to choose nodes that are more spread out in the domain, while the DEIM algorithm chooses nodes near the airfoil surface where most of the changes from solution to solution take place. Since the set of neighboring nodes is greater for the S-OPT ROM, the total dimension for the hyper-reduced ROM is larger and each iteration takes longer. It is important to note that this dimension still scales with the reduced dimension, k.

### 4.3 Lagrangian hydrodynamics

In the next two examples, we consider advection-dominated problems arising in compressible gas dynamics. The Euler equation is used to model the high-speed flow and shock wave propagation in a complex multimaterial setting, and numerically solved in a moving Lagrangian frame [harlow1971fluid], assuming no external body force is exerted:

(32) | ||||||

Here, is the material derivative, denotes the density of the fluid, and denote the position and the velocity of the particles in a deformable medium in the Eulerian coordinates, denotes the deformation stress tensor, and denotes the internal energy per unit mass. In gas dynamics, the stress tensor is isotropic, and we write , where denotes the thermodynamic pressure, and denotes the artificial viscosity stress. The thermodynamic pressure is described by the equation of state, and can be expressed as a function of the density and the internal energy. With the assumption of polytropic ideal gas with an adiabatic index , which yields the equation of state

(33) |

The system is prescribed with an initial condition and a boundary condition , where is the outward normal unit vector on the domain boundary.

Using a high-order curvilinear finite element method (FEM) for Lagrangian hydrodynamics [dobrev2012high] for semi-discretization of the Euler equation results in the differential system:

(34) | ||||||

where denotes the FEM coefficient vector functions for velocity , internal energy density and position . In order to obtain a fully discretized system of equations, we apply RK2-average scheme as the time integrator, a modification of the midpoint Runge–Kutta second-order scheme. The RK2-average scheme is written as

(35) | ||||||

(36) | ||||||

(37) |

where the state is used to compute the updates

(38) |

in the first stage. Similarly, is used to compute the updates

(39) |

with in the second stage. Since explicit Runge-Kutta methods are used, we need to control the time step size in order to maintain the stability of the fully discrete schemes. We follow the automatic time step control algorithm described in Section 7.3 of [dobrev2012high].

We use a linear-subspace reduced order model technique for (34) as developed in [copeland2022reduced]. The solution nonlinear subspace (SNS) procedure in [choi2020sns] is used to build the nonlinear term bases and on the right-hand-side of the momentum conservation equation and energy conservation equation separately, and the sampling indices for hyper-reduction of each of nonlinear terms are used to construct the sampling matrices and , respectively. In the following benchmark experiments, we compare the numerical results of reduced order model using the sampling algorithms discussed in Section 3.1, i.e., the oversampled DEIM and S-OPT. With the automatic time step control algorithm, it is very likely that the temporal discretization used in the hyper-reduced system is different from the full order model even with the same problem setting. To this end, we denote by the number of time steps in the fully discrete hyper-reduced system, to differentiate it from the notation for the full order model. The relative error for each ROM field is measured against the corresponding FOM solution at the final time , which is defined as:

(40) |