With the advent of relatively inexpensive and powerful sensors and enhanced data storage capabilities, data sets from various processes are ubiquitous in all fields of engineering. However, processing the data and extracting meaningful information is the bottleneck of the entire data handling pipeline. Despite the huge progress in high performance computing, computational power remains a limiting factor with respect to the amount of information that is generated and processed on a daily basis. To mitigate the load on computational analysis, approaches to extract useful information from the entire data pool are sought. The idea is that not every piece of information is needed to analyse a process. Beneath all high-dimensional data lies considerably lower dimensional patterns which govern most of the dynamics. Modal decomposition techniques have been developed to identify such dominant spatio-temporal modes which dominate the evolution of the system[taira2017modal, gueniat2015dynamic, puzyrev2019pyrom]. This introduces the concept of reduced order models (ROMs) where the entire system can be well represented by relatively compact surrogates, opening ways for computationally inexpensive analysis of the prominent dynamics [quarteroni2015reduced, rowley2017model].
Among the present methodologies that we have at our disposal for ROMs, dynamic mode decomposition (DMD) has rapidly gained recognition during the last few years [schmid2010dynamic, rowley2009spectral, tissot2014model]. DMD shares some roots with the Koopman theory and can be viewed as a data-driven approximation of the Koopman operator spectrum. It has some interesting advantages, especially if the underlying dynamics is quasi-periodic and well characterized by fast decaying singular values. For one, it is completely data driven and does not require any prior knowledge of the dynamics of the concerned system. This makes it simple to use with quite complicated dynamical systems. It can be applied to both experimental as well as numerical data with success. Furthermore, it is theoretically sound and different analyses can be applied to DMD as it is based on the Koopman operator.
Since its introduction in fluid dynamics community [schmid2010dynamic, rowley2009spectral], DMD has been one of the most widely used reduced order modeling strategies in many other disciplines as well [erichson2019compressed, natsume2020application]. With increasing popularity, various strategies have been developed to overcome many of the early DMD shortcomings. Studies to improve its performance via pre-processing and post-processing have been conducted like ensemble-averaging methods [sarmast2014mutual], mean subtraction [hirsh2020centering], de-biasing algorithms [hemati2017biasing], sparsity inducing approaches [jovanovic2014sparsity], online update techniques [hemati2014dynamic, zhang2019online], and improved least square methods [brunton2014compressive, kutz2016multiresolution]. Moreover, linear and nonlinear aspects of DMD have been discussed in [alekseev2016linear]williams2015data]. In addition, efforts have been made to make the DMD algorithm robust against noisy data [schmid2010dynamic, scherl2020robust].
Another important aspect of reduced order modeling by modal decomposition techniques such as DMD involves effective selection of the modes. In proper orthogonal decomposition (POD) based model reduction approaches [sirovich1987turbulence, amsallem2008interpolation, holmes2012turbulence], modes are automatically sorted based on the eigenvalues of the auto-correlation matrix of snapshot data. Those eigenvalues intrinsically represent the amount of total system’s energy captured by the individual modes. In contrast to the POD based approaches, DMD does not inherently rank the underlying modes and the selection criteria is not unique nor straightforward [chen2012variants, jovanovic2014sparsity, sayadi2015parametrized]. Initial implementations [rowley2009spectral, wan2015dynamic] utilized the norm of modes as a ranking and sorting criteria. sayadi2014reduced
used the projection of the first snapshot onto the modes to classify them. However, this method does not perform well for fast-decaying modes.jovanovic2014sparsity provided the computation of optimal DMD amplitudes based on the minimization of the difference between snapshot data and DMD reconstruction over the total time window. Similarly, an optimized DMD algorithm was proposed to compute arbitrary number of DMD modes to improve DMD accuracy when a few modes are sought [chen2012variants]. Ideas from data assimilation, including variational and sequential approaches, have been also adopted to build physically-sound DMD-based ROMs [tissot2014model, tissot20154d, nonomura2018dynamic, nonomura2019extended].
The straightforward implementation approach proposed by rowley2010reduced
involves the computation of a companion matrix that helps to construct, in a least squares sense, the final data vector as a linear combination of all previous data vectors. However,schmid2010dynamic showed that this version may be ill-conditioned in practice and an alternative algorithm based on the singular value decomposition (SVD) of the snapshot data matrix is recommended. Nonetheless, the brute-force computation of the SVD for high dimensional systems becomes computationally prohibitive. In this study, we explore the applicability sketching methods for the efficient computation of DMD basis and spectrum. These methods rely upon informed projections and aim at constructing a smaller (memory-wise) matrix, called the “sketch” while preserving important properties of the original data [mahoney2011randomized, woodruff2014sketching]. The sketch may represent any combination of row space, column space or the space generated by the intersection of rows and columns (core space). There have been several studies that aimed to utilize random projections to derive low-rank randomized DMD [erichson2016randomized, bistrian2017randomized, bistrian2018efficiency, erichson2019randomized, alla2019randomized]. The major contribution of this study is to build on and extend these previous efforts. In particular, we explore the applicability of sketching algorithms in the low-rank DMD computations for spherical shallow water equations data, representing a relatively simple, but very representative model for geophysical flows. The spherical SWEs formulation is often considered a first step in developing general circulation models in large scales. We also explore a sketching algorithm that aims at capturing the range and corange of the snapshot data matrix with free parameters that can be tuned based on the given memory constraints. In addition, we investigate the effect of different sorting criteria onto the accuracy of the low-rank DMD reconstruction. We further show that sketching-based DMD can directly yield a near-target-rank DMD and mitigate the need for special mode-selection criterion.
The rest of the paper is structured as follows. In Section 2, we present the underlying governing equations of shallow water system on spherical coordinates and the corresponding initial and boundary conditions. Numerical schemes used to solve the governing equations for data generation as well metrics for evaluating results are summarized in Section 3. We then detail the dynamic mode decomposition formulations and the considered sketching approaches in Section 4. Results are provided with corresponding discussions in Section 6, while concluding remarks are drawn in Section 7.
2 Mathematical Model
The shallow water equations (SWEs) are the mass and momentum balance equations that constitute a specialized case of the Navier-Stokes equations (NSEs). The SWEs describe the flow field of a free fluid surface in cases where the horizontal length scale dominates over the vertical length scale; implying that the horizontal velocity field is approximately invariant in the vertical direction. Thus, the variation of the vertical component vanishes in the SWEs. Mathematically, SWEs are obtained by the integrated average of the NSEs across the vertical length and substituting the pressure term with the depth of the fluid column through the hydrostatic approximation. The Coriolis term is included to account for the forces due to Earth’s rotation in cases of geophysical flows. Interestingly, in many atmospheric and oceanic flows of practical interest, such assumptions hold and hence, they can be adequately modeled by these equations [williamson1992standard, thuburn2000numerical]. Therefore, SWEs form a good test bed for reduced order modeling algorithms in the context of geophysical flows. The SWEs for the atmosphere on Earth, using spherical coordinates, can be written as follows:
In the above system of equations, where denotes the Earth’s radius ( m) and is the height of the bottom topography; is the acceleration due to gravity ( m/s); is the depth of the water surface; and are the respective velocities in the longitudinal and latitudinal directions, while and are the longitudes and latitudes, respectively.
We consider a spherical domain given by longitudes and latitudes for the purpose of our study. For simplicity, the bottom surface is taken to be flat, i.e, and . Initial condition for is given by:
In order to trigger shear layer instability, we add random disturbance to the initial height as follows:
is a uniformly distributed random number between 0 and 1,is the spatial resolution in the latitudinal direction (one degree in this study), is the number of latitudinal grid points (i.e., points), and is the Coriolis parameter with rad/s being the Earth’s rotation rate.
For initial velocities, geostrophic wind conditions are assumed to prevail and accordingly the initial and are given as:
where is a small constant introduced to prevent the value the denominator from becoming exactly at the equator (). Here, the first order spatial derivative of is numerically computed from the initial condition given by Eq. 4 using central difference scheme. We use periodic boundary conditions in the longitudinal () direction and slip boundary conditions in the latitudinal () direction as given below for the boundary grid points:
where , and (corresponding to the domain boundaries).
3 Numerical Methods
We use the Lax-Wendroff method for solving the set of partial differential equations of the SWEs on the sphere. It is a second-order accurate scheme in both time and space. Consider a generic variable, which has a conservation equation defined as:
Now, the first step in the Lax-Wendroff scheme is to estimate the value ofat the midpoints in space and time as:
where the superscript denotes the time index corresponding to discrete time while the subscripts define the spatial index with respect to and directions, respectively. is the time step size while and are the corresponding spatial resolutions of the spherical grid. The value of at next time step is given in the second step as follows:
As a last step, before we move to the next iteration, the primitive quantities are updated by adding the source terms as follows:
Although more sophisticated numerical methods that address inherent numerical difficulties are available for solving spherical SWE [neta1997analysis], in this work, without considering poles (i.e., ), we simply deploy the described Lax-Wendroff scheme to numerically solve the SWEs on a sphere to obtain the solution of the surface flow dynamics [connolly2017]. This method is applied to each of the three equations of spherical SWEs simultaneously to obtain the height () as well as the velocity ( and ) fields. For the forward simulations, we use spatial resolution and and a time step of sec. The total time duration for the simulation is set to days and the snapshots of the field data are stored at every minutes. Moreover, we truncate the snapshots corresponding to the first days from our input data matrix to let the system pass the initial transition period. Thus, we get data set in the form of snapshots arrays of dimensions (corresponding to the spatial resolution) for each flow field, where each snapshot corresponds to a particular time. In the present study, we consider the vorticity field data, which is defined as the curl of the velocity vector, i.e.,
Eq. 26 in the spherical coordinate can be simplified for the 2D case as:
Eq. 27 is discretized using central difference scheme at the central grid points. For the boundary grid points, we similarly use periodic boundary condition in the longitudinal direction and slip condition in the latitudinal direction.
4 Dynamic Mode Decomposition
Dynamic mode decomposition (DMD) is one of the most popular methods for data-based reduced order modeling. The core DMD framework by itself is completely data-driven and does not require prior knowledge of the model dynamics. Let (where ) be the state of a system that evolves in time through some specific dynamics as , where
can be a nonlinear function. The objective of DMD is to identify the leading DMD eigenvalues and the corresponding eigenvectors of the best fit operatorwhich would evolve the system linearly in time as:
By estimating the operator , spatially coherent DMD modes can be computed, each of which is associated with a growth/decay rate and a frequency that define its time dynamics. Dimensionality reduction is obtained by choosing a few significant modes while rejecting the others. DMD was formally introduced to the fluid community by schmid2011applications. In problems of fluid dynamics, the system dimension can be very large to account for the relevant scales in the flow field, and frameworks such as DMD are of high interest. In our case, we look at SWEs on the sphere as our governing dynamical system (described in Section 2).
In practice, one would have data of the system state collected at various time instants . So, here
is the number of degrees of freedom in the system whileis the number of discrete time steps for which data is available (i.e., number of collected snapshots). This data can be either actual field data from physical experiments or even synthetic data generated from high fidelity numerical computations. In the present study, data is obtained from the solution of the governing equations as described in Section 3, which constitute the full order model (FOM) of the system of interest. We focus on the vorticity field on the spherical domain . Each vorticity snapshot matrix of dimension (i.e., ), corresponding to a particular time , is rearranged in a column vector . So, we have a dynamical system where the system state denotes the vorticity field on a spherical domain at a particular time . Thus, the full data matrix ( for our case) is formed as:
The primary objective of DMD is to extract the eigenvectors and eigenvalues of the matrix such that:
where Eq. 30 is the time discretized form of Eq. 28. In essence, both are similar in that they denote the evolution of the system states as a linear operation. Next, the full data set is split into two matrices as defined below:
Thus, Eq. 30 ca be rewritten in matrix format as follows:
The operator , representing the best linear fit, can be computed through the least squares optimization of:
where is the Frobenius norm. In the present study, we investigate different ways to approximate the eigenvalues and eigenvectors to form DMD modes. Ideally, the full rank matrix would have dimension, which would require high computational power to operate when is very large (which is the case in practical fluid dynamics problems). So, the motivation here is to seek ways to replace it with lower rank approximations. This will be described in detail in Section 4.1
Once the DMD modes are computed, we can form a matrix where each column represents a DMD mode. DMD inherently does not provide robust ranking of the modes unlike other algorithms such as proper orthogonal decomposition (POD). So, an additional criterion has to be incorporated to appropriately rank these modes. The next step is to truncate the DMD mode matrix to where () is the number of retained modes. The information of the time dynamics in each DMD basis is inferred from the corresponding eigenvalues . Here, are the discrete-time eigenvalues, while the continuous-time eigenvalues can be evaluated as follows:
where is the time interval between two consecutive snapshots and constitutes the vector of the continuous-time eigenvalues. Next, we reconstruct the high dimensional dynamics from the lower order subspace as:
where is the vector of initial amplitudes of the DMD modes given as:
where is the pseudoinverse of . In Eq. 35, the superscript on is the index for discrete time (i.e., ), while the superscripts (without parenthesis) on and refer to powers. Finally, for reconstruction, Eq. 35 can be written in terms of the continuous-time eigenvalues by using the relation in Eq. 34 as follows:
In Eq. 37, the real part of the continuous-time eigenvalues is responsible for the growth/decay rate of the mode. The mode grows for a positive real part and conversely decays over time for a negative real part. On the other hand, the imaginary part determines the oscillating frequency of the mode.
4.1 Deterministic DMD
In the standard form of the DMD, the linear operator is projected onto a lower -dimensional subspace to replace the full dimensional matrix. The projection of is referred to as . The first step is to take the singular value decomposition (SVD) of the matrix as (where denotes the complex conjugate transpose of matrix ). Compact SVD can be computed such that , where and denote the matrix of the first columns of and respectively, while is the first dimensional sub-block of , with being the rank of .
The projection of onto the -dimensional space is taken as:
Using Eq. 38, the optimization problem in Eq. 33 becomes , which gives . The eigenvalues of represent the Ritz values (approximate eigenvalues) of . These eigenvalues (also called DMD eigenvalues) are computed as . It should be noted that the eigen decomposition of results in complex eigenvalues and eigenvectors. Here, each columns of are the eigenvectors while is the diagonal matrix containing the corresponding eigenvalues. The DMD modes can be evaluated as follows:
while the so-called “exact” DMD modes can be written as [kutz2016dynamic, tu2014dynamic]:
A suitable reduced order approximation can be obtained by retaining columns of . From here, the continuous-time eigenvalues can be easily calculated from Eq. 34 and the estimated higher order dynamics from the lower order approximation can be found from either Eq. 35 or Eq. 37. Algorithm 1 shows the standard SVD-based DMD implementation.
4.2 Rank truncation and mode selection
Mode selection forms a crucial part of ROMs since the entire concept of ROMs is premised upon the concept of predicting the system dynamics based on the projection of the full order dynamics onto a comparatively small number of bases (called modes). Although the DMD results in modes each of which is associated with a unique oscillation frequency and growth/decay rate, it does not possess an inherent sorting criteria. Therefore, user-specific rules have to be set to select the most influential modes, and here we discuss some of the approaches to do so.
4.2.1 Early truncation
One can simply utilize the fact that the DMD computation involves a singular value decomposition, resulting in hierarchically arranged bases for the snapshot data matrix . Therefore, an early truncation can be performed by using an -rank approximation of by setting in Section 4.1. This early truncation eventually yields exactly DMD modes, with no need for further sorting or selection mechanisms. Nonetheless, it has been shown in previous studies (e.g., [ahmed2020sampling]) that this approach might produce a low-quality reconstruction and prediction, especially for small values of . So, we take a look at some computationally simple sorting criteria in the following subsections.
4.2.2 Sorting criterion 1
One of the early and straightforward sorting criteria is to order the modes based on their initial amplitudes as given by Eq. 36. In particular, we can define the importance index as:
where denotes the magnitude. However, this mechanism does not take into account that for specific systems, there might be modes with large initial amplitudes but rapidly decaying and/or modes with small initial amplitudes but rapidly growing.
4.2.3 Sorting criterion 2
We can define the modal contribution to the FOM dynamics by considering the quickly decaying modes with high initial amplitude as well as those with low initial amplitude, but rapid growth rate, as both cases are significant to be taken into account. Thus, the importance index can be written as:
where stands for the associated growth or decay rate of the particular DMD mode. This criterion depends on the initial amplitude and the growth rate of the modes. It was shown that this criterion provides improved results for similar SWE systems on Cartesian coordinates [ahmed2020sampling].
4.2.4 Sorting criterion 3
kou2017improved provide a criterion for evaluating the contribution of DMD modes and sorting them by estimating the influence of each DMD mode over the entire sampling window based on their temporal evolution. The influence of the modes can be evaluated as:
where are the discrete-time eigenvalues. The subscript in refers to the mode, while the superscript is the power and is the number of collected snapshots.
4.2.5 Sorting criterion 4
We also introduce a scheme for sorting the DMD modes based on their initial amplitudes and their growth/decay rates, similar to the one proposed in [kou2017improved]. The general idea is to obtain an estimation of each mode’s contribution from the projection of the FOM onto that DMD mode (or basis) over the entire sampling window. In Eq. 37, can be thought of as the projection of the FOM onto the DMD mode at time . Taking an integral with respect to time of the absolute of this quantity over the entire sampling window () gives an average contribution of the modes for the given sampling window as follows:
where is the total sampling time and is the number of snapshots available. Then, Eq. 44 can be approximated as follows:
It is interesting to note here that the performance of a mode selection criterion is dependent on the dynamics of the system it is applied to as well. All DMD modes can be assigned a value based on their contribution to the FOM and sorted in decreasing order of . Then, the first modes are selected based on their importance index . We also highlight that physically conserving sorting and selection criteria might be tailored for the specific problem in hand. For example, in the case of SWEs, conservation of integral invariants (e.g., total mass, total energy and potential enstrophy) can be enforced as hard constraints for the optimization and selection of the most important modes [bistrian2017method]. In other words, if we know something about the system being analyzed (e.g., constitutive laws), we can enforce the sorting criteria to improve the DMD results.
5 Sketchy Dynamic Mode Decomposition
As we discussed before, one of the main bottlenecks of the computational pipeline of the stable DMD implementation is the size of the given data matrix. In particular, for high-dimensional systems, the size of the input data matrix can be very large and loading to perform SVD becomes computationally expensive. Streaming algorithms can be adopted to mitigate this burden by passing the snapshots one-by-one and performing SVD on increments of the data [hemati2014dynamic, pendergrass2016streaming]. Another approach, that we consider in the present study, is to utilize sketching as a tool to reduce the size of the processed data matrix. In particular, we seek a low-order embedding of the original data matrix represented by a smaller matrix, called the sketch that captures the main information of the original one. The success of the sketching method relies on the assumption that big data matrices are often low-rank [udell2019big], with an exponential decay of the underlying singular values. Indeed, this is the main reason why model order reduction has witnessed substantial success in many applications.
Sketching-based algorithms exploit informed projections to transform the original data matrix to a more compact one. The expensive computations are thus performed onto the latter, after which a post-processing takes place to map (and maybe correct) the outputs to the original space. Randomized projections have been especially effective for this purpose. They can efficiently and accurately extract the spatiotemporal coherent structures from high-dimensional data, with high probability
with high probability. We consider three variants of sketching-based DMD, to reduce the computational costs of different portions of the deterministic DMD algorithm.
5.1 Sketching the range of
bistrian2017randomized introduced a randomized DMD framework that aims at mitigating the cost of the SVD of the data matrix . In particular, a near-optimal basis with a target rank is defined using random projections such that it captures the range of and provides a smaller sketch matrix . This results in increased efficiency in terms of computational memory and/or time for later deterministic steps of model reduction.
In randomized DMD, a near-optimal orthonormal basis is estimated from the full order dense data matrix such that . In order to do this, a randomized projection of the input data matrix is first performed as follows:
whereis a summary of the action of . We highlight that the computational cost of this matrix-matrix multiplication can be reduced by exploiting structured randomized matrices. Here, represents the target rank where is defined as the oversampling factor that helps to obtain an improved basis. bistrian2017randomized suggested using an oversampling factor of , which we adopt in the present study.
Next, a QR decomposition ofis performed where and . Indeed, we only need for our computations, and can safely discard . The full data is then projected onto the basis to obtain a lower dimension matrix as follows:
where is the conjugate transpose of . Thus, SVD can be performed on instead of as . The singular values and vectors of can be recovered as follows:
The algorithmic steps for the randomized DMD based on sketching the range of are summarized in Algorithm 2
5.2 Sketching the range of
In Section 5.1, the deterministic SVD of is bypassed by a randomized projection of to obtain a sketch that captures its range and efficiently applying SVD onto this sketch. After the SVD of is recovered, the remaining steps are performed in a high-dimensional space. Alternatively, erichson2019randomized developed a randomized DMD algorithm that is based on sketching the range of and performing all steps in the DMD procedure in a low-order space, while the DMD of the original system is recovered at the very end. Therefore, a random projection is defined to capture the range of as follows:
where is a random matrix and is the summary of the action of . Next, a QR decomposition of is performed where and , where is considered a near-optimal orthonormal basis such that . The full data is projected onto to obtain a lower dimension matrix as follows:
can be further split into two matrices and by selecting the first and last columns of , respectively.
The whole DMD algorithm can be applied onto the low-order sketches to yield the DMD modes of the low-order matrix , and the DMD modes are recovered at the end as follows:
The algorithmic steps for the randomized DMD based on sketching the range of are summarized in Algorithm 3
5.3 Sketching the range and corange of
In this study, we further develop an efficient DMD implementation based on a sketching approach that captures the range and corange of the data matrix , resulting in an even smaller sketch than the range sketches [yurtsever2017sketchy, tropp2019streaming]. We first describe the sketching operators parametrized by a “range” parameter and a “core” parameter that satisfy , where the parameter determines the maximum rank of an approximation.
Now, we independently draw and fix four randomized linear reduction maps (often called test matrices) as follows:
Then, we define three matrices comprising our sketch as follows:
where and capture the range and corange of , respectively, while is called the core sketch and describes how acts between the spaces captured by sketches and [rajaram2020randomized]. Now, near-optimal bases for the range and corange of are computed by the thin QR factorization of and as follows:
where we can again discard the triangular matrices and . The third sketch is used to compute the core approximation of as follows:
which can be implemented efficiency by solving a family of least-squares problems. The original data matrix is now related to the core sketch by the following relation:
Note that is a square matrix with small size , compared to and in Section 5.1-5.2. Thus, we use the core sketch to compute the SVD of efficiently and the singular values and vectors of can be recovered as follows:
The algorithmic steps for the randomized DMD based on sketching the range of are summarized in Algorithm 4