1 Introduction
Recently, researchers have shown sustained interest in using machine learning (ML) methods to build non-intrusive reduced-order models (ROMs) for systems governed by advection-dominated partial differential equations (PDEs). This is because solving PDE forward models for such systems may require very fine spatiotemporal numerical discretizations, which cause a significant bottleneck in design and forecast tasks
[44]. The prospect of bypassing traditional numerical methods and building surrogates from data alone [26, 37, 36, 41] is attractive for multiple applications ranging from engineering design [4, 38] and control [35, 22] to climate modeling [8, 9, 28], because data-driven ROMs allow for rapid predictions of nonlinear dynamics unencumbered by the stability-based limitations of numerical discretizations. In almost all ROM applications, forecasts must be conditioned on time and several control parameters, such as the initial conditions or the physical properties of the governing laws. In addition, since these models eschew the use of PDEs, it is necessary to associate some notion of uncertainty quantification with the time evolution of these surrogate dynamical systems, to ensure that the loss of interpretability and reliability incurred by bypassing equations is offset by a feedback process from the ML.

Neural networks have been used for ROMs for decades. One of the earliest examples [14]
used a simple fully connected network for forecasting meteorological information. More recently, researchers have incorporated a single-layered feedforward neural network into a nonlinear dynamical system and built a surrogate model for a high-dimensional aerodynamics problem
[27]; radial basis function networks have been used to make forecasts for a nonlinear unsteady aerodynamics task
[51, 21]; and a simple fully connected network has been used for learning the dynamics of an advection-dominated system [40, 13].

Neural networks are commonly used for two tasks in typical ROM construction: compression and time evolution. For the former, they may be used as a nonlinear equivalent of methods based on the proper orthogonal decomposition (POD) or principal component analysis (PCA), which find a linear affine subspace for the full system. The identification of this reduced basis to ensure a compressed representation that is minimally
lossy is a core component of most ROM development strategies (some examples include [39, 19, 16]). While POD-based methods currently represent the most popular technique for reduced-basis (or latent-space) construction, data generated from PDE simulations can often be interpreted as images on a square grid; therefore, convolutional neural networks have also been applied
[43, 17, 12, 30] for building computationally effective ROMs.

Once this basis (or latent representation) is identified, we need a cost-effective strategy for accurate nonlinear dynamical system evolution that reproduces the full-order spatiotemporal complexity of the problem in the reduced basis. For example, linear reduced-basis construction allows for the use of intrusive methods (which project the governing equations onto the reduced basis), as seen in [15, 34], which use a Galerkin projection, or [6, 49, 11], which use the Petrov–Galerkin method (see [5]
for a comparison of these two methods). Such extensions are not straightforward for autoencoder-based latent space constructions, since projecting onto a basis space is infeasible. We note, though, that the use of convolutional autoencoder architectures has been demonstrated for the Petrov–Galerkin method, where there is no requirement to project the governing equations onto a trial space
[24]. As mentioned previously, neural networks are also commonly used for the temporal evolution of these latent space representations. Mainstream ML has generated a large body of time-series forecasting literature that lends itself readily to latent-space evolution of dynamical systems. Recently, long short-term memory (LSTM) architectures have become popular for the non-intrusive characterization of dynamical systems
[45, 1, 32, 33, 46], as they allow for the construction of non-i.i.d., directional relationships between the states of a dynamical system at different times before a future forecast is performed. Other methods that have been utilized for such tasks include the temporal convolutional network [50], non-autoregressive variants of the LSTM [29], and system identification within the latent space [7]. In most of these developments, the evolution framework (in time) is deterministic in nature. These frameworks highlight a crucial drawback when coupled with the black-box nature of purely data-driven ROMs: it is imperative to embed some notion of uncertainty into the time evolution of ROMs in the latent space. Past work has shown that LSTMs suffer from stability and interpretability issues when utilized for the reduced-order modeling of advection-dominated systems [29], and this study proposes an alternative that provides interpretable forecasts with quantified uncertainty. To that end, we propose the use of a Gaussian process regression (GPR) framework for evolving the low-dimensional representation of data obtained from a PDE evolution. In addition to providing forecasts at the temporal resolution of the training data, our framework allows for interpolation in time, because time is interpreted as a continuous variable. When coupled with quantified uncertainty, the framework allows a ROM user to interrogate the behavior of the emulated system at a finer temporal resolution, with implications for data sources that are temporally sparse.

To summarize, the major contributions of this article are:

We introduce a technique to construct parametric non-intrusive ROMs with quantified uncertainty during the latent-space time evolution of dynamical systems.

We detail the use of Gaussian processes (GPs) that are customized for the time evolution of parametric nonlinear dynamical systems.

We demonstrate the ability of the proposed time-evolution algorithm on systems compressed by linear reduced-basis methods such as POD, as well as by nonlinear compression frameworks such as variational and convolutional autoencoders.

We test the ability of a ROM constructed from coarse training data to interpolate on a finer temporal resolution with quantified uncertainty.
In the following, we introduce our experimental setup, which relies on the inviscid shallow water equations, in Section 2; our various compression mechanisms in Section 3; and the UQ-embedded emulation strategy in Section 4, followed by results and conclusions in Sections 5 and 6, respectively.
2 The shallow water equations
The inviscid shallow water equations are a prototypical system of equations for geophysical flows. In particular, the shallow water equations admit solutions where advection dominates dissipation and pose challenges for conventional ROMs [47]. These governing equations are given by
(1)  $\frac{\partial (\rho \eta)}{\partial t} + \frac{\partial (\rho \eta u)}{\partial x} + \frac{\partial (\rho \eta v)}{\partial y} = 0,$
(2)  $\frac{\partial (\rho \eta u)}{\partial t} + \frac{\partial}{\partial x}\left(\rho \eta u^2 + \frac{1}{2}\rho g \eta^2\right) + \frac{\partial (\rho \eta u v)}{\partial y} = 0,$
(3)  $\frac{\partial (\rho \eta v)}{\partial t} + \frac{\partial (\rho \eta u v)}{\partial x} + \frac{\partial}{\partial y}\left(\rho \eta v^2 + \frac{1}{2}\rho g \eta^2\right) = 0,$
where $\eta$ corresponds to the total fluid column height, $(u, v)$ is the fluid's horizontal flow velocity averaged across the vertical column, $g$ is the acceleration due to gravity, and $\rho$ is the fluid density, typically set to unity. Here, $t$, $x$, and $y$ are the independent variables: time and the spatial coordinates of the two-dimensional system. Equation 1 captures the law of mass conservation, whereas Equations 2 and 3 denote the conservation of momentum. The initial conditions of the problem are given by
(4)  $\eta(x, y, t=0) = 1 + \exp\left(-\frac{(x - \bar{x})^2 + (y - \bar{y})^2}{2\sigma^2}\right),$
(5)  $u(x, y, t=0) = 0,$
(6)  $v(x, y, t=0) = 0,$
i.e., a Gaussian perturbation at a particular location $(\bar{x}, \bar{y})$ on the grid. We solve the system of equations to a fixed final time with a constant timestep on a square two-dimensional grid with uniformly spaced collocation points, chosen to completely capture the advection and gradual decay of this perturbation. Note that these choices may vary according to the forecasting and fidelity requirements of a particular problem and perturbation. The initial and boundary conditions for this particular shallow-water experiment represent a tightly controlled traveling-wave problem that is translation invariant: different realizations of the initial condition lead to translationally shifted trajectories. We also note the presence of mirror symmetries with respect to the $x$ and $y$ axes, coupled with a rotational symmetry of $\pi$ radians about the origin. However, our motivation for a first assessment of our emulators on this system stems from the well-known fact that POD and Galerkin-projection based methods are severely limited in their ability to forecast even these simple traveling-wave systems [23, 31] and require special treatment with intrinsic knowledge of the flow dynamics. This is in addition to the fact that equation-based models are impossible to construct in the absence of information about the other variables of the PDE. We seek to build predictive models solely from observations of $\eta$, conditioned on the initial perturbation location, mimicking a real-world scenario where complete observations of all relevant variables (in this case, the velocities) are unavailable.
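As a concrete illustration of the data-generation step, the sketch below advances the conservative form of Equations 1–3 (with $\rho = 1$) from a Gaussian height perturbation as in Equations 4–6. The first-order Lax–Friedrichs scheme, periodic boundaries, grid size, timestep, and perturbation parameters are illustrative assumptions for this sketch, not the solver settings of the study.

```python
import numpy as np

def lax_friedrichs_step(h, hu, hv, dx, dy, dt, g=9.81):
    """One periodic Lax-Friedrichs update of the conservative shallow water
    system (Eqs. 1-3 with rho = 1): h is the fluid column height and
    (hu, hv) are the momenta."""
    def ddx(f):  # central difference in x with periodic wrap-around
        return (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0)) / (2.0 * dx)
    def ddy(f):  # central difference in y with periodic wrap-around
        return (np.roll(f, -1, axis=1) - np.roll(f, 1, axis=1)) / (2.0 * dy)
    def avg(f):  # four-point neighbor average (the Lax-Friedrichs stabilizer)
        return 0.25 * (np.roll(f, 1, axis=0) + np.roll(f, -1, axis=0)
                       + np.roll(f, 1, axis=1) + np.roll(f, -1, axis=1))
    u, v = hu / h, hv / h
    h_new = avg(h) - dt * (ddx(hu) + ddy(hv))
    hu_new = avg(hu) - dt * (ddx(hu * u + 0.5 * g * h ** 2) + ddy(hu * v))
    hv_new = avg(hv) - dt * (ddx(hu * v) + ddy(hv * v + 0.5 * g * h ** 2))
    return h_new, hu_new, hv_new

# Gaussian height perturbation at (x0, y0), mirroring Eqs. (4)-(6);
# grid size, timestep, and (x0, y0, sigma) are illustrative values only.
n = 64
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n), indexing="ij")
x0, y0, sigma = 0.2, -0.1, 0.15
h = 1.0 + np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))
hu, hv = np.zeros_like(h), np.zeros_like(h)
mass0 = h.sum()  # discrete mass, conserved exactly by this periodic scheme
for _ in range(10):
    h, hu, hv = lax_friedrichs_step(h, hu, hv, 2.0 / n, 2.0 / n, 1e-3)
```

Because the averaging and flux differences telescope under periodic boundaries, the discrete mass `h.sum()` is preserved to round-off, which is a convenient sanity check on any such data-generating run.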
3 Data compression
3.1 Proper orthogonal decomposition
In this section, we review the POD technique for the construction of a reduced basis [20, 2]. The interested reader may also find an excellent explanation of POD and its relationship with other dimension-reduction techniques in [42]. The POD procedure is tasked with identifying a space
(7)  $\mathcal{V} = \operatorname{span}\{\boldsymbol{\psi}_1, \boldsymbol{\psi}_2, \ldots, \boldsymbol{\psi}_N\},$
which approximates snapshots optimally with respect to the $L^2$ norm. The process of generation commences with the collection of snapshots in the snapshot matrix
(8)  $\mathbf{S} = \left[\mathbf{s}_1 \; \mathbf{s}_2 \; \cdots \; \mathbf{s}_N\right] \in \mathbb{R}^{M \times N},$
where $N$ is the number of snapshots, and $\mathbf{s}_i \in \mathbb{R}^{M}$ corresponds to an individual snapshot in time of the discrete solution domain with the mean value removed, i.e.,
(9)  $\mathbf{s}_i = \mathbf{q}_i - \bar{\mathbf{q}},$
with
$\bar{\mathbf{q}} = \frac{1}{N}\sum_{i=1}^{N} \mathbf{q}_i$
being the time-averaged solution field. Our POD bases can then be extracted efficiently through the method of snapshots, where we solve the eigenvalue problem on the correlation matrix
$\mathbf{C} = \mathbf{S}^T \mathbf{S} \in \mathbb{R}^{N \times N}$. Then
(10)  $\mathbf{C}\mathbf{W} = \mathbf{W}\boldsymbol{\Lambda},$
where $\boldsymbol{\Lambda}$ is the diagonal matrix of eigenvalues and
$\mathbf{W}$ is the corresponding eigenvector matrix. Our POD basis matrix $\boldsymbol{\Phi}$ can then be obtained by
(11)  $\boldsymbol{\Phi} = \mathbf{S}\mathbf{W}\boldsymbol{\Lambda}^{-1/2}.$
In practice, a reduced basis $\boldsymbol{\Psi} = \left[\boldsymbol{\psi}_1 \; \cdots \; \boldsymbol{\psi}_R\right]$ is built by choosing the first $R$ columns of $\boldsymbol{\Phi}$ for the purpose of efficient ROMs, where $R \ll N$. This reduced basis spans a space given by
(12)  $\tilde{\mathcal{V}} = \operatorname{span}\{\boldsymbol{\psi}_1, \boldsymbol{\psi}_2, \ldots, \boldsymbol{\psi}_R\}.$
The coefficients of this reduced basis (which capture the underlying temporal effects) may be extracted as
(13)  $\mathbf{A} = \boldsymbol{\Psi}^T \mathbf{S}.$
The POD approximation of our solution is then obtained via
(14)  $\hat{\mathbf{S}} = \boldsymbol{\Psi} \mathbf{A},$
where $\hat{\mathbf{S}}$ corresponds to the POD approximation of $\mathbf{S}$. The optimal nature of reconstruction may be understood by defining the relative projection error
(15)  $\varepsilon = \frac{\lVert \mathbf{S} - \boldsymbol{\Psi}\boldsymbol{\Psi}^T \mathbf{S} \rVert_F}{\lVert \mathbf{S} \rVert_F},$
which exhibits that increasing the retention of POD bases increases the reconstruction accuracy. We remark that for systems with multiple dependent variables, the solution variables may be stacked to obtain one set of bases that is utilized for the reduction of each PDE within the coupled system. Another approach is to obtain reduced bases for each dependent variable within the coupled system and evolve each PDE on a different manifold; each dependent variable is then projected onto bases constructed from its snapshots alone. This affects the computation of the nonlinear term when computing the updates for each dependent variable. In practice, this operation manifests itself in the concatenation of reduced bases to obtain one linear operation for the reconstruction of all field quantities.
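The method of snapshots described by Equations 8–15 can be sketched in a few lines of NumPy; the matrix names mirror the text, and the synthetic low-rank snapshot set is purely illustrative.

```python
import numpy as np

def pod_basis(S, r):
    """Method of snapshots (Eqs. 8-11): S is an (M, N) snapshot matrix with
    the temporal mean already removed. Returns the first r POD modes."""
    C = S.T @ S                        # correlation matrix, (N, N)
    lam, W = np.linalg.eigh(C)         # eigendecomposition C W = W Lambda
    lam, W = lam[::-1], W[:, ::-1]     # reorder to descending eigenvalues
    lam = np.maximum(lam, 1e-30)       # guard round-off negatives
    Phi = (S @ W) / np.sqrt(lam)       # normalized modes: Phi^T Phi = I
    return Phi[:, :r]

# Toy usage on synthetic rank-5 snapshot data.
rng = np.random.default_rng(0)
M, N, r = 200, 40, 5
S = rng.standard_normal((M, 5)) @ rng.standard_normal((5, N))
S = S - S.mean(axis=1, keepdims=True)   # subtract the time-averaged field
Psi = pod_basis(S, r)
A = Psi.T @ S                           # reduced coefficients, Eq. (13)
S_hat = Psi @ A                         # POD reconstruction, Eq. (14)
rel_err = np.linalg.norm(S - S_hat) / np.linalg.norm(S)   # Eq. (15)
```

Because the toy data have rank five, five modes reconstruct the snapshots to machine precision; for real advective data the error in Equation 15 decays much more slowly with $R$, which is precisely the motivation for the nonlinear compressors below.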
3.2 Convolutional autoencoders
Autoencoders are neural networks that learn a new representation of the input data, usually with lower dimensionality. The initial layers, called the encoder, map the input $\mathbf{x}$ to a new representation $\mathbf{z}$. The remaining layers, called the decoder, map $\mathbf{z}$ back to the input space with the goal of reconstructing $\mathbf{x}$. The objective is to minimize the reconstruction error. Autoencoders are unsupervised; the data are given, but the representation must be learned.
More specifically, we use convolutional autoencoders (CAEs), which have convolutional layers. In a convolutional layer, instead of learning a matrix that connects all neurons of the layer's input to all neurons of the layer's output, we learn a set of filters. Each filter is convolved with patches of the layer's input. Suppose a one-dimensional convolutional layer has filters of length $k$. Then each of the layer's output neurons corresponding to filter $\mathbf{w}$ is connected to a patch of $k$ of the layer's input neurons. In particular, a one-dimensional convolution of filter $\mathbf{w}$ and patch $\mathbf{x}_{i:i+k-1}$ is defined as $\mathbf{w} \cdot \mathbf{x}_{i:i+k-1} = \sum_{j=1}^{k} w_j x_{i+j-1}$ (for neural networks, convolutions are usually technically implemented as cross-correlations). Then, for a typical one-dimensional convolutional layer, the layer's output neuron is $h_i = \zeta\left(\mathbf{w} \cdot \mathbf{x}_{s(i-1)+1:s(i-1)+k} + b_i\right)$, where
$\zeta$ is an activation function, $b_i$ is the corresponding entry of a bias term, and, as $i$ increases, patches are shifted by the stride
$s$. To calculate the convolution near the boundaries, it is common to add zeros around the inputs to a layer, which is called zero padding.
In the decoder, we use deconvolutional layers to return to the original dimension. These layers upsample with nearest-neighbor interpolation. Two-dimensional convolutions are defined similarly, but each filter and each patch are two-dimensional. A two-dimensional convolution sums over both dimensions, and patches are shifted both ways: for a typical two-dimensional convolutional layer, the output neuron is $h_{ij} = \zeta\left(\sum_{p,q} w_{pq}\, x_{i+p-1,\,j+q-1} + b_{ij}\right)$. Input data can also have a "channel" dimension, such as red, green, and blue for images. The convolutional operator sums over channel dimensions, but each patch contains all of the channels. The filters remain the same size as patches, so they can have different weights for different channels. In our study, we use a single channel for the spatial magnitude of $\eta$.
It is common to follow a convolutional layer with a pooling layer, which outputs a subsampled version of the input. In this paper, we specifically use max-pooling layers. Each output of a max-pooling layer is connected to a patch of the input and returns the maximum value in the patch.
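The patch arithmetic described above can be made concrete with a minimal NumPy sketch of a valid (unpadded) one-dimensional convolution and a non-overlapping max-pooling layer; the filter values and inputs are illustrative, not taken from the trained networks.

```python
import numpy as np

def conv1d(x, w, b=0.0, stride=1):
    """Valid (unpadded) 1-D convolution as used in CNNs, i.e. a
    cross-correlation: output i is dot(w, patch starting at i*stride) + b."""
    k = len(w)
    n_out = (len(x) - k) // stride + 1
    return np.array([x[i * stride:i * stride + k] @ w + b
                     for i in range(n_out)])

def maxpool1d(x, size=2):
    """Non-overlapping max-pooling: each output is the max of one patch."""
    return np.array([x[i * size:(i + 1) * size].max()
                     for i in range(len(x) // size)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
w = np.array([1.0, 0.0, -1.0])       # a filter of length three
h = conv1d(x, w)                     # four outputs: [-2., -2., -2., -2.]
h2 = conv1d(x, w, stride=2)          # stride 2 halves the number of outputs
p = maxpool1d(np.maximum(h, 0.0))    # ReLU activation, then max-pooling
```

The deconvolutional (upsampling) path of the decoder reverses this subsampling; deep learning frameworks provide all of these as built-in layers, so the sketch is only meant to fix the indexing conventions.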
3.3 Variational autoencoders
As opposed to CAEs, the variational autoencoder (VAE) [18] takes a Bayesian approach to latent space modeling. We utilize a convolutional VAE architecture, where the outer convolutional, pooling, and upscaling layers are identical to those of the CAE. The difference arises only in the bottleneck, where the final encoder layers are fully connected layers that project the input onto the latent-variable space: the encoder is effectively a function mapping the input $\mathbf{x}$ to the parameters $(\boldsymbol{\mu}, \boldsymbol{\sigma}^2)$ of a distribution over the latent variables $\mathbf{z}$. The fully connected part of the decoder network accepts a newly sampled $\mathbf{z}$
from the latent space. The output of this undergoes upsampling and convolutions (similar to those of the CAE) to reconstruct the image. The VAE also constrains the latent-space variables to follow a normal distribution
via a Kullback–Leibler divergence (KL divergence) term in the loss. The architecture of the VAE (shown in Figure 3) is similar to that of the conventional CAE shown in Figure 2, except at the bottleneck, where the encoder network outputs the mean $\boldsymbol{\mu}$ and variance $\boldsymbol{\sigma}^2$.

The inference itself is undertaken using variational inference (VI), by modeling the true posterior distribution over the latent variables
with a simple Gaussian distribution and then minimizing the difference via the KL divergence as an addition to the loss function. The KL-divergence loss is applied such that the distribution on $\mathbf{z}$ is as close to the standard normal distribution as possible. With the inclusion of additional complexity in the loss function and the bottleneck, VAEs have more parameters to tune than CAEs do. However, this also gives significant control over the latent-space distribution. Depending on the data, dimensionality reduction using CAEs may not allow for straightforward interpolation because of the presence of discontinuous clusters in the latent-space representation. VAEs, on the other hand, constrain the latent-space representation to follow a prespecified distribution and hence, by design, facilitate easier interpolation.
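The two VAE-specific ingredients, the KL-divergence penalty toward a standard normal latent distribution and the reparameterized sampling of $\mathbf{z}$, can be sketched as follows with a diagonal-Gaussian posterior; the latent dimension and encoder outputs here are illustrative stand-ins, not values from the trained model.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent
    dimensions: 0.5 * sum(mu^2 + sigma^2 - log sigma^2 - 1)."""
    return 0.5 * np.sum(mu ** 2 + np.exp(log_var) - log_var - 1.0)

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps; the reparameterization trick keeps the
    sample differentiable with respect to the encoder outputs (mu, log_var)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu, log_var = np.zeros(8), np.zeros(8)   # hypothetical encoder outputs
z = reparameterize(mu, log_var, rng)     # latent sample fed to the decoder
kl = kl_to_standard_normal(mu, log_var)  # -> 0.0, posterior already N(0, I)
```

In training, this KL term is added to the reconstruction loss, which is what pulls the latent clusters toward a common Gaussian and makes the latent space easier to interpolate than that of a plain CAE.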
4 Gaussian process regression
Dimensionality reduction performed using the POD, CAE, or VAE techniques results in a low-dimensional representation of the original data, along with a model (the POD bases or the decoders) to reconstruct the data for any point in the latent space. We then require only a temporal interpolation scheme fitted on the representation space, so that a time evolution of the dynamical system can be reconstructed, as shown schematically in Figure 1. In our approach, we deploy GPs [48] as our interpolation algorithm. While GPs perform Bayesian regression tasks with a high level of interpretability, they are generally computationally expensive for large data sizes. We achieve a considerable reduction of computational cost by fitting in the space of reduced dimensions. In addition, the GPR may also be restricted to a less noisy space compared with the original dataset of the dynamical evolution.
A GP is a collection of random variables, every finite subset of which follows a multivariate Gaussian distribution. It can be perceived as a generalization of the multivariate Gaussian distribution to infinitely many dimensions. GPs are a popular choice for nonlinear regression because of their flexibility and ease of use. In addition, one of their main advantages is that they incorporate a principled way of measuring uncertainty, since they provide predictions in distributional form. For the purpose of this paper, a GPR model is used to fit the reduced space from the data-compression algorithms of Section 3. Subsequently, the mean prediction, which corresponds to the maximum a posteriori (MAP) estimate, is used for the reconstructions. We use the GPflow library for the experiments [10].

A GP can be completely specified by its second-order statistics. Thus, a mean function equal to zero can be assumed, and a positive definite covariance function (kernel) $k(\mathbf{x}, \mathbf{x}')$, which can be perceived as a measure of similarity between $\mathbf{x}$ and $\mathbf{x}'$, is the only requirement to specify the GP.
For the experiments in Section 5, we initially experimented with changepoint kernels [25]. The intuition came from the data (see, e.g., Figure 11), where two typical behaviours are commonly observed: a steep increase or decrease of the latent dimension values at early times, and a subsequent, smoother change in direction that eventually leads to stabilization. At first we examined a changepoint kernel only for the time feature, which was then added to a regular kernel that accounted for the other variables. The results were discouraging, which we attribute to the fact that this kernel structure leads to a loss of correlational information between time and the other variables. Subsequently, we examined changepoint kernels that accounted for all parameters in both of their subkernels. Even though this type of kernel was successful in producing acceptable results, and also in adequately detecting the position of the changepoint, the output was only a slight improvement over a standard kernel. Furthermore, the computational time was substantially larger, which led us to abandon the pursuit of a changepoint kernel. Instead, we settled on a Matérn 3/2 kernel,
(16)  $k(\mathbf{x}, \mathbf{x}') = \sigma^2 \left(1 + \frac{\sqrt{3}\,\lVert \mathbf{x} - \mathbf{x}' \rVert}{\ell}\right) \exp\left(-\frac{\sqrt{3}\,\lVert \mathbf{x} - \mathbf{x}' \rVert}{\ell}\right),$
due to its versatility, flexibility, and smoothness properties, and more specifically its automatic relevance determination (ARD) extension [3], which incorporates a separate lengthscale parameter for each input variable and gave a significant improvement in our results.
For a GPR model, we consider a GP $f \sim \mathcal{GP}(0, k)$ and noisy training observations $\mathbf{y} = (y_1, \ldots, y_n)^T$ at datapoints $X = \{\mathbf{x}_i\}_{i=1}^{n}$, derived from the true values $f(\mathbf{x}_i)$ with additive i.i.d. Gaussian noise of variance $\sigma_n^2$. In mathematical form, that is:
(17)  $y_i = f(\mathbf{x}_i) + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma_n^2),$
where $k$ is the kernel. We obtain the complete specification of the GP by maximizing the marginal likelihood, which we acquire by integrating the product of the Gaussian likelihood and the GP prior over $\mathbf{f}$:
(18)  $p(\mathbf{y} \mid X) = \int p(\mathbf{y} \mid \mathbf{f}, X)\, p(\mathbf{f} \mid X)\, d\mathbf{f}.$
For testing inputs $X_*$ and testing outputs $\mathbf{f}_*$, we derive the joint marginal likelihood:
$\begin{bmatrix} \mathbf{y} \\ \mathbf{f}_* \end{bmatrix} \sim \mathcal{N}\left(\mathbf{0}, \begin{bmatrix} K(X, X) + \sigma_n^2 I & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix}\right),$
where $I$ is the identity matrix.
Finally, by conditioning the joint distribution on the training data and the testing inputs, we derive the predictive distribution
(19)  $\mathbf{f}_* \mid X, \mathbf{y}, X_* \sim \mathcal{N}\left(\bar{\mathbf{f}}_*, \operatorname{cov}(\mathbf{f}_*)\right),$
where
(20)  $\bar{\mathbf{f}}_* = K(X_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1}\mathbf{y}, \qquad \operatorname{cov}(\mathbf{f}_*) = K(X_*, X_*) - K(X_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1}K(X, X_*).$
During the reconstruction phase, we focus on the predictions that correspond to the posterior mean $\bar{\mathbf{f}}_*$.
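Equations 17–20, with a Matérn 3/2 ARD kernel as in Equation 16, can be sketched directly in NumPy. In practice GPflow optimizes the hyperparameters via the marginal likelihood of Equation 18; the fixed lengthscales, noise level, and toy latent trajectory below are illustrative assumptions.

```python
import numpy as np

def matern32_ard(X1, X2, ell, variance=1.0):
    """Matern 3/2 kernel (Eq. 16) with one lengthscale per input (ARD)."""
    d = X1[:, None, :] / ell - X2[None, :, :] / ell
    r = np.sqrt((d ** 2).sum(axis=-1))
    return variance * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def gp_predict(X, y, Xs, ell, variance=1.0, noise=1e-5):
    """Predictive mean and covariance of Eqs. (19)-(20)."""
    K = matern32_ard(X, X, ell, variance) + noise * np.eye(len(X))
    Ks = matern32_ard(X, Xs, ell, variance)
    Kss = matern32_ard(Xs, Xs, ell, variance)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # (K + s^2 I)^-1 y
    V = np.linalg.solve(L, Ks)
    return Ks.T @ alpha, Kss - V.T @ V

# One latent coefficient observed over inputs (t, x0, y0); the predictive
# mean is the MAP forecast and the covariance diagonal its uncertainty.
t = np.linspace(0.0, 1.0, 20)
X = np.column_stack([t, np.full_like(t, 0.3), np.full_like(t, 0.5)])
y = np.sin(4.0 * t)                  # stand-in latent trajectory
ell = np.array([0.3, 1.0, 1.0])      # fixed ARD lengthscales (illustrative)
t_fine = np.linspace(0.0, 1.0, 77)   # finer-than-training temporal queries
Xs = np.column_stack([t_fine, np.full_like(t_fine, 0.3),
                      np.full_like(t_fine, 0.5)])
mean, cov = gp_predict(X, y, Xs, ell)
std = np.sqrt(np.maximum(np.diag(cov), 0.0))  # for +/- 2 std intervals
```

Because time enters simply as another continuous input, the same fitted model answers queries between training snapshots, which is the property exploited for the temporal super-resolution assessments of Section 5.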
5 Experiments
In this section, we outline several experiments designed to assess the various compression frameworks and how they interface with latent-space emulation using our aforementioned GPs. A first series of assessments is targeted solely at the fidelity of reconstruction (i.e., which framework offers the most efficient compression). Following this, we interface latent-space representations of our compressed fields with GP emulators to obtain low-dimensional surrogates with embedded uncertainty quantification. Finally, we outline the ability of the GP emulators to predict the dynamics' evolution at finer temporal resolutions.
For the purpose of training the compression frameworks and the GP emulators, we generate forward-model solves obtained by Latin hypercube sampling of the initial perturbation location $(\bar{x}, \bar{y})$. For each of these simulations, evenly spaced snapshots in time are collected to construct our total data set of $\eta$ flow fields. We remind the reader that the equation-based simulations require the solution of a coupled system of PDEs (for $\eta$, $u$, and $v$), but our emulators are built from information about $\eta$ alone. We split the simulations into training, validation, and testing subsets. The validation data set is primarily utilized for early-stopping criteria in the deep neural network autoencoders. Note that the training and validation data are combined into one data set for training the GP emulators. All statistical and qualitative assessments in the following are shown for the testing data alone.
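A Latin hypercube design over the perturbation centers can be sketched with NumPy alone (`scipy.stats.qmc.LatinHypercube` is an off-the-shelf alternative); the sample count and bounds below are illustrative, not the values used in this study.

```python
import numpy as np

def latin_hypercube(n, bounds, rng):
    """Stratify [0, 1) into n bins per dimension, draw one point per bin,
    and shuffle the bin order independently in each dimension."""
    d = len(bounds)
    bins = np.tile(np.arange(n), (d, 1))            # (d, n) bin indices
    u = (rng.permuted(bins, axis=1).T + rng.random((n, d))) / n
    lo, hi = np.array(bounds, dtype=float).T
    return lo + u * (hi - lo)

rng = np.random.default_rng(0)
# (x_bar, y_bar) perturbation centers; count and bounds are illustrative.
ics = latin_hypercube(50, [(-0.5, 0.5), (-0.5, 0.5)], rng)
```

The stratification guarantees that every one-dimensional bin contains exactly one sample, which gives better coverage of the initial-condition space than plain uniform sampling at the same budget.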
5.1 Reconstruction
We begin by assessing the ability of our different compression frameworks, i.e., the POD, CAE, and VAE. This comparison is obtained by training multiple encoders with varying degrees of freedom (DOF) in the latent space. The results of these experiments can be seen in Table
1. Here, we have chosen 2, 4, 8, 16, 30, and 40 DOF for all of our compression frameworks and have compared the fidelity of the reconstruction. We use metrics given by the coefficient of determination ($R^2$), mean squared error (MSE), and mean absolute error (MAE) to compare the true and reconstructed fields for the testing data sets. Our analysis of the metrics indicates that the CAE is able to reach optimal reconstruction accuracy faster than both the POD and the VAE. Both the VAE and the CAE are seen to possess an advantage over the POD, owing to their ability to find a nonlinear low-dimensional manifold; POD, instead, obtains a linear affine subspace of the high-dimensional system. Interestingly, with increasing DOF in the latent space (e.g., at 40 DOF), the POD method is seen to outperform the VAE. We also note that the CAE and VAE frameworks obtain their peak accuracy at around 8 DOF in the latent space and proceed to saturate in accuracy at greater latent space dimensions. We remark that this aligns with the lack of any guarantees on convergence with increasing DOF for these nonlinear compression frameworks; POD, instead, is guaranteed to converge with increasing latent space dimensions.

Coefficient of determination
Model/Latent DOF  2  4  8  16  30  40 
POD  0.10  0.30  0.55  0.69  0.82  0.87 
CAE  0.37  0.87  0.91  0.88  0.91  0.91 
VAE  0.35  0.66  0.86  0.83  0.82  0.79 
Mean squared error  
Model/Latent DOF  2  4  8  16  30  40 
POD  0.0025  0.0021  0.0014  0.0010  0.00063  0.00045 
CAE  0.0017  0.00034  0.00025  0.00031  0.00026  0.00025 
VAE  0.0017  0.00084  0.00036  0.00043  0.00045  0.00052 
Mean absolute error  
Model/Latent DOF  2  4  8  16  30  40 
POD  0.029  0.027  0.021  0.017  0.013  0.011 
CAE  0.021  0.0090  0.0075  0.0083  0.0075  0.0076 
VAE  0.023  0.015  0.0094  0.010  0.010  0.011 
Following these quantitative assessments, we assess the reconstruction fidelity of the different frameworks by comparing contours from the different methods with varying DOF in the latent space. Figure 4 shows the performance of our three compression frameworks with four DOF in the latent space. At this coarse resolution (in latent space), the linear compression of POD is inadequate for capturing the coherent features in the solution field upon reconstruction. This is due to the advective nature of the dynamics of this data set and the associated large Kolmogorov width. In contrast, both the CAE and the VAE are able to identify coherent structures in the flow field after reconstruction from the latent space. Note, however, that the CAE is able to identify the crests and troughs in the flow field more accurately. We observe similar results for an eight-dimensional latent space, where the CAE and VAE are still seen to outperform POD (although improvements in the latter may be observed). For both cases (i.e., four and eight DOF), the CAE and VAE are seen to struggle with reconstructing the dissipating coherent features later in the evolution of the system. For completeness, we also show a result for a forty-dimensional latent space in Figure 6, where POD can be seen to capture the spatiotemporal trends of the true solution appropriately. Note that improvements in the CAE and VAE are marginal, with the POD outperforming both of these frameworks later in the evolution of the system.
5.2 Latent space forecasts
Next, we test the ability of our trained GP emulators to forecast the evolution of systems in their latent space representations. For this assessment, we choose trained encoders (with 4, 8, and 40 latent space dimensions) to obtain training and validation data for fitting our previously introduced GPs. Once trained, the GPs are tasked with predicting the evolution of the latent state in reduced space for a set of test initial conditions.
Figure 7 shows the latent space evolution of a testing simulation over time for the three different compression methodologies. This result utilizes only four latent dimensions. It is readily apparent that CAE compression leads to a smooth evolution of the system in the latent space. In contrast, the POD and VAE methods display significant oscillations. The GP is able to capture the behavior of the system evolution and also provides confidence intervals, based on two standard deviations around the mean. With the exception of a few instances in time, the confidence intervals are able to envelop the true evolution of the system. We see similar results in Figure 8, where eight latent space DOF are obtained for each of our compression frameworks. While the POD is seen to provide oscillatory system evolution (as before), the CAE and VAE are smooth. In either case, the constructed GP is able to recover the evolution well.

5.3 Reconstruction from latent space forecasts
We proceed by assessing the fidelity of the reconstructions from the latent space using the GP forecasts of the previous subsection. Qualitative comparisons for the reconstruction of a test simulation are shown in Figure 9 for a four-dimensional latent space at three different times. The CAE and VAE compression is seen to outperform the linear reduced basis constructed by POD. This aligns with past studies where nonlinear compression methods have outperformed the POD. The CAE is seen to be more accurate than the VAE because of its deterministic formulation during compression. We observe the CAE and VAE to outperform the POD for the eight-dimensional latent space as well, as shown in Figure 10. Qualitatively, the CAE is seen to be the best compression technique of all the methods. Table 2 shows different metrics that establish these conclusions quantitatively. Note that all of these metrics are evaluated on the reconstructed data in physical space.
Coefficient of determination  
Model/DOF  4  8 
POD  –3.89  –1.99 
CAE  0.86  0.87 
VAE  0.63  0.82 
Mean squared error  
Model/DOF  4  8 
POD  0.034  0.0041 
CAE  0.00032  0.00029 
VAE  0.00086  0.00046 
Mean absolute error  
Model/DOF  4  8 
POD  0.030  0.044 
CAE  0.0085  0.0076 
VAE  0.014  0.010 
5.4 Temporal superresolution
It must be highlighted that ML-based time-series forecasting methods are generally formulated in a discrete fashion, where the temporal resolution of the training data determines the resolution of the ROM deployment. Moreover, most studies of ML-based forecasting in time assume a regular sampling of the state in time. In practice, owing to simulation or experimental limitations, state information may be available only sparsely in time and at irregular intervals. The construction of a ROM that is continuous in the temporal variable therefore allows us to sample at intermediate time steps with quantified uncertainty. We thus test the ability of our parameterized non-intrusive ROM to interpolate in time. In this assessment, we establish the performance of the ROM when sampling at locations (in time) that were not present in the discrete training and testing data. This is made possible by the continuous function approximation property of the GPR.
To assess this capability, we regenerate our testing data, now sampled more finely in the temporal dimension. We utilize our pretrained CAE (trained solely on the coarsely sampled training data) to compress this system evolution to an eight-DOF latent space. Following this, our previously trained GP is tasked with sampling at the intermediate points in time that correspond to this finely sampled testing data. Note that the GP, too, is trained only on the coarse data set. Thus, this assessment represents interpolation in both space and time. The results for the GP forecast in this assessment are shown in Figure 11, where good agreement can be observed between the true latent space trajectory and its GP-interpolated counterpart. We also direct the reader's attention to the forecast behavior for several of the latent dimensions, where uncertainties are seen to oscillate in correspondence with the coarsely sampled training points. We qualitatively assess the accuracy of the temporal interpolation in Figure 12, where the reconstruction from the true latent space trajectory and the GP interpolation are compared against the truth; good agreement is seen. Naturally, the mean absolute error (MAE) and root mean square error (RMSE) for this finely sampled test data set are seen to be higher than those obtained when interpolating solely at the coarser sample locations (see Table 2 for comparison), but this capability adds a useful utility to this ROM strategy.
Metric  MAE  RMSE 

GP Interpolation+reconstruction  0.0175  0.00094 
Reconstruction  0.0084  0.00022 
6 Conclusions
The development of parametric non-intrusive reduced-order models for advective PDE systems has great potential for cost reductions in large numerical simulation campaigns across multiple domain sciences. This article addresses their limitations associated with reduced interpretability by proposing the use of GPRs, conditioned on time and system control parameters, that provide quantified uncertainty. This is particularly useful when nonlinear compression techniques, such as autoencoders, are used for efficient DOF reduction. In addition to the ability to interpolate in the initial-condition space, we also investigate the ability of the proposed framework to interpolate in time. This addresses the fact that the sampling of a dynamical system for training these ROMs may not match the temporal resolution required of the emulation. We also remark that the modular nature of the compression and time evolution allows for the use of conventional reduced-basis methods, such as the POD, for dynamics that are intrinsically low dimensional.
Our results indicate that the proposed model-order reduction technique is successful at dealing with advective dynamics, as demonstrated through assessments on the inviscid shallow-water equations. We establish this by testing on unseen initial conditions for our system evolution, where a low-dimensional evolution successfully replicates high-dimensional results when coupled with a convolutional or variational autoencoder. The non-intrusive nature of our framework also allows for the construction of emulators from remote-sensing or experimental data, which is of value when the underlying governing PDEs are not known a priori. Extensions of the present study shall investigate the integration of a feedback loop to sample points in the control parameter space using knowledge of the prediction uncertainty. Through this, we aim to establish continually learning model-order reduction frameworks for advective problems spanning large physical regimes.
7 Acknowledgements
RM acknowledges support from the Margaret Butler Fellowship at the Argonne Leadership Computing Facility. TB, LRM, and IP acknowledge support from Wave 1 of The UKRI Strategic Priorities Fund under the EPSRC Grant EP/T001569/1, particularly the Digital Twins for Complex Engineering Systems theme within that grant, and The Alan Turing Institute. IP acknowledges funding from the Imperial College Research Fellowship scheme. This material is based upon work supported by the U.S. Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357. This research was funded in part and used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. DOE or the United States Government. Declaration of Interests: none.
References
 [1] (2020) A long short-term memory embedding for hybrid uplifted reduced order models. Physica D: Nonlinear Phenomena, pp. 132471.
 [2] (1993) The proper orthogonal decomposition in the analysis of turbulent flows. Annual Review of Fluid Mechanics 25 (1), pp. 539–575.
 [3] (2006) Pattern recognition and machine learning. Springer.
 [4] (2004) Aerodynamic data reconstruction and inverse design using proper orthogonal decomposition. AIAA Journal 42 (8), pp. 1505–1516.
 [5] (2017) Galerkin v. least-squares Petrov–Galerkin projection in nonlinear model reduction. Journal of Computational Physics 330, pp. 693–734.
 [6] (2011) Efficient non-linear model reduction via a least-squares Petrov–Galerkin projection and compressive tensor approximations. Int. J. Numer. Meth. Eng. 86 (2), pp. 155–181.
 [7] (2019) Data-driven discovery of coordinates and governing equations. arXiv preprint arXiv:1904.02107.
 [8] (2020) Predicting clustered weather patterns: a test case for applications of convolutional neural networks to spatiotemporal climate data. Scientific Reports 10 (1), pp. 1–13.
 [9] (2019) Analog forecasting of extreme-causing weather patterns using deep learning. arXiv preprint arXiv:1907.11617.
 [10] (2017) GPflow: a Gaussian process library using TensorFlow. The Journal of Machine Learning Research 18 (1), pp. 1299–1304.
 [11] (2013) Nonlinear Petrov–Galerkin methods for reduced order hyperbolic equations and discontinuous finite element methods. Journal of Computational Physics 234, pp. 540–559.
 [12] (2018) Learning low-dimensional feature dynamics using deep convolutional recurrent autoencoders. arXiv preprint arXiv:1808.01346.
 [13] (2018) Non-intrusive reduced order modeling of nonlinear problems using neural networks. Journal of Computational Physics 363, pp. 55–78.
 [14] (1998) Applying neural network models to prediction and data analysis in meteorology and oceanography. Bulletin of the American Meteorological Society 79 (9), pp. 1855–1870.
 [15] (2010) On the stability and convergence of a Galerkin reduced order model (ROM) of compressible flow with solid wall and far-field boundary treatment. International Journal for Numerical Methods in Engineering 83 (10), pp. 1345–1375.
 [16] (2007) An intrinsic stabilization scheme for proper orthogonal decomposition based low-dimensional models. Phys. Fluids 19 (5), pp. 054106.
 [17] (2019) Deep fluids: a generative network for parameterized fluid simulations. In Computer Graphics Forum, Vol. 38, pp. 59–70.
 [18] (2013) Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
 [19] (2018) Data-driven spectral analysis of the Koopman operator. Applied and Computational Harmonic Analysis.
 [20] (1943) Statistics in function space. J. Indian Math. Soc. 7, pp. 76–88.
 [21] (2017) Layered reduced-order models for nonlinear aerodynamics and aeroelasticity. Journal of Fluids and Structures 68, pp. 174–193.
 [22] (2016) Dynamic mode decomposition: data-driven modeling of complex systems. SIAM.
 [23] (2016) Multiresolution dynamic mode decomposition. SIAM Journal of Applied Dynamical Systems 15 (2), pp. 713–735.
 [24] (2020) Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders. J. Comp. Phys. 404, pp. 108973.
 [25] (2014) Automatic construction and natural-language description of nonparametric regression models. stat 1050, pp. 24.
 [26] (2017) PDE-net: learning PDEs from data. arXiv preprint arXiv:1710.09668.
 [27] (2014) Nonlinear aeroelastic reduced order modeling by recurrent neural networks. Journal of Fluids and Structures 48, pp. 103–121.
 [28] (2020) Recurrent neural network architecture search for geophysical emulation. arXiv preprint arXiv:2004.10928.
 [29] (2020) Non-autoregressive time-series methods for stable parametric reduced-order models. arXiv preprint arXiv:2006.14725.
 [30] (2020) Time-series learning of latent-space dynamics for reduced-order model closure. Physica D: Nonlinear Phenomena, pp. 132368.
 [31] (2019) Dimensionality reduction and reduced order modeling for traveling wave physics. arXiv preprint arXiv:1911.00565.
 [32] (2019) Compressed convolutional LSTM: an efficient deep learning framework to model high fidelity 3D turbulence. arXiv preprint arXiv:1903.00033.
 [33] (2018) A deep learning based approach to reduced order modeling for turbulent flow control using LSTM neural networks. arXiv preprint arXiv:1804.09269.
 [34] (2019) Physically constrained data-driven correction for reduced-order modeling of fluid flows. Int. J. Numer. Meth. Fl. 89 (3), pp. 103–122.
 [35] (2011) Reduced-order modelling for flow control. Vol. 528, Springer Science & Business Media.
 [36] (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, pp. 686–707.
 [37] (2018) Deep hidden physics models: deep learning of nonlinear partial differential equations. The Journal of Machine Learning Research 19 (1), pp. 932–955.
 [38] (2018) Koopman-based approach to non-intrusive projection-based reduced-order modeling with black-box high-fidelity models. AIAA Journal 56 (10), pp. 4087–4111.
 [39] (2014) Basis selection and closure for POD models of convection dominated Boussinesq flows. In 21st International Symposium on Mathematical Theory of Networks and Systems, Vol. 5.
 [40] (2018) Neural network closures for nonlinear model order reduction. Advances in Computational Mathematics 44 (6), pp. 1717–1750.
 [41] (2018) DGM: a deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, pp. 1339–1364.
 [42] (2019) Modal analysis of fluid flows: applications and outlook. arXiv preprint arXiv:1903.05750.
 [43] (2019) Deep learning methods for Reynolds-averaged Navier–Stokes simulations of airfoil flows. AIAA J., pp. 1–12.
 [44] (1997) Direct numerical simulation of turbulence at lower costs. Journal of Engineering Mathematics 32 (2-3), pp. 143–159.
 [45] (2018) Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 474 (2213), pp. 20170844.
 [46] (2020) Recurrent neural network closure of parametric POD-Galerkin reduced-order models based on the Mori–Zwanzig formalism. Journal of Computational Physics, pp. 109402.
 [47] (2012) Proper orthogonal decomposition closure models for turbulent flows: a numerical comparison. Computer Methods in Applied Mechanics and Engineering 237, pp. 10–26.
 [48] (2006) Gaussian processes for machine learning. Vol. 2, MIT Press, Cambridge, MA.
 [49] (2013) Nonlinear Petrov–Galerkin methods for reduced order modelling of the Navier–Stokes equations using a mixed finite element pair. Computer Methods in Applied Mechanics and Engineering 255, pp. 147–157.
 [50] (2019) Multi-level convolutional autoencoder networks for parametric prediction of spatio-temporal dynamics. arXiv preprint arXiv:1912.11114.
 [51] (2016) Nonlinear aerodynamic reduced-order model for limit-cycle oscillation and flutter. AIAA Journal, pp. 3304–3311.