The computational challenges faced during high-fidelity
numerical simulations of engineering systems governed by nonlinear partial differential equations (PDEs), especially in applications involving control[PBK2016], optimal design and multi-fidelity optimization [PWG2016], can often be mitigated by the development of reduced order models (ROMs)[Benner_Gugercin_etal_15].
Proper orthogonal decomposition (POD) [Sirovich_87, BHL1993] is a well known method for extracting a solution-dependent reduced basis space from a set of well-resolved, high-fidelity snapshots, and is most effective when the coherent structures of the dataset can be ranked in terms of their energy content. The POD method has been successfully applied in statistics [J1986]
, signal analysis and pattern recognition[DM2008], ocean models [VH2006], air pollution models [FZPPBN2014], convective Boussinesq flows [SB2015], and Shallow Water Equation (SWE) models [SSN2014, LFKG2016]. Alternatively, nonlinear dimension reduction techniques such as kernel POD [Salvador2021]
or deep learning-based approaches like autoencoders[Lusch2018, Ghorbanidehno2021]
have also been used for extracting a reduced basis. Combining autoencoder-generated bases with various specialized machine learning algorithms for time series modeling result in fully non-intrusive reduced order models[Gonzalez_Balajewicz_2018, Eivazi2020, Maulik2021]. Hybrid methods [Lee2020, Kim2020] can also be obtained by combining a nonlinear manifold learning technique like autoencoder for discovering the latent space with an intrusive method for the temporal dynamics.
Following the identification of the latent space, a reduced representation of the dynamical system is obtained by a Galerkin or Petrov-Galerkin projection on to the latent space [LFKG2016, LFK2017], which typically involves intrusive
modifications of the high-fidelity system operators. This work focuses on non-intrusive reduced order models (NIROMs) that do not require any knowledge of the high-fidelity simulator. In such a framework, the evolution of the expansion coefficients in the latent space is usually computed by the application of different regression-based methods directly on the high-fidelity data. These include artificial neural networks (ANNs), in particular multi-layer perceptrons[HU2018], Gaussian process regression (GPR) [GH2019], and radial basis function (RBF) [ADN2013] interpolation. RBF interpolation in particular has been shown to be quite successful for nonlinear, time-dependent partial differential equations (PDEs) [DFPSP2021], nonlinear, parametrized PDEs [ADN2013], and aerodynamic shape optimization [IQ2013].
Alternatively, in deep neural networks (DNN) such as ResNet, the evolution of features over the depth of the network is equivalent to solving an ordinary differential equation (ODE) of the form with the forward Euler method [Ruthotto2019]. With this connection in mind, [Chen2018] proposed a ’continuous-depth’ neural network called ODE-Net that effectively replaced the layers in ResNet-like architectures with a trainable ODE solver. This neural ordinary differential equation approach (NODE) was further improved in [Gholami2019, Dupont2019] and [Finlay2020] proposed a NODE generative model that can be efficiently trained on large-scale datasets. Some applications of the NODE framework include latent space closure modeling [Maulik2020], ODE/PDE model identification [Sun2020], modeling of irregularly spaced time series data [RCD2019], and modeling of spatio-temporal information in video signals [KVKP2019].
Dynamic mode decomposition (DMD) is yet another method for obtaining a reduced order model. DMD represents the temporal dynamics of a complex, nonlinear system [Schmid2010, KBBP2016]
as the combination of a few linearly evolving, spatially coherent modes that oscillate at a fixed frequency, and which are closely related to the eigenvectors of the infinite-dimensional Koopman operator[K1931]. Several variants of the DMD algorithm have been proposed [PBK2016, KFB2016, ABBN2016, LV2017] and have been successfully applied as efficient ROM techniques for determining the optimal global basis modes for nonlinear, time-dependent problems [DMD, Bistrian2017]. For non-parametrized PDEs, DMD presents an efficient framework that combines all the three stages of ROM development to learn a linear operator in an optimal least square sense. However, this approach cannot be directly applied to parametrized problems [Alsayyari2021].
In [Dutta2021-AAAI], we explored propagating the dynamics of a latent space formed from POD modes with neural ODEs. The present work investigates substituting the latent-space described by POD modes with one learned by an autoencoder. Our results for the combined autoencoder-NODE approach are compared to other methods like - a) dimension reduction by POD modes with latent space temporal dynamics captured by Neural ODEs (POD-NODE), b) dimension reduction via POD and temporal evolution of the latent space with Radial Basis Functions (POD-RBF), and c) Dynamic Mode Decomposition (DMD), which serve as benchmarks in our numerical experiments. The performance of each approach will be evaluated on sample problems based on incompressible flow around a cylinder and shallow water hydrodynamics in the context of fast replay applications for complex fluid-dynamics problems.
The standard ROM development framework can be divided into three stages:
identification of a low-dimensional latent (or reduced-order) space,
representation of the nonlinear dynamical system in terms of the reduced basis and modeling the evolution of the system of modal coefficients, and
reconstruction in the high-fidelity space for validation and analysis.
In this work, the dominant features of the unsteady flow-field have been extracted using a linear modal decomposition technique, POD, and a nonlinear manifold learning method that relies on deep, fully-connected, multi-layer perceptron (MLP) autoencoders that are highly expressive and scalable [HR2006]. POD is a popular technique for dimension reduction of the solution manifold of a dynamical system by determining a linear reduced space spanned by an orthogonal basis with an associated energetic hierarchy, and which represents an optimal approximation of the solution manifold with respect to the -norm. Given a matrix of high-fidelity system snapshots
and a matrix of orthonormal POD basis vectors, the modal coefficient matrix constitutes our training data for the latent space learning methods. [Taira2020] provides an excellent overview of POD as well as a comparison with other dimension-reduction techniques.
An autoencoder is a type of feedforward neural network that is designed to learn the identity mapping, such that and . This is accomplished using a two-part architecture. The first part is called an encoder, , defined by where (), which maps a high-dimensional input vector to a low-dimensional latent vector .
The second part is called a decoder, , defined as , which maps the latent vector to an approximation of the high-dimensional input vector . The combination of the two parts yields an autoencoder of the form
This autoencoder is trained by computing the optimal values of the parameters that minimize the reconstruction error over all the training data
where is a chosen measure of discrepancy between and its approximation . The restriction forces the autoencoder model to learn the salient features of the input data via compression into a low-dimensional space and then reconstructing the input, instead of directly learning the identity function. It is worth noting that with the choice of a linear, single-layer encoder of the form , and a linear, single-layer decoder of the form , where ,
, and a squared reconstruction error as the loss function, the autoencoder model has been shown to learn the same subspace as that spanned by the first POD modes if . However, additional constraints are necessary to ensure that the columns of form an orthonormal basis and follow an energy-based hierarchical ordering [P2018].
In this work, autoencoders are employed to generate separate low-dimensional latent representations of the pressure (depth) and the velocity snapshot data obtained from the high-fidelity simulation of the numerical examples. The encoder and decoder neural networks are constructed using fully-connected MLP architectures, as depicted in Figure 1. As the high-fidelity simulation data is usually available on a two-dimensional spatial grid, the data is first flattened and then fed to the autoencoder model.
Latent space evolution
In this section, we outline the non-intrusive framework for modeling the evolution of time-series data in the latent space. The first method employs RBF interpolation which is a classical, data-driven, kernel-based method for computing an approximate continuous response surface that aligns with the given multivariate data. More details about the POD-RBF NIROM framework can be found in [DFPSP2021]. The second technique called NODE is a neural-network based method to predict the continuous evolution of a vector over time, that is designed to preserve memory effects within the architecture.
Neural ordinary differential equations
We assume that the time evolution of the modal coefficients of the high-fidelity dynamical system in the latent space can be modeled using a (first-order) ODE,
The goal is to obtain a NN approximation of the dynamics function such that . The full procedure can be outlined as follows:
Compute the time series of modal coefficients for where .
Initialize a NN approximation for the dynamics function where represents the initial NN parameters.
The NN parameters are optimized iteratively through the following steps.
Compute the approximate forward time trajectory of the modal coefficients by solving eq. (3) using a standard ODE integrator as,
The free parameters of the NODE model are . Evaluate the differentiable loss function .
To optimise the loss, compute gradients with respect to the free parameters. Similar to the usual backpropagation algorithm, this can be achieved by first computing the gradient, and then a performing a reverse traversal through the intermediate states of the ODE integrator. For a memory-efficient implementation, the adjoint method [Chen2018] can be used to backpropagate the errors by solving an adjoint system for the augmented state vector backwards in time from to .
The gradient computed in the previous step is used to update the parameters
by using an optimization algorithm like RMSProp or Adam.
The trained NODE approximation of the dynamics function can be used to compute predictions for the time trajectory of the modal coefficients.
Following [Dutta2021-AAAI], we adopt the TFDiffEq (https://github.com/titu1994/tfdiffeq
) library that runs on the Tensorflow Eager Execution platform to train the NODE models. RMSProp is adopted for loss minimization with an initial learning rate of, a staircase decay function with a range of variable decay schedules, and a momentum coefficient of .
As a final point of comparison, we consider the standard Dynamic mode decomposition (DMD)[Schmid2010, KBBP2016] algorithm, which is a powerful data-driven method capable of providing an accurate decomposition of a complex system into spatiotemporal coherent structures that may be used for short-time future-state prediction.
In this section, we first assess the use of autoencoders for building a reduced space in which the system dynamics are propagated by NODE for a benchmark flow problem characterized by the incompressible Navier Stokes equations (NSE). We compare the performance of this framework with a POD-NODE method where NODE is employed to capture the evolution of modal coefficients in a reduced space defined by a POD basis. Moreover, we also examine the relative performance of different NIROM models for a real-world estuarine flow application governed by the shallow water equations (SWE). The POD-RBF and DMD NIROM training runs were performed on a Macbook Pro 2018 with a GHz 6-Core Intel Core i9 processor and 32 GB 2400 MHz DDR4 RAM. The autoencoder latent-space representations were trained on Vulcanite, a high performance computer at the U.S. Army Engineer Research and Development Center DoD Supercomputing Resource Center (ERDC-DSRC). Vulcanite is equipped with NVIDIA Tesla V100 PCIe GPU accelerator nodes and has 32GB memory/node. Training for the NODE models was performed in serial on Jim, a high performance computer at the U.S. Army Engineer Research and Development Center Coastal and Hydraulics Lab (CHL), which is equipped with 2 Intel Xeon E5-2699 v3 CPUs and 128Gbytes of memory/node.
Flow around a cylinder
The problem of two-dimensional, incompressible flow around a cylinder is a classical benchmark CFD example that simulates a time-periodic fluid flow through a 2D pipe with a circular obstacle. For further details about the problem setup please see [Dutta2021-AAAI]. High-fidelity simulation data is obtained with OpenFOAM using an unstructured mesh with nodes at , such that the flow exhibits the periodic shedding of von Karman vortices. training snapshots are collected for seconds with seconds, and the NIROM predictions are obtained for seconds with seconds.
linear, relu, …
|elu, tanh, …|
Several different architectures were explored for the autoencoder model by varying the model parameters like the dimension of the latent space, the activation functions, various configurations of the learning rate evolution during training, and Table1 shows four of the best architectures for the cylinder flow example. The second column shows the size of the latent space for each of the solution variables . The third and fourth columns list the activation functions used in the last hidden encoder and the last hidden decoder layers, respectively.
The fifth column shows the scaling applied to the input data which is directly determined by the activation function used in the decoder output layer. In general, it was found that activation functions that required scaled input data like sigmoid, tanh performed better for the decoder output layer than some of the (semi-)unbounded ones like linear, relu, and elu. However, unbounded activation functions like linear and elu were seen to generate a more accurate and efficient latent space. The encoder and the decoder networks were made up of four hidden layers with gradually decreasing and gradually increasing size, respectively, and characterized by the relu activation function, which were found to generate the least noisy latent representations. The Adam optimizer is used for training with an initial learning rate= and momentum=. An adaptive learning rate decay algorithm is employed that monitors the training loss and reduces the learning rate by a factor of 2 if no improvement is detected for epochs. The sixth column of Table 1 lists the total mean square reconstruction error for all three solution variables, while the last column shows the training times for each model on two NVIDIA Tesla V100 GPU nodes.
Figure 2 shows the temporal coefficients for the first two latent modes of using a POD truncation that captures of modal energy content, and the four chosen AE models. While the POD modal coefficients are arranged according to the decreasing order of amplitude, this cannot be guaranteed for the AE-generated spaces. AE2 and AE3 models define a latent space of dimension 2, whereas the AE4 model has 3 latent space units and the AE1 model is identical to the dimension of the POD bases for each variable - 5 (), 8 (), and 7(). The richer quality of the AE1 latent space leads to better expressivity which is reflected in the subtler features captured in the modal coefficients (Figure 2 row 2) and also in the lowest reconstruction error (Table 1).
Figure 3 shows the evolution of the training loss and the adaptive decay of the learning rate while training autoencoder models for using four gradually increasing latent space dimensions -
. The AE3 architecture was adopted for these runs. The optimization of the hyperparameters with respect to the training loss becomes gradually harder as the dimension of the latent space increases. So the models with smaller latent spaces initially show a faster reduction in training losses. However, the enhanced expressivity of the models with higher latent dimensions allow them to reach lower values of training loss after sufficient epochs of training, whereas the losses for the models with smaller latent dimensions appear to stagnate. Figure4 shows the training loss and the learning rate decay for the four chosen AE models. The AE1 and the AE4 models with the relatively richer latent spaces achieve the sharpest reduction in training loss, whereas the latent space dimensions for models AE2 and AE3 appear to be significant barriers in their training efficiency.
|Id||Lyrs||Units||Act.||LR steps, rate||Scaling||Aug.||MSE||Training|
|Range||1-4||32-512||tanh, elu,…||5k-25k, 0.1-0.9|
|AE1-NODE1||1||256||elu||10k, 0.3||No||No||2.30e-5||28.80 hrs|
|AE1-NODE2||1||256||tanh||5k, 0.7||Yes||No||1.34e-4||28.69 hrs|
|AE1-NODE3||1||512||elu||5k, 0.5||No||No||1.97e-5||29.17 hrs|
|AE1-NODE4||1||256||tanh||10k, 0.25||Yes||Yes||1.49e-4||28.27 hrs|
|AE1-NODE5||4||64||tanh||5k, 0.5||Yes||No||1.33e-4||33.08 hrs|
An extensive exploration of the NODE hyperparameter and architecture space for the cylinder example was reported in [Dutta2021-AAAI]. Based on those inferences and some further numerical study, five architectures were selected, and the results after training for 50000 epochs using the latent space data generated by the AE1 model are presented in Table 2. All the models generate accurate predictions at a finer temporal resolution than the training data, and have excellent agreement with the high-fidelity solution even while extrapolating outside the training data ( seconds).
The first row of Fig. 5 compares the time trajectory of the spatial root mean square errors (RMSE) in the high-fidelity space for NIROM solutions generated using the POD and the AE1 basis with the NODE3 and NODE5 configurations. It is encouraging to note that even though the latent-space sizes are identical between the POD and the AE1 basis, the AE1-NODE solutions are more accurate, confirming that the nonlinear AE basis generates a more accurate spatial compression than a linear POD basis of similar size. The second row of Fig. 5 compares the RMSE between the AE4-NODE3, POD-NODE3, and two DMD NIROM models with truncation levels of and . The AE4-NODE3 model achieves better accuracy than all the other models even while using the most compressed latent space (:2,3,3). The DMD(8) model is roughly comparable with the POD-NODE3 model owing to the similarity in their latent space dimensions, whereas the DMD(3) model fares the worst due to the lack of hierarchical ordering in its latent basis.
Shallow water equations
The final numerical example involves flows governed by the depth-averaged SWE which is written in a conservative residual formulation as
Here, the state variable consists of the flow depth, , and the discharges in the and directions, given by and , respectively. Further details about the flux vectors , and the high-fidelity model equations are available in [DFPSP2021]. The high-fidelity numerical solutions of the SWE are obtained using the 2D depth-averaged module of the Adaptive Hydraulics (AdH) finite element suite, which is a U.S. Army Corps of Engineers (USACE) high-fidelity, finite element resource for 2D and 3D dynamics [Trahan2018].
Tidal flow in San Diego bay
This numerical example involves the simulation of tide-driven flow in the San Diego Bay in California, USA.
The AdH high-fidelity model consists of nodes, uses tidal data obtained from NOAA/NOS Co-Ops website at a tailwater elevation inflow boundary and has no flow boundary conditions everywhere else. Further details are available in [DFPSP2021].
The training space is generated using high-fidelity snapshots obtained between minutes to hours at a time interval of seconds. The predicted ROM solutions are computed for the same time interval with seconds. A latent space of dimension is generated by using a POD truncation tolerance of for each solution component. The AE2 architecture designed for the cylinder example is modified by including BatchNormalization layers for each hidden layer, using full batches for training, and by enforcing a latent space of dimension for each solution component. The RBF NIROM approximation is computed using a shape factor, . The simulation time points provided as input to the NODE model are normalized to lie in . The ‘dopri5’ ODE solver is adopted for computing the hidden states both forward and backward in time. Learning from the conclusions of the cylinder example, a network consisting of a single hidden layer with neurons is deployed and the RMSProp optimizer with an initial learning rate of , a staircase decay rate of every epochs, and a momentum of is utilized for training the model over epochs. For the DMD NIROM, the simulation time points are normalized to an unit time step, and a truncation level of is used to compute the DMD eigen-spectrum.
Figure 10 compares the POD-NODE and the AE-NODE NIROM solution fields (top row) for at hours and the corresponding error plots (bottom row). It can be seen that even while using a latent dimension that is three times smaller than POD, the relative errors for the AE-NODE solution () is almost two times lower than that of the POD-NODE solution (). Figure 11 shows the spatial RMSE over time for the depth (left) and the x-velocity (right) NIROM solutions. The AE-NODE ( modes) solution has comparable accuracy to the POD-NODE ( modes) and the DMD ( modes) solution for the depth variable and actually outperforms the POD-NODE ( modes) solution for the x-velocity variable. Additionally, unlike the RBF NIROM solution, the AE-NODE solution does not exhibit any appreciable accumulation of error over time due to the higher-order time-stepping scheme adopted for NODE.
We have studied the combined use of autoencoders as a data-driven method for identifying an efficient reduced latent space that approximates the solution manifold of a system of nonlinear, time-dependent PDEs and neural ODEs as a non-intrusive method for modeling the evolution of the reduced latent space coefficients for the aforementioned dynamical system. Numerical experiments were carried out with a benchmark flow problem governed by the incompressible Navier Stokes equations and a real-world application of estuarine flow dynamics governed by the two-dimensional shallow water equations. The AE latent space representation was shown to provide a very high degree of efficient spatial compression, especially for the advection-dominated shallow water example. The POD-NODE NIROM formulation demonstrated a stable and accurate learning trajectory in modeling reduced basis dynamics, even in comparison to classical ROM techniques utilizing DMD, POD and RBF interpolation. The AE-NODE formulation also produced encouraging extrapolatory predictions for the flow around a cylinder example. This presents an exciting prospect for future exploration as even for an isolated system, unperturbed by unseen external forcings, truly extrapolative predictions of reduced order dynamics in flow regimes that do not correspond to the training data is a rare feature for most well-established ROM frameworks.
This study leads to several promising avenues of research. For instance, neural architecture search (NAS) tools can be adopted to perform an exhaustive exploration of the network architecture and model hyperparameter space for a wide range of flow dynamics in order to gain insight of the learning trajectory and to design more generalizable AE models with improved reconstruction accuracy. Moreover, integrating the process of identification of an AE-based latent space with the modeling of system dynamics using NODE may lead to significant performance improvements if the two independent learning problems can be designed to intelligently inform each other. All the relevant data and codes for this study will be made available in a public repository at https://github.com/erdc/aenode_nirom upon publication.
This research was supported in part by an appointment of the first author to the Postgraduate Research Participation Program at the U.S. Army Engineer Research and Development Center, Coastal and Hydraulics Laboratory (ERDC-CHL) administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and ERDC. Permission was granted by the Chief of Engineers to publish this information.