Using recurrent neural networks for nonlinear component computation in advection-dominated reduced-order models

09/18/2019 · Romit Maulik et al., Argonne National Laboratory

Rapid simulations of advection-dominated problems are vital for multiple engineering and geophysical applications. In this paper, we present a long short-term memory neural network to approximate the nonlinear component of the reduced-order model (ROM) of an advection-dominated partial differential equation. This is motivated by the fact that the nonlinear term is the most expensive component of a successful ROM. For our approach, we utilize a Galerkin projection to isolate the linear and the transient components of the dynamical system and then use discrete empirical interpolation to generate training data for supervised learning. We note that the numerical time-advancement and linear-term computation of the system ensures a greater preservation of physics than does a process that is fully modeled. Our results show that the proposed framework recovers transient dynamics accurately without nonlinear term computations in full-order space and represents a cost-effective alternative to solely equation-based ROMs.




1 Introduction

High-fidelity simulations of systems characterized by nonlinear partial differential equations (PDEs) are computationally prohibitive for decision-making tasks in multiple engineering and geophysical applications. To address this issue, researchers have devoted significant effort to the reduced-order modeling (ROM) of such systems with the aim of reducing the degrees of freedom of the forward problem to manageable magnitudes. Therefore, ROMs find extensive application in uncertainty quantification, control, and multifidelity optimization. The interested reader is directed to [1] and references therein for an excellent treatise on ROM for engineering problems. A common ROM development procedure can be described by the following steps: (1) reduced basis identification, (2) system evolution in the reduced basis, and (3) reconstruction in full-order space for assessments.

In this paper, we utilize conventional ideas for reduced basis identification with the use of proper orthogonal decomposition (POD) for finding the optimal global basis (i.e., step 1) and focus on using recurrent neural networks for efficient and accurate system evolution in reduced space (i.e., step 2). Our test case is given by the viscous Burgers equation formulated for a moving shock problem [2]. We note that this one-dimensional PDE possesses a quadratic nonlinearity and is frequently utilized as a prototype for assessing numerical methods before their utilization in higher-dimensional phenomena. The governing PDE and boundary conditions for this problem are given by

∂u/∂t + u ∂u/∂x = ν ∂²u/∂x²,   x ∈ [0, 1],   u(0, t) = u(1, t) = 0,   (1)

with the initial condition obtained by evaluating the analytical moving-shock solution of [2] at t = 0, where ν is the viscosity and N(u) = -u ∂u/∂x and L(u) = ν ∂²u/∂x² correspond respectively to the quadratic nonlinearity and the linear operator in the viscous Burgers equation. In this study, we seek to accelerate ROMs by performing a nonintrusive calculation of the nonlinear term N(u), which leads to lower memory and compute cost requirements compared with traditional numerical techniques.
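The moving-shock problem admits a closed-form solution (see [2]). A minimal sketch for generating reference data, assuming the standard form of that solution with integration constant t0 = exp(1/(8ν)) and an illustrative viscosity:

```python
import numpy as np

def burgers_exact(x, t, nu=0.01):
    """Closed-form moving-shock solution of the viscous Burgers equation.

    The form with t0 = exp(1/(8*nu)) and nu = 0.01 are assumptions
    for illustration, matching the standard test case in [2].
    """
    t0 = np.exp(1.0 / (8.0 * nu))  # constant fixing the initial profile
    return (x / (t + 1.0)) / (
        1.0 + np.sqrt((t + 1.0) / t0) * np.exp(x**2 / (4.0 * nu * (t + 1.0)))
    )

# Sample the solution on a uniform grid at two times; the shock front
# advects to the right while viscosity decays its amplitude.
x = np.linspace(0.0, 1.0, 1024)
u_early = burgers_exact(x, 0.5)
u_late = burgers_exact(x, 2.0)
```

Snapshots of this field (and of its nonlinear term) at successive times form the training data used later.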

2 POD-Galerkin projection and discrete empirical interpolation method

The orthogonal nature of the POD bases may be leveraged for a Galerkin projection onto a linear subspace that hierarchically embeds information. We start by revisiting Equation (1) written in the form of a full-order evolution equation for the fluctuation components u' = u - ū with N degrees of freedom:

du'/dt = N(u') + L(u').   (2)
It can be expressed in the reduced basis as

u' ≈ Φ a,   da/dt = Φ^T N(Φa) + Φ^T L(Φa),   (3)

where a(t) ∈ R^k corresponds to the temporal coefficients at one time instant of the system evolution and Φ ∈ R^{N×k} represents the truncated basis of k POD modes obtained from snapshots of the full-order solution having N degrees of freedom. The orthogonal nature of the reduced basis and the commutative property of the linear term can be leveraged to obtain

da/dt = L_k a + Φ^T N(Φa),   where L_k = Φ^T L Φ,   (4)
which we denote as the POD Galerkin-projection method (POD-GP). We remark that the calculation of Φ^T N(Φa) necessitates a reprojection of the reduced-order solution to full-order space and a subsequent nonlinear term calculation throughout the domain. This process can be extremely expensive for advection-dominated problems such as ours. For the purpose of comparison, as well as for the generation of training data, we use the discrete empirical interpolation method (DEIM) [3], which reduces the number of nonlinear term calculations significantly. Our goal is to completely preclude nonlinear term computation in full-order space and to utilize DEIM as a bridge between POD-GP and our formalism. We denote this approach as POD-ML. The time advancement of this hybrid system is then performed by using a simple first-order Euler integrator (although this may easily be extended to higher-order methods such as Runge-Kutta or Adams-Bashforth techniques).
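The reduced-basis construction and the precomputation of the projected linear operator can be sketched in a few lines of linear algebra. The snapshot data and the diffusion-like operator below are synthetic stand-ins, not the paper's setup:

```python
import numpy as np

# Synthetic snapshot matrix: N grid points x S snapshots of a traveling pulse.
N, S, k = 256, 60, 8
x = np.linspace(0.0, 1.0, N)
snaps = np.array([np.exp(-200.0 * (x - 0.2 - 0.01 * s) ** 2) for s in range(S)]).T
fluct = snaps - snaps.mean(axis=1, keepdims=True)   # fluctuation components

# POD basis via thin SVD; columns of Phi are the leading k modes.
Phi = np.linalg.svd(fluct, full_matrices=False)[0][:, :k]

# Stand-in linear (diffusion-like) operator: a second-difference matrix.
dx = x[1] - x[0]
L = (np.diag(np.ones(N - 1), 1) - 2.0 * np.eye(N) + np.diag(np.ones(N - 1), -1)) / dx**2

# Galerkin projection: the reduced linear operator is precomputed once.
L_k = Phi.T @ L @ Phi

# For any reduced state a, applying L_k directly matches projecting the
# full-order linear term -- the commutative property used above.
a = Phi.T @ fluct[:, 0]
full_path = Phi.T @ (L @ (Phi @ a))
reduced_path = L_k @ a
```

The nonlinear term, in contrast, cannot be precomputed this way: N(Φa) must be evaluated pointwise in full-order space, which is exactly the cost DEIM and the proposed surrogate target.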

The DEIM procedure is outlined in the following. Let U ∈ R^{N×m} be the truncated POD basis matrix for snapshots of the nonlinear term N(u), where m is the number of vectors in the truncated basis. DEIM calculates a unit-vector matrix P ∈ R^{N×m}, specifying m locations (out of the N locations in the full-order space) where N(u) may be computed to construct an approximation for the full nonlinear term. Then we define a matrix M as

M = U (P^T U)^{-1}   (5)

and approximate Equation (4) as

da/dt = L_k a + (Φ^T M) P^T N(Φa).   (6)
Note that the linear operator L_k and the matrix Φ^T M are precomputed, leading to a reduction in the cost of calculating the nonlinear term, which now requires only m samples of N(Φa). Depending on m, this costs significantly less than POD-GP. The exact algorithm utilized to obtain P is a variant of least squares and may be found in [3]. Figure 1 shows a validation of our numerical method, where POD-GP and POD-DEIM are approximately equal in accuracy for our problem.
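The point selection in [3] reduces to a short greedy loop: each new interpolation point is placed where the current basis vector is worst interpolated by the points chosen so far. A sketch using a synthetic nonlinear-snapshot basis (illustrative only):

```python
import numpy as np

def deim_points(U):
    """Greedy DEIM interpolation-point selection, as in [3]."""
    n, m = U.shape
    p = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        # Interpolate column j using the previous columns at the chosen points.
        c = np.linalg.solve(U[np.ix_(p, range(j))], U[p, j])
        r = U[:, j] - U[:, :j] @ c            # interpolation residual
        p.append(int(np.argmax(np.abs(r))))   # next point: largest residual entry
    return np.array(p)

# Synthetic basis for nonlinear-term snapshots (orthonormalized via SVD).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 400)
nl_snaps = np.array([np.tanh(20.0 * (x - 0.3 - 0.01 * s)) for s in range(30)]).T
U = np.linalg.svd(nl_snaps, full_matrices=False)[0][:, :6]

p = deim_points(U)
# DEIM approximation: sample a field at only len(p) points, then lift.
f = U @ rng.standard_normal(6)               # a field lying in span(U)
f_deim = U @ np.linalg.solve(U[p, :], f[p])
```

When the sampled field lies in the span of U, the DEIM reconstruction is exact; in general the error is controlled by how well U captures the nonlinear snapshots.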

Figure 1: Prediction utilizing 12 retained bases during truncation for the viscous Burgers equation; panel (a) shows POD-GP. The identical performance validates the DEIM procedure.

3 Nonlinear surrogate using long short-term memory

We utilize a long short-term memory (LSTM) neural network to further reduce the computational complexity of DEIM by calculating the nonlinear term in Equation (6) nonintrusively. From this point, we denote the DEIM approximation for the nonlinear term as N_D. Our training data is generated by calculating DEIM coefficients from the analytical solution (keeping the basis dimensions fixed) for 300 snapshots of the nonlinear term in time.

We utilized a standard LSTM network [4] with one cell, a number of hidden neurons, and a fully connected output layer to perform many-to-one predictions for the nonlinear term in a recursive fashion; that is, the outputs of the network are the DEIM nonlinear terms, which are reutilized as inputs for the next prediction. We used an input window of 10 timesteps to produce the nonlinear term at the next timestep, a batch size of 32, a learning rate of 0.001, and the ADAM optimizer. We observed that 60 epochs were sufficient for training the network under this setting. The training utilized a random 20% validation split, and the best model according to this validation data was saved. Testing was performed through deployment within the hybrid numerical and ML framework, as explained below. Training and assessment were performed on an Intel Core i7 CPU with a basic build of TensorFlow 1.13.
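The many-to-one setup amounts to sliding a 10-step window over the coefficient time series. A numpy sketch of the data preparation (the network itself is omitted; the coefficient series and its dimension m = 12 are synthetic stand-ins):

```python
import numpy as np

def make_windows(series, window=10):
    """Build many-to-one training pairs: 'window' past states -> next state."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

# Synthetic stand-in for 300 snapshots of m = 12 DEIM coefficients.
T, m = 300, 12
t = np.linspace(0.0, 1.0, T)
series = np.sin(2.0 * np.pi * t[:, None] * np.arange(1, m + 1))

X, y = make_windows(series, window=10)
# X has shape (290, 10, 12): 10-step input sequences of coefficient vectors;
# y has shape (290, 12): the coefficient vector at the next timestep.
```

At deployment time, each network output is appended to the input window (dropping the oldest entry) to produce the next prediction, giving the recursive evolution described above.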

Figure 2(a) outlines the collected DEIM coefficients obtained from the analytical solution, which show oscillations. To predict these coefficients appropriately, we used a Savitzky-Golay low-pass filter to mimic the effect of numerical smoothing in spatial and temporal discretizations. Figure 2(b) shows the learned evolution of these coefficients (as an a priori assessment). The LSTM predictions are then added to the linear component of the governing equation to obtain an accurate state advancement, as shown in Figure 2(c). Preprocessing of the oscillatory DEIM coefficients was crucial to the stability of the hybrid machine learning and numerical method.
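Savitzky-Golay smoothing is a local least-squares polynomial fit evaluated at each window center. A minimal numpy implementation (the window length and polynomial order here are illustrative choices, not the paper's settings):

```python
import numpy as np

def savgol_smooth(y, window=11, polyorder=3):
    """Savitzky-Golay filter: fit a polynomial in each sliding window
    and take its value at the window center."""
    half = window // 2
    ypad = np.pad(y, half, mode="edge")   # extend the ends for boundary windows
    xloc = np.arange(window) - half       # local abscissa centered on each point
    out = np.empty_like(y, dtype=float)
    for i in range(len(y)):
        coeffs = np.polyfit(xloc, ypad[i:i + window], polyorder)
        out[i] = np.polyval(coeffs, 0.0)  # value of the local fit at the center
    return out

# A noisy oscillatory series, loosely mimicking raw DEIM coefficients.
t = np.linspace(0.0, 1.0, 200)
raw = np.sin(4.0 * np.pi * t) + 0.2 * np.random.default_rng(1).standard_normal(200)
smooth = savgol_smooth(raw)
```

Unlike a plain moving average, the polynomial fit preserves smooth low-order trends exactly (interior samples of any polynomial up to the chosen order are reproduced), which is why it suppresses the spurious oscillations without flattening the coefficient dynamics.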

L2-errors in modal evolution for each ROM technique (when compared with the truth) were 0.064 for POD-GP, 0.059 for POD-DEIM, and 0.044 for POD-ML, indicating similar fidelity of the final solution. Assessments of computational cost are precluded here because, for simple systems such as the Burgers equation, the cost of LSTM inference dominates the nonlinear computation. Benefits of the proposed method may be observed for larger systems with much greater degrees of freedom. In terms of complexity for our problem: without any reduction, the per-step cost is N nonlinear evaluations; POD-GP also requires N evaluations, since the nonlinear term is computed in full-order space; POD-DEIM requires m nonlinear evaluations, where m ≪ N; and the proposed method requires no online nonlinear evaluations at all. For larger problems, where k, m, and the LSTM inference cost typically remain of the same order of magnitude while N grows, our method will yield exceptional gains.

(a) DEIM coefficient preprocessing for stability
(b) DEIM coefficient prediction by LSTM
(c) POD-ML performance using LSTM for nonlinear term
Figure 2: Preprocessing of DEIM coefficients (top), a priori LSTM predictions for DEIM coefficients (middle), and a posteriori predictions for the modal coefficients where LSTM outputs are integrated with POD-based numerical method (bottom).

4 Conclusions

We have developed a method that precludes the reprojection of the ROM into full-order space for nonlinear term computation and, instead, utilizes prior knowledge of its evolution through the learning of precalculated DEIM coefficients. However, smoothing is required for the training data to mimic the stability-preserving nature of numerical methods. This requirement may also be true when the data comes from noisy physical or numerical experiments. The results here suggest that ROMs can be improved significantly for quick evaluation of extremely expensive physical systems with excellent retention of accuracy through the adoption of sequence learning techniques.


This material is based upon work supported by the U.S. Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357. This research was funded in part and used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. DOE or the United States Government.


References

[1] Taira, K., Brunton, S. L., Dawson, S. T., Rowley, C. W., Colonius, T., McKeon, B. J., & Ukeiley, L. S. (2017). Modal analysis of fluid flows: An overview. AIAA Journal, 55(12), 4013–4041.

[2] Maulik, R., Mohan, A., Lusch, B., Madireddy, S., Balaprakash, P., & Livescu, D. (2019). Time-series learning of latent-space dynamics for reduced-order model closure. arXiv preprint arXiv:1906.07815.

[3] Chaturantabut, S., & Sorensen, D. C. (2010). Nonlinear model reduction via discrete empirical interpolation. SIAM Journal on Scientific Computing, 32(5), 2737–2764.

[4] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735–1780.