Bifidelity data-assisted neural networks in nonintrusive reduced-order modeling

02/01/2019
by Chuan Lu, et al.
The University of Iowa

In this paper, we present a new nonintrusive reduced basis method for settings where a cheap low-fidelity model and an expensive high-fidelity model are available. The method relies on proper orthogonal decomposition (POD) to generate the high-fidelity reduced basis and a shallow multilayer perceptron to learn the high-fidelity reduced coefficients. In contrast to other methods, one distinct feature of the proposed method is that it incorporates features extracted from the low-fidelity data as input features. This not only improves the predictive capability of the neural network but also decouples the high-fidelity simulation from the online stage, so that the online cost depends only on the low-fidelity simulation cost, which is typically small. Due to its nonintrusive nature, the method is applicable to general parameterized problems. We also provide several numerical examples to illustrate the effectiveness and performance of the proposed method.

1 Introduction

Parameterized partial differential equations (PDEs) arise in many complex scientific and engineering applications. A common task in such applications requires solving the underlying PDE efficiently and accurately for a large number of parameter points in the parameter space, which poses a huge computational challenge, particularly for large-scale problems. To address this challenge, reduced-order modeling or model reduction techniques [28, 14] have been proven to be successful for many practical problems with low intrinsic dimension, including electromagnetic scattering [7], multiscale simulations [16], and uncertainty quantification [6], to name a few.

As one of the major model reduction techniques, classical projection-based model reduction algorithms, such as the proper orthogonal decomposition (POD) based method and the reduced basis method (RBM) [31], generally follow an offline-online paradigm [23]. At the offline stage, a set of reduced basis functions is built from a collection of full-order simulation results. During the online stage, for a new parameter value, the reduced model is constructed as a linear combination of the pre-computed reduced basis, where the expansion coefficients are computed by projecting the full-order equation onto the reduced approximation space [3]. Despite its success in many applications, this coupling between the full-order model and the online stage requires major rewrites of the original, often sophisticated, legacy solver of the full-order model. Additionally, it causes computational inefficiency for nonlinear problems, where the online computational complexity of the nonlinear term remains high due to its dependence on the degrees of freedom (DOFs) of the full-order solution, instead of the dimension of the reduced approximation space.

To tackle these challenges, non-intrusive methods such as [38, 37] have been developed to construct a surrogate of the high-fidelity reduced coefficients so that the reduced coefficients can be recovered without requiring a projection of the full-order model. More specifically, the full-order simulations are only required for basis generation during the offline stage. During the online stage, the reduced coefficients are recovered by interpolation over the parameter space. Even though this approach enables decoupling between the full-order model and the online stage, the interpolation approaches are prone to fail in complex applications, where the reduced coefficient has a highly nonlinear dependency on the parameters [2].

In a recent work [15], a novel non-intrusive reduced basis method combining POD and neural networks was proposed, referred to as POD-NN. In this method, a neural network, specifically a shallow multi-layer perceptron (MLP), is used to approximate the POD coefficients of high-fidelity data. Instead of using a nonadapted interpolation basis set, neural networks provide a data-driven approach to approximating the mapping from the physical model parameter space to the high-fidelity reduced coefficient space. For nonlinear parameterized problems, POD-NN shows superior performance compared to traditional POD-Galerkin based methods. Nevertheless, it is important to note that the input features of POD-NN are the physical model parameters, which are not data-dependent and might not be strongly informative. While it would be desirable to utilize input features from the high-fidelity data set to train the network at the offline stage, this is problematic because the online stage would then require a full-order (high-fidelity) simulation to generate the high-fidelity feature. Therefore, utilizing high-fidelity features is not practical for many scientific and engineering applications with expensive high-fidelity solvers.

It is worth noting that the aforementioned methods are based on a single-fidelity solver of the full-order model, which is usually computationally expensive. In many practical problems, there often exist models with different fidelities [1, 35, 9]. Although the low-fidelity models are inaccurate, they can still mimic important behaviors of the underlying problem at a much lower computational cost. Many recent works in the computational science and engineering community have suggested that it is advantageous to combine low-fidelity and high-fidelity models to improve computational accuracy or efficiency in different contexts, such as uncertainty quantification [24, 42, 41, 26, 13, 27, 39] and optimization [30, 29, 22]. Much of the multifidelity work in machine learning has focused on the optimization setting [8, 20, 34, 18] as well as ensemble learning [11, 40]. In general, the main differences among these multifidelity algorithms lie in the problem setting and in the way the multifidelity models or data are integrated.

Motivated by the recent developments in multifidelity modeling and reduced-order modeling [15], we propose a new nonintrusive reduced basis method for settings where a cheap low-fidelity model and an expensive high-fidelity model are available. In this work, we train a two-hidden-layer perceptron to approximate the high-fidelity reduced coefficients, referred to as the Bi-Fidelity data-assisted Neural Network (BiFi-NN). More specifically, besides using the original physical model parameter itself as an input feature, as suggested in [15], we also augment the input with additional features extracted from the cheap low-fidelity data. These new features are not only data-dependent, but also encode more prior information into the neural network. Additionally, this decouples the high-fidelity simulations from the online stage, providing a competitive alternative to existing nonintrusive reduced basis methods that produces accurate reduced solutions with an affordable online computational cost.

The paper is organized as follows. Section 2 introduces the setup of the problem. Section 3 briefly reviews the POD-NN method from [15], then introduces our proposed method, BiFi-NN, and discusses the error contributions of the proposed method. In Section 4, we illustrate the effectiveness of the proposed BiFi-NN algorithm via several numerical examples. We conclude the paper in Section 5.

2 Problem Setup

For simplicity of presentation, we consider the following parameterized PDEs:

(1)

where is the physical domain and the parameter represents either the physical parameters or the uncertain parameters in the model.

Assume that we have two different models for (1) available: a high-fidelity solution and a low-fidelity solution , where is the discrete approximation space for the high-fidelity solution and is the approximation space for the low-fidelity solution. Typically, the dimension of is large in order to resolve the details of the system with high accuracy, while is parameterized with fewer degrees of freedom than the high-fidelity space, i.e., . Consequently, the high-fidelity solutions are more accurate but more expensive to simulate, while the low-fidelity solutions are cheap to simulate. Even though the accuracy of the low-fidelity solution is typically lower than that of the high-fidelity solution, it is still capable of capturing the important features of the underlying problem.

In this paper, our goal is to efficiently construct a nonintrusive reduced model that is able to produce an accurate approximation of high-fidelity solution with affordable online computational cost, particularly for nonlinear problems. To achieve this, we shall take advantage of the predictive capability of the neural network and the data from the low-fidelity and high-fidelity models.

3 Method

To set the stage for the later discussion, we first briefly review the POD-NN algorithm proposed in [15], then discuss the proposed bi-fidelity data-assisted neural network approximation (the BiFi-NN method) and a modification of the POD-NN method used as a reference solution. Finally, we discuss the error contributions.

3.1 POD-NN

The POD-NN method proposed in [15] consists of the following steps:

  1. Select a subset of parameters . For each sample , perform the corresponding high-fidelity full-order simulation to generate the snapshot matrix

    then construct the high-fidelity POD reduced basis set .

  2. Select a subset of parameters , independent of . For each sample , perform a high-fidelity full-order simulation to get the high-fidelity solution . Then compute the first POD coefficients on the reduced approximation space.

  3. Construct and train a multi-layer perceptron with the training set collected in the previous step, where the input is the parameter and the output is the high-fidelity POD coefficient vector .

  4. At the online stage, for a new parameter value , evaluate the trained network to predict the corresponding reduced coefficients and then compute the reduced solution.

In the following, we shall discuss each step of the POD-NN algorithm in detail.

3.1.1 The proper orthogonal decomposition

In this section, we shall briefly introduce the proper orthogonal decomposition (POD) [4], which is the building block of the POD-NN method. As one of the most widely used methods to generate reduced bases, the general idea of POD is to seek a set of parameter-independent basis functions for the low-dimensional representation of the full-order solution space . Assume that a collection of high-fidelity solutions at a selected set of parameter points is available, and that this set is large and rich enough to represent the parameter space. We concatenate the high-fidelity solutions into the following snapshot matrix:

(2)

Each column is a high-fidelity solution at the parameter , of length , reflecting the degrees of freedom of the high-fidelity solution. With this snapshot matrix, a singular value decomposition is then performed to reveal the reduced space:

(3)

where and are orthogonal matrices. The diagonal matrix satisfies . is the rank of the snapshot matrix. The first columns of are chosen to form the basis of the reduced space , i.e.,

(4)

In the context of reduced-order modeling, we assume . Once the reduced basis set is available, the reduced representation of the high-fidelity snapshot can be written in the following form:

(5)

where the high-fidelity POD coefficient vector can be computed by projecting the full-order snapshot onto the reduced approximation space :

(6)

where the basis matrix .
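As a concrete illustration of this offline POD construction in (2)-(6), the following is a minimal NumPy sketch; the function names, the snapshot layout (one column per parameter sample), and the truncation argument L are our own illustrative choices, not notation from the paper.

```python
import numpy as np

def pod_basis(snapshots, L):
    """Compute the first L POD basis vectors from a snapshot matrix.

    snapshots : (N_h, M) array, each column a high-fidelity solution.
    L         : number of retained POD modes (L << N_h).
    """
    # Thin SVD of the snapshot matrix S = U * Sigma * V^T
    U, sigma, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :L]                      # columns span the reduced space

def pod_coefficients(V, u_h):
    """Project a high-fidelity solution onto the (orthonormal) POD basis."""
    return V.T @ u_h                     # length-L coefficient vector

def pod_reconstruction(V, c):
    """Reduced representation of the high-fidelity snapshot."""
    return V @ c
```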

We shall emphasize that traditional projection-based reduced basis methods need to project the full-order equation into the reduced approximation space to recover the high-fidelity POD coefficient vector for a new given . We refer interested readers to [28, 14] for more details of the traditional projection-based reduced basis methods. In contrast, the basic idea of POD-NN is to construct a surrogate of high-fidelity POD coefficients by a neural network so that the projection of the high-fidelity model is not needed during the online stage.

3.1.2 Neural Network

The second key ingredient of POD-NN is the neural network approximation. A neural network is a universal function approximator with the capability to learn from observed data, thus providing an alternative to traditional function approximation methods [33]. One widely used class of neural networks is the feedforward network, also called the multi-layer perceptron (MLP) [17]. It consists of a collection of layers, including an input layer, an output layer, and a number of hidden layers. Each hidden layer contains a certain number of neurons, called hidden units, and a nonlinear activation function. For traditional MLPs, the connection between two consecutive layers is an affine map defined by a set of weights and biases.

Figure 1 illustrates the structure of a two-hidden-layer MLP. The circles represent the neurons in the input, hidden, and output layers, and information flows from left to right.

Figure 1: The structure of a two-hidden-layer neural network with three input nodes and two output nodes.

As a crucial step in constructing the neural network, the training generally involves gradient-based optimization. The gradient information is usually computed through backpropagation. Widely used gradient-based training algorithms for large networks include stochastic gradient descent, RMSprop, and Adam [32].
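To make the MLP structure concrete, below is a minimal sketch of the forward pass of a two-hidden-layer perceptron with tanh activations, written in plain NumPy; the parameter layout (a list of weight/bias pairs) is an assumption made for illustration.

```python
import numpy as np

def mlp_forward(x, params):
    """Forward pass of a two-hidden-layer MLP with tanh activations.

    x      : input feature vector.
    params : list of (W, b) pairs, one per layer; the last pair is the linear output layer.
    """
    h = x
    for W, b in params[:-1]:
        h = np.tanh(W @ h + b)     # hidden layer: affine map followed by tanh
    W_out, b_out = params[-1]
    return W_out @ h + b_out       # linear output layer
```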

3.1.3 POD-NN Algorithm

With the two building blocks discussed in the previous sections, we now describe the basics of POD-NN. POD-NN in [15] employs a shallow neural network to build a surrogate of the high-fidelity POD coefficients. The input of the network is the parameter and the output is the high-fidelity POD coefficient vector . It consists of two dense hidden layers with the same number of hidden units, whose structure is illustrated in Figure 2.

The resulting POD-NN approximation of the high-fidelity POD coefficient vector can be represented as follows:

(7)

where is the trained neural network approximation for the coefficient and is the parameter of the neural network.

Once the neural network is trained, during the online stage we perform a forward pass to predict the reduced coefficients for a new given . The corresponding reduced solution is given by

(8)

where is the high-fidelity POD basis set. We emphasize that the key advantage of this approach is that the high-fidelity model is completely decoupled from the online stage.

This completes the description of the original POD-NN algorithm. The detailed steps of the corresponding offline and online algorithms are summarized in Algorithm 1 and Algorithm 2. We refer interested readers to [15] for more details of POD-NN.

Figure 2: The network structure of the POD-NN method. The input is the parameter vector and the output is the high-fidelity POD coefficient vector .
Sample a collection of parameters . Run the high-fidelity model for each . Compute the POD coefficient vector for each high-fidelity snapshot by projection:
Train a network with the input and the output .
Algorithm 1 Offline Stage for POD-NN
Evaluate the trained network at the given parameter to predict the high-fidelity POD coefficient. Compute the POD-NN approximation of the high-fidelity solution for the given :
Algorithm 2 Online Stage for POD-NN
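For illustration, a minimal sketch of Algorithms 1 and 2 is given below, using scikit-learn's MLPRegressor. Note that the original POD-NN is trained with the Levenberg-Marquardt algorithm, which scikit-learn does not provide; the L-BFGS solver is substituted here, and all function names and array layouts are our own assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Offline (Algorithm 1): snapshots has one high-fidelity solution per column,
# mu_train has one parameter sample per row, V is the POD basis from pod_basis above.
def train_pod_nn(mu_train, snapshots, V, n_hidden=16):
    C = (V.T @ snapshots).T                         # rows: POD coefficient vectors
    net = MLPRegressor(hidden_layer_sizes=(n_hidden, n_hidden),
                       activation='tanh', solver='lbfgs', max_iter=5000)
    net.fit(mu_train, C)                            # learn the map from parameter to coefficients
    return net

# Online (Algorithm 2): no high-fidelity solve is needed for a new parameter value.
def pod_nn_predict(net, V, mu_new):
    c = net.predict(np.atleast_2d(mu_new)).ravel()  # predicted POD coefficients
    return V @ c                                    # reduced solution
```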

3.2 Bi-Fidelity Data-Assisted Neural Network

As we discussed in the previous section, POD-NN focuses on learning the map from the parameter space to the high-fidelity POD coefficient space . Nevertheless, from the viewpoint of conventional wisdom in the machine learning community, the parameter as the input feature is not data-dependent, and it might neither be strongly informative nor encode much prior knowledge.

Alternatively, since low-fidelity models are cheap to evaluate and can capture important information about the underlying physical system, it is desirable to extract useful features from the low-fidelity data to improve the predictive performance of the neural network approximation. In this paper, we propose a novel bi-fidelity data-assisted neural network approximation (BiFi-NN) by modifying the original POD-NN algorithm: during the offline stage, we employ the POD coefficients of the low-fidelity model as augmented, data-dependent features and train the neural network to predict the high-fidelity POD coefficients . In contrast to the original POD-NN, the learned mapping is now from the combined feature space (the original parameter space and the low-fidelity coefficient space) to the high-fidelity POD coefficient space . During the online stage, only one cheap low-fidelity simulation is required to generate the low-fidelity POD coefficients, from which the pre-trained neural network predicts the high-fidelity POD coefficients. Such an approach not only allows us to incorporate more relevant features to improve the predictive performance, but also removes the dependence on the high-fidelity solver from the online stage.

In the following, we shall present the details of the offline and online stage of the proposed BiFi-NN algorithm.

3.2.1 Offline Stage

There are two major steps in the offline stage: (1) preparing the data and (2) training the neural network. We shall detail each step in this section.

Data Preparation. We first sample a collection of parameters and run the corresponding low-fidelity and high-fidelity simulations. Based on the acquired simulation data, we compute the corresponding POD basis sets and . To prepare the training input-output pairs for the next step, we run low-fidelity and high-fidelity simulations over a sample set , independent of . We then compute the corresponding low-fidelity and high-fidelity POD coefficients: , , .

Architecture and Training of BiFi-NN. Once the data is available, we shall construct a surrogate for the high-fidelity POD coefficients. More specifically, we shall train neural networks to predict the components of the high-fidelity coefficient separately. The structure of the network consisting of two hidden layers is illustrated in Figure 3, where the numbers of hidden units in both layers are equal, and the training input is given by concatenating the parameter and the low-fidelity POD coefficient vector together, i.e.,

(9)

and the output is the component of the high-fidelity POD coefficient . We remark that the structure of BiFi-NN is similar to that of POD-NN. However, there are two major differences: (1) we incorporate additional features from the low-fidelity data in order to improve the predictive power of the neural network; (2) instead of training a single neural network to predict all components of the high-fidelity coefficient (referred to as the joint approach), we train neural networks to predict each component of the high-fidelity coefficient separately, whose predictive accuracy should be comparable to that of the joint approach. Nevertheless, BiFi-NN is expected to be relatively easy to train and more memory-efficient due to its compact configuration (a single output node for each network).

Figure 3: The network structure of the net of the proposed BiFi-NN method. The input is the concatenation of the parameter vector and the low-fidelity POD coefficient , and the output is the component of high-fidelity POD coefficient .

Consequently, the BiFi-NN approximation of the component of the high-fidelity POD coefficient can be represented as follows:

(10)

where is the trained neural network approximation for the component of the high-fidelity coefficient and is the parameter of the network.

We remark that the high-fidelity models are expensive to simulate, hence the amount of high-fidelity training data is limited. To avoid over-fitting, we limit ourselves to a two-hidden-layer structure, which can theoretically approximate any continuous function [25, 10]. In addition, since the neural network is shallow, there is less concern about vanishing gradients. Therefore, we choose tanh as the activation function to make use of its nonlinearity. To train the neural network, we minimize the mean squared loss function with the Levenberg-Marquardt (LM) algorithm, which is suitable for training a shallow network with a small number of connections, as suggested in [12, 15].

Algorithm 3 outlines the detailed steps for the offline stage.

Sample a collection of parameters . Run the low-fidelity model and the high-fidelity model for each . Compute the POD coefficients for both fidelities by projection:
Concatenate the parameter with the corresponding low-fidelity coefficient to form the input for the proposed neural network:
For , train the network with as the output independently.
Algorithm 3 Offline Stage for BiFi-NN
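A minimal sketch of Algorithm 3 is given below, under the same assumptions as the earlier sketches (scikit-learn's MLPRegressor with L-BFGS in place of Levenberg-Marquardt; array layouts and names are ours): the input features are the parameters concatenated with the low-fidelity POD coefficients, and one small network is trained per high-fidelity coefficient component.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_bifi_nn(mu_train, snaps_lo, snaps_hi, V_lo, V_hi, n_hidden=16):
    """Offline stage sketch: one network per high-fidelity POD coefficient component."""
    C_lo = (V_lo.T @ snaps_lo).T                # low-fidelity POD coefficients (features)
    C_hi = (V_hi.T @ snaps_hi).T                # high-fidelity POD coefficients (targets)
    X = np.hstack([mu_train, C_lo])             # combined input features: parameters and low-fidelity coefficients
    nets = []
    for j in range(C_hi.shape[1]):              # train each coefficient component separately
        net = MLPRegressor(hidden_layer_sizes=(n_hidden, n_hidden),
                           activation='tanh', solver='lbfgs', max_iter=5000)
        net.fit(X, C_hi[:, j])
        nets.append(net)
    return nets
```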

3.2.2 Online stage

Once the neural network is trained, we can predict the high-fidelity POD projection coefficients very efficiently. Given a new parameter , one needs to

  • Run the low-fidelity model and compute its low-fidelity POD coefficients via a projection onto the low-fidelity reduced approximation space, i.e.,

    (11)

    and concatenate the learned low-fidelity coefficients with :

    (12)
  • For , evaluate the pre-trained network to approximate the component of the corresponding high-fidelity POD coefficients , and concatenate the results to obtain the POD coefficient vector, i.e.,

    (13)
  • The resulting BiFi-NN approximation of the high-fidelity solution for the given parameter is given by

    (14)

We emphasize that for a new given parameter , our method only requires one additional low-fidelity run to extract the additional input features and predict the high-fidelity reduced coefficients during the online stage. Therefore, the online cost mainly depends on the cost of the low-fidelity solver. Since we assume the low-fidelity model is cheap to compute, this cost should be affordable.

Algorithm 4 outlines the details for the online stage.

Run the low-fidelity model for the given . Compute the low-fidelity POD coefficients and concatenate the learned low-fidelity coefficients with :
For , evaluate the trained neural network at the combined feature to obtain the BiFi-NN approximation of the component of the high-fidelity POD coefficient:
and concatenate the results to obtain the coefficient vector . Compute the BiFi-NN approximation of the high-fidelity solution for the given :
Algorithm 4 Online Algorithm of BiFi-NN
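The corresponding online stage (Algorithm 4) can then be sketched as follows, again with illustrative names; the only PDE solve is the cheap low-fidelity one that produces u_lo_new.

```python
import numpy as np

def bifi_nn_predict(nets, V_hi, V_lo, mu_new, u_lo_new):
    """Online stage sketch: one low-fidelity solve, then one evaluation per trained net."""
    c_lo = V_lo.T @ u_lo_new                                 # low-fidelity POD coefficients
    x = np.hstack([np.ravel(mu_new), c_lo])[None, :]         # combined feature: parameters and low-fidelity coefficients
    c_hi = np.array([net.predict(x)[0] for net in nets])     # predicted high-fidelity POD coefficients
    return V_hi @ c_hi                                       # BiFi-NN reduced solution
```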

3.3 A modified POD-NN Algorithm

From the discussion in the previous section, BiFi-NN trains neural networks to predict the high-fidelity POD coefficient , where the output of the net is the component of the high-fidelity coefficient . In order to demonstrate the effectiveness of the additional input features of BiFi-NN, we propose to slightly modify the POD-NN structure so that it has the same output layer as BiFi-NN, while the rest of the architecture is the same as that of the original POD-NN discussed in Section 3.1. The structure of the net is illustrated in Figure 4. All nets have the same structure for the sake of simplicity. We refer to this method as the modified POD-NN (MPOD-NN).

Figure 4: Network structure of the net for modified POD-NN method. The input is the parameter vector , and the output is the component of the high-fidelity POD coefficient .

Consequently, the modified POD-NN approximation of the component of the high-fidelity POD coefficient vector can be represented as follows:

(15)

where is the trained neural network approximation for the coefficient and is the parameter of the network.

During the online stage, for each given parameter , we obtain the component of the high-fidelity POD coefficient by a forward pass through the pre-trained net. All the predicted results are concatenated into the POD coefficient vector . The corresponding reduced approximation of the high-fidelity solution is then given by

(16)

For the same amount of training samples, the predictive accuracy of the modified POD-NN is expected to be comparable with that of the original POD-NN. In contrast to the output nodes in the original POD-NN, each net in the modified POD-NN only has a single output node. Therefore, each net in the modified POD-NN is relatively easy to train with the Levenberg-Marquardt algorithm due to the compactness of the network.

3.4 Discussion of Error Contributions

Since error estimation for neural network approximations is still under development in the literature, a detailed error analysis is difficult to perform. Here, we briefly discuss the error contributions of the approximation of the high-fidelity solution. To estimate the error, it is sufficient to consider the approximation error between the high-fidelity solution and the reduced solution at each parameter point:

(17)

where is the Euclidean norm in . To obtain the last inequality, we used the fact that the POD basis is orthonormal. The errors , , and are defined as follows:

  • represents the approximation error of the reduced solution , which is measured by the relative error between the reduced solution and the high-fidelity solution:

    (18)
  • represents the projection error of the high-fidelity solution , which is measured by the relative error between the high-fidelity solution and its projection onto the high-fidelity reduced approximation space :

    (19)

    Assuming a reasonably good reduced approximation space exists, the projection error can be reduced by increasing the dimension of the reduced approximation space.

  • represents the coefficient error due to the neural network approximation, which is measured by the relative error between the learned POD coefficient vector and the high-fidelity coefficient vector :

    (20)

    We remark that the error depends on many factors, such as the optimization algorithm used for training, the choice of loss function, the choice of input features, and the number of available training points. In this work, we mainly explore the role of the features extracted from the low-fidelity data in the predictive accuracy of the proposed neural network.

On the other hand, the projection error is the smallest distance between and the reduced space , i.e.,

(21)

hence it provides a lower bound for the approximation error . Consequently, we have

(22)

The inequality (22) gives both upper and lower bounds for the approximation error of the reduced solution in terms of the projection error and the coefficient error. This also reveals the dependence of the approximation error on two major error contributions: the coefficient error committed by the neural network approximation and the projection error of the high-fidelity solution on the reduced space. Even though this is a rough analysis of the error contributions, it still provides a general guideline for analyzing the error behavior of the proposed method in the next section.
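For readability, the bound in (22) can be written out as follows in our own notation (not the paper's), where u_h is the high-fidelity solution, V the orthonormal POD basis, c = V^T u_h the exact POD coefficients, and c_NN the network prediction; the relative normalizations follow our reading of the definitions above.

```latex
\underbrace{\frac{\|u_h(\mu) - V V^{\top} u_h(\mu)\|_2}{\|u_h(\mu)\|_2}}_{\text{projection error}}
\;\le\;
\underbrace{\frac{\|u_h(\mu) - V\, c_{\mathrm{NN}}(\mu)\|_2}{\|u_h(\mu)\|_2}}_{\text{approximation error}}
\;\le\;
\frac{\|u_h(\mu) - V V^{\top} u_h(\mu)\|_2}{\|u_h(\mu)\|_2}
\;+\;
\underbrace{\frac{\|c(\mu) - c_{\mathrm{NN}}(\mu)\|_2}{\|c(\mu)\|_2}}_{\text{coefficient error}}
```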

Remark 3.1

By the linearity of the expectation operator and (22), we can also get the error bound of the mean relative approximation error as follows:

(23)

4 Numerical Examples

In this section, we present several numerical examples to illustrate the effectiveness and performance of the proposed method. To measure the accuracy of the approximation, we shall compute the following three types of errors over an independent test set of size :

  1. The mean approximation error of the reduced solution , measured by the relative error with respect to the high-fidelity solution :

    (24)

    where is the Euclidean norm in .

  2. The coefficient learning error for the first POD coefficients, measured by the mean relative error

    (25)

    where is the coefficient vector of the first POD coefficients.

  3. The mean relative POD projection error for high-fidelity solution,

    (26)

By a procedure similar to that in Section 3.4, we can derive a similar result:

(27)

The inequality (27) is a discrete version of (23).
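As a convenience, the three test-set error metrics above can be computed as in the following sketch; the array layout (one test sample per column) and the sample-wise averaging of relative errors are our assumptions.

```python
import numpy as np

def mean_relative_errors(U_hi, U_rb, V_hi, C_pred):
    """Mean relative approximation, coefficient, and projection errors over a test set.

    U_hi   : (N_h, N_test) high-fidelity test solutions, one per column.
    U_rb   : (N_h, N_test) reduced (surrogate) solutions.
    V_hi   : (N_h, L) high-fidelity POD basis.
    C_pred : (L, N_test) predicted POD coefficients.
    """
    C_true = V_hi.T @ U_hi                     # exact POD coefficients
    U_proj = V_hi @ C_true                     # POD projections of the test solutions
    colnorm = lambda A: np.linalg.norm(A, axis=0)
    e_approx = np.mean(colnorm(U_hi - U_rb) / colnorm(U_hi))
    e_coef = np.mean(colnorm(C_true - C_pred) / colnorm(C_true))
    e_proj = np.mean(colnorm(U_hi - U_proj) / colnorm(U_hi))
    return e_approx, e_coef, e_proj
```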

For all examples in the rest of this section, we employ a two-hidden-layer neural network for both the modified POD-NN and BiFi-NN. We remark that an additional dataset, whose size is 25% of the training set, is used for validation. To find the best network configuration, i.e., the number of hidden units , we choose the best results for the modified POD-NN and BiFi-NN over the number of hidden units (varying from 1 to 24). For simplicity, both hidden layers have the same number of hidden units . The optimization is carried out by the Levenberg-Marquardt (LM) algorithm, as mentioned before. A multiple-restarts approach is employed to prevent the results from depending on the way the weights are (randomly) initialized. In other words, for each configuration, we train 10 nets with random initial conditions and select the network with the smallest validation error.
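The configuration search described above can be sketched as follows (same caveats as before: scikit-learn's L-BFGS solver stands in for Levenberg-Marquardt, and the function and variable names are ours): sweep the number of hidden units, restart each configuration several times with random initializations, and keep the network with the smallest validation error.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def select_network(X_train, y_train, X_val, y_val, max_units=24, n_restarts=10):
    """Pick the network with the smallest validation error over hidden-unit counts and restarts."""
    best_err, best_net = np.inf, None
    for n_hidden in range(1, max_units + 1):
        for restart in range(n_restarts):
            net = MLPRegressor(hidden_layer_sizes=(n_hidden, n_hidden),
                               activation='tanh', solver='lbfgs',
                               max_iter=5000, random_state=restart)
            net.fit(X_train, y_train)
            err = np.linalg.norm(net.predict(X_val) - y_val)   # validation error
            if err < best_err:
                best_err, best_net = err, net
    return best_net
```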

Without loss of generality, we employ solutions computed on coarse and fine meshes as the low-fidelity and high-fidelity models in our numerical examples, since such models are available for most applications. Nevertheless, the method itself places no restriction on the two models employed, as long as they describe the same physical system.

4.1 1D stochastic elliptic equation

We first consider a 1D elliptic equation with random diffusivity coefficient, a standard benchmark problem in the context of uncertainty quantification as follows:

(28)

with the random diffusivity coefficient given as follows:

(29)

The parameter , where each coordinate of is a uniformly distributed random variable in . We fix the dimension and therefore it is a 10-dimensional problem in the parameter space.

We solve (28) by the Chebyshev collocation method (in physical space). We employ Chebyshev collocation points for the high-fidelity model and collocation points for the low-fidelity model. In all cases, the models are evaluated on a 100-point uniform mesh in the physical space. The error metrics are evaluated on a test set of 100 Monte Carlo points, and the reduced basis sets are computed from 100 snapshot solutions independent of both the training and test sets. In the following tests, we train both BiFi-NN and the modified POD-NN on the same training sets of different sizes and present the results of each method with the corresponding optimal number of hidden units () over the same test set.

Figure 5: Left: Convergence results of approximation error by modified POD-NN and BiFi-NN (based on the low-fidelity model (28) with ) with the training sets of different sizes (). Right: Coefficient error with by modified POD-NN and BiFi-NN with respect to the size of training set compared to the projection error. The solid lines are results of BiFi-NN, and the dashed lines represent modified POD-NN.

Figure 5 (left) shows the approximation errors of the modified POD-NN and BiFi-NN as well as the projection error with respect to the number of high-fidelity POD basis functions. The errors of both the modified POD-NN and BiFi-NN continue to decrease as the number of POD basis functions increases and saturate later on. The saturation occurs because the coefficient error dominates the projection error once the dimension of the reduced approximation space is large enough. When more training data is available, the saturation level can be further reduced by increasing the size of the network, as shown in Figure 5 (left). Moreover, BiFi-NN saturates at a much lower level than the modified POD-NN for a fixed training set. This indicates that the additional input features incorporated in BiFi-NN effectively improve the predictive capability of the neural network.

To better understand the properties of BiFi-NN, we plot in Figure 5 (right) the coefficient errors of the modified POD-NN and BiFi-NN for a fixed reduced dimension with respect to the size of the training set. It is clear that the coefficient error dominates when the training set is small. As we increase the size of the training set, both the modified POD-NN and BiFi-NN produce more accurate results. It is evident that BiFi-NN achieves a smaller coefficient error than the modified POD-NN, owing to the effectiveness of the low-fidelity features.

We remark that this is a linear problem; therefore, traditional projection-based reduced basis methods can also produce accurate results with a reasonable number of high-fidelity reduced basis functions. We present this example only for benchmarking purposes, to examine the accuracy of the proposed method.

4.2 2D Nonlinear Elliptic equation

We next consider the following parameterized 2D nonlinear elliptic equation [5] to illustrate the performance of the proposed method on nonlinear PDEs:

(30)

where

(31)

with a homogeneous Dirichlet boundary condition. The spatial domain is . The parameters are .

We solve this problem using the finite element method. For the low-fidelity solutions, we use 135 elements, while for the high-fidelity model, we use 2960 elements. The training set is sampled by Latin hypercube sampling (LHS) [19], the test set is generated on 256 uniform grid points in the parameter space, and the reduced basis set is generated from 225 uniform grid points in the parameter space.

The approximation errors of the modified POD-NN and BiFi-NN are plotted in Figure 6 (left). We first observe that both the modified POD-NN and BiFi-NN enjoy fast decay with respect to the number of high-fidelity POD basis functions and resemble the projection error. As the number of high-fidelity basis functions increases, the modified POD-NN saturates quickly, while BiFi-NN continues to decrease and saturates at a lower level, indicating the effectiveness of the low-fidelity features. We also investigate the effect of the size of the training set and report the results in Figure 6 (left). It is clear that the predictive accuracy of both the modified POD-NN and BiFi-NN can be further improved when more training data is available. Overall, BiFi-NN produces a lower coefficient error.

We next fix the reduced dimension , which results in a small projection error, as shown in Figure 6 (right). We expect to see the dominance of the coefficient errors due to the neural network approximation. This is clearly illustrated in Figure 6 (right), which plots the coefficient errors with respect to the size of the training set. The coefficient errors are much larger than the projection error in Figure 6 (right), confirming that in this case the largest error contribution stems from the coefficient error. In addition, BiFi-NN has a convergence rate similar to that of the modified POD-NN, but its coefficient error is roughly one order of magnitude smaller. This demonstrates that the additional low-fidelity features do help improve the predictive capability of the network approximation.

Figure 6: Left: Convergence results of approximation error by modified POD-NN and BiFi-NN for problem (30) with training sets of different sizes (). Right: Coefficient error with by modified POD-NN and BiFi-NN with respect to the size of training set compared to the projection error. The solid lines are results of BiFi-NN, and the dashed lines represent modified POD-NN.
Figure 7: Numerical convergence of approximation error by original POD-NN and modified POD-NN applied to problem (30) with a training set of size . The solid lines are the approximation errors of original POD-NN, and the dashed lines represent modified POD-NN.

We also compare the numerical convergence of the original POD-NN and the modified POD-NN based on training points, as shown in Figure 7. The results suggest that both approaches reach a comparable accuracy for this example. In addition, the number of hidden units in each layer of POD-NN is , while the configuration of the modified POD-NN is slightly more compact with . These results justify the use of the modified POD-NN as a baseline solution, as mentioned in Section 3.3.

To further demonstrate the accuracy of BiFi-NN, we show the BiFi-NN and modified POD-NN approximations of the high-fidelity solution for in Figure 8. The first row is the high-fidelity solution and the projection error (based on POD basis functions), the second row shows the approximation and the corresponding approximation error obtained by a pre-trained modified POD-NN model with training samples, and the last row shows those obtained by BiFi-NN with the same training set. Both the modified POD-NN and BiFi-NN show good agreement with the high-fidelity solution. However, BiFi-NN offers better accuracy, particularly around the peak of the solution.

Figure 8: The high-fidelity solution and the projection error (top), the approximation results and corresponding approximation errors by modified POD-NN (middle) and BiFi-NN (bottom) with training data and high-fidelity POD basis for

4.3 2D Vorticity Equation

In the third example, we consider the following 2D vorticity equation for an incompressible flow [36] with a random viscosity coefficient:

(32)

with the following initial condition:

(33)

where

(34)

and is a random noise, uniformly distributed in [-1, 1], added to the initial condition and fixed among all the training and test samples. The viscosity varies in the range of , and the spatial variables .

A Fourier spectral method is employed to solve this problem until the final time with a time step . The high-fidelity model is solved on a uniform grid of size in the spatial domain, while the low-fidelity model is solved on a coarser uniform mesh of size . Training samples and 100 test samples are drawn independently by LHS. The reduced basis set is generated from an independent set of 100 sample points drawn by LHS.

Figure 9 (left) illustrates the convergence of the approximation errors of both the modified POD-NN and BiFi-NN methods with respect to the number of high-fidelity POD basis functions retained. Again, fast error decay is observed for both methods when the number of POD basis functions is small. When the reduced dimension is large enough, it is evident that BiFi-NN delivers better results than the modified POD-NN for a fixed training set. This signifies the effectiveness of the additional low-fidelity features we incorporate in the BiFi-NN framework.

Figure 9 (right) presents the coefficient errors with respect to the size of the training set, when the dimension of the reduced approximation space is fixed at . The coefficient errors are roughly 10 times larger than the projection error, indicating that the dominant error contribution is due to the coefficient error committed by the neural network approximation. It is clear that by utilizing the information from the low-fidelity model, BiFi-NN is able to improve the accuracy of the approximation of the high-fidelity reduced coefficients and offer more accurate reduced solutions.

We also compare the numerical convergence of the original POD-NN and the modified POD-NN based on training points, as shown in Figure 10. In this example, the modified POD-NN produces better results when the reduced dimension is large enough. This might be because the network configuration of the modified POD-NN () is more compact than that of the original POD-NN (), and is therefore easier to train with the same training set.

Figure 9: Left: Numerical convergence of approximation error by modified POD-NN and BiFi-NN applied to problem (32) with training sets of different sizes (). Right: Coefficient error with by modified POD-NN and BiFi-NN with respect to the size of training set compared to the projection error. The solid lines are results of BiFi-NN, and the dashed lines represent modified POD-NN.
Figure 10: Numerical convergence of approximation error by original POD-NN and modified POD-NN applied to problem (32) with a training set of size . The solid lines are the approximation errors of original POD-NN, and the dashed lines represent modified POD-NN.

4.4 2D flow around cylinder

In the last example, we consider a 2D channel flow around a cylinder [21] with a random inflow condition, which is modeled by the following Navier-Stokes equations:

(35)

where and are the sought velocity and pressure, and is a given body force [21]. The fluid has viscosity and unit density. The problem is defined on a channel , with a cylinder of diameter centered at . The boundary is divided into two parts and , where denotes either the rigid walls of the channel with , or the inflow region with the inflow velocity profile, and denotes the outlet. On both the upper and lower walls and on the cylinder, a no-slip boundary condition is prescribed. On the right wall, zero initial conditions are assumed. A random inflow profile with is given on the left wall:

(36)

with

(37)

where follows a uniform distribution. Here we fix and (so that the positivity of is guaranteed).

Figure 11: The geometry for the 2D flow past cylinder problem (35).

We solve the above equations with finite elements. Similar to the previous examples, we consider the following two models: the low-fidelity solution uses 119 elements and the high-fidelity solution uses 2522 elements. The geometry and mesh of the low-fidelity model are illustrated in Figure 11. The time step is set to and the final time is . Our output of interest is the magnitude of the flow field. In this problem, 100 Monte Carlo samples in the parameter space are employed as the test set, and an independent set of 100 Monte Carlo samples is utilized to compute the reduced basis set.

The approximation errors of both the modified POD-NN and BiFi-NN based on three different training sets () are plotted in Figure 12 (left). We observe a fast decay for both methods, and both stagnate at roughly (10) high-fidelity POD basis functions, indicating that the coefficient error begins to dominate the projection error. The convergence behaviors of the modified POD-NN and BiFi-NN are similar. However, BiFi-NN continues to decrease and saturates at a lower error level for a fixed training set. When more training data is available, the error saturation level can be further reduced, as shown in Figure 12 (left).

The coefficient errors of both methods with respect to the number of training points, for a fixed reduced dimension , are further analyzed in Figure 12 (right). In this case, the coefficient error dominates the projection error, particularly for small training sets. When the size of the training set increases, the coefficient errors of both methods improve. It is evident that, compared with the modified POD-NN, the improvement in the approximation of the high-fidelity coefficients obtained by incorporating the low-fidelity features in BiFi-NN is quite noticeable.

Figure 12: Left: Convergence analysis of approximation error by modified POD-NN and BiFi-NN for problem (35) with training sets of different sizes (). Right: Coefficient error with by modified POD-NN and BiFi-NN with respect to the size of training set compared to the projection error. The solid lines are results of BiFi-NN, and the dashed lines represent modified POD-NN.

5 Summary

In this paper, we proposed a new nonintrusive reduced-order modeling method, referred to as BiFi-NN. Given both low-fidelity and high-fidelity data, the method generates the reduced basis set from the collection of high-fidelity snapshots by POD, and employs a two-hidden-layer perceptron to approximate the high-fidelity POD coefficients using both the low-fidelity POD coefficients and the physical model parameters as input features. With an affordable computational cost, we demonstrated the improved predictive performance of the proposed method and the effectiveness of the additional features extracted from low-fidelity models via several benchmark examples, particularly for nonlinear problems. Future work includes evaluating the framework on more complex problems and extending this idea to the general multi-fidelity case, i.e., with more than two fidelities.

6 Acknowledgement

We thank the Simons Foundation (504054) for its funding support.

References

  • [1] N. Alexandrov, R. Lewis, C. Gumbert, L. Green, and P. Newman. Optimization with variable-fidelity models applied to wing design. In 38th Aerospace Sciences Meeting and Exhibit, page 841, 2000.
  • [2] D. Amsallem. Interpolation on manifolds of CFD-based fluid and finite element-based structural reduced-order models for on-line aeroelastic predictions. PhD thesis, Stanford University, 2010.
  • [3] A. Buffa, Y. Maday, A. T. Patera, C. Prud’homme, and G. Turinici. A priori convergence of the greedy algorithm for the parametrized reduced basis method. ESAIM: Mathematical modelling and numerical analysis, 46(3):595–603, 2012.
  • [4] A. Chatterjee. An introduction to the proper orthogonal decomposition. Current science, pages 808–817, 2000.
  • [5] S. Chaturantabut and D. C. Sorensen. Nonlinear model reduction via discrete empirical interpolation. SIAM Journal on Scientific Computing, 32(5):2737–2764, 2010.
  • [6] P. Chen and C. Schwab. Model order reduction methods in computational uncertainty quantification. Handbook of Uncertainty Quantification, pages 1–53, 2016.
  • [7] Y. Chen, J. S. Hesthaven, Y. Maday, J. Rodríguez, and X. Zhu. Certified reduced basis method for electromagnetic scattering and radar cross section estimation. Computer Methods in Applied Mechanics and Engineering, 233:92–108, 2012.
  • [8] K. Cutajar, M. Pullin, A. Damianou, N. Lawrence, and J. González. Deep Gaussian processes for multi-fidelity modeling.
  • [9] M. Cutler, T. J. Walsh, and J. P. How. Reinforcement learning with multi-fidelity simulators. In Robotics and Automation (ICRA), 2014 IEEE International Conference on, pages 3888–3895. IEEE, 2014.
  • [10] G. Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2(4):303–314, 1989.
  • [11] T. G. Dietterich. Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems, pages 1–15. Springer, 2000.
  • [12] M. T. Hagan and M. B. Menhaj. Training feedforward networks with the marquardt algorithm. IEEE transactions on Neural Networks, 5(6):989–993, 1994.
  • [13] J. Hampton, H. R. Fairbanks, A. Narayan, and A. Doostan. Practical error bounds for a non-intrusive bi-fidelity approach to parametric/stochastic model reduction. Journal of Computational Physics, 368:315–332, 2018.
  • [14] J. S. Hesthaven, G. Rozza, B. Stamm, et al. Certified reduced basis methods for parametrized partial differential equations. Springer, 2016.
  • [15] J. S. Hesthaven and S. Ubbiali. Non-intrusive reduced order modeling of nonlinear problems using neural networks. Journal of Computational Physics, 363:55–78, 2018.
  • [16] J. S. Hesthaven, S. Zhang, and X. Zhu. Reduced basis multiscale finite element methods for elliptic problems. Multiscale Modeling & Simulation, 13(1):316–337, 2015.
  • [17] K. Hornik. Approximation capabilities of multilayer feedforward networks. Neural networks, 4(2):251–257, 1991.
  • [18] Y.-Q. Hu, Y. Yu, W.-W. Tu, Q. Yang, Y. Chen, and W. Dai. Multi-fidelity automatic hyper-parameter tuning via transfer series expansion. 2019.
  • [19] R. L. Iman. Latin hypercube sampling. Wiley StatsRef: Statistics Reference Online, 2014.
  • [20] K. Kandasamy, G. Dasarathy, J. B. Oliva, J. Schneider, and B. Póczos. Gaussian process bandit optimisation with multi-fidelity evaluations. In Advances in Neural Information Processing Systems, pages 992–1000, 2016.
  • [21] M. G. Larson and F. Bengzon. The finite element method: theory, implementation, and applications, volume 10. Springer Science & Business Media, 2013.
  • [22] L. Leifsson and S. Koziel. Multi-fidelity design optimization of transonic airfoils using physics-based surrogate modeling and shape-preserving response prediction. Journal of Computational Science, 1(2):98–106, 2010.
  • [23] Y. Maday. Reduced basis method for the rapid and reliable solution of partial differential equations. 2006.
  • [24] A. Narayan, C. Gittelson, and D. Xiu. A stochastic collocation algorithm with multifidelity models. SIAM Journal on Scientific Computing, 36(2):A495–A521, 2014.
  • [25] G. Cybenko. Continuous valued neural networks with two hidden layers are sufficient. Technical report, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, 1988.
  • [26] B. Peherstorfer, T. Cui, Y. Marzouk, and K. Willcox. Multifidelity importance sampling. Computer Methods in Applied Mechanics and Engineering, 300:490–509, 2016.
  • [27] P. Perdikaris, D. Venturi, J. O. Royset, and G. E. Karniadakis. Multi-fidelity modelling via recursive co-kriging and gaussian–markov random fields. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 471(2179):20150018, 2015.
  • [28] A. Quarteroni, A. Manzoni, and F. Negri. Reduced basis methods for partial differential equations: an introduction, volume 92. Springer, 2015.
  • [29] T. Robinson, M. Eldred, K. Willcox, and R. Haimes. Surrogate-based optimization using multifidelity models with variable parameterization and corrected space mapping. Aiaa Journal, 46(11):2814–2822, 2008.
  • [30] A. I. J. Forrester, A. Sóbester, and A. J. Keane. Multi-fidelity optimization via surrogate modelling. Proceedings of the Royal Society A, volume 463, 2007.
  • [31] G. Rozza, D. B. P. Huynh, and A. T. Patera. Reduced basis approximation and a posteriori error estimation for affinely parametrized elliptic coercive partial differential equations. Archives of Computational Methods in Engineering, 15(3):1, 2007.
  • [32] S. Ruder. An overview of gradient descent optimization algorithms. CoRR, abs/1609.04747, 2016.
  • [33] R. J. Schalkoff. Artificial neural networks, volume 1. McGraw-Hill New York, 1997.
  • [34] R. Sen, K. Kandasamy, and S. Shakkottai. Multi-fidelity black-box optimization with hierarchical partitions. In J. Dy and A. Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 4538–4547, Stockholmsmässan, Stockholm Sweden, 10–15 Jul 2018. PMLR.
  • [35] G. Sun, G. Li, S. Zhou, W. Xu, X. Yang, and Q. Li. Multi-fidelity optimization for sheet metal forming process. Structural and Multidisciplinary Optimization, 44(1):111–124, 2011.
  • [36] M. Suzuki. Fourier-Spectral Methods For Navier Stokes equations in 2D. http://www.math.mcgill.ca/gantumur/math595f14/NSMashbat.pdf, 2014.
  • [37] D. Xiao. Non-intrusive reduced order models and their applications. PhD thesis, Imperial College London, 2016.
  • [38] D. Xiao, F. Fang, A. Buchan, C. Pain, I. Navon, and A. Muggeridge. Non-intrusive reduced order modelling of the navier–stokes equations. Computer Methods in Applied Mechanics and Engineering, 293:522–541, 2015.
  • [39] X. Yang, X. Zhu, and J. Li. When bifidelity meets cokriging: An efficient physics-informed multifidelity method. arXiv preprint arXiv:1812.02919, 2018.
  • [40] Z.-H. Zhou. Ensemble methods: foundations and algorithms. Chapman and Hall/CRC, 2012.
  • [41] X. Zhu, E. M. Linebarger, and D. Xiu. Multi-fidelity stochastic collocation method for computation of statistical moments. Journal of Computational Physics, 341:386–396, 2017.
  • [42] X. Zhu, A. Narayan, and D. Xiu. Computational aspects of stochastic collocation with multifidelity models. SIAM/ASA Journal on Uncertainty Quantification, 2(1):444–463, 2014.