
Multi-fidelity wavelet neural operator with application to uncertainty quantification

08/11/2022
by Akshay Thakur, et al.

Operator learning frameworks, owing to their ability to learn nonlinear maps between two infinite-dimensional function spaces using neural networks, have recently emerged as one of the more pertinent areas in the field of applied machine learning. Although these frameworks are extremely capable when it comes to modeling complex phenomena, they require an extensive amount of data for successful training, which is often not available or is too expensive to generate. This issue can, however, be alleviated through multi-fidelity learning, where a model is trained using a large amount of inexpensive low-fidelity data along with a small amount of expensive high-fidelity data. To this end, we develop a new framework based on the wavelet neural operator that is capable of learning from a multi-fidelity dataset. The developed model's excellent learning capabilities are demonstrated by solving different problems that require effective correlation learning between the two fidelities for surrogate construction. Furthermore, we also assess the application of the developed framework for uncertainty quantification. The results obtained from this work illustrate the excellent performance of the proposed framework.



1 Introduction

The advancements in computational methods and scientific machine learning in recent times have not only helped in the accurate modeling of complex physical systems but have also enabled expeditious developments in the application of multi-fidelity modeling methods to these systems [1, 2, 3, 4, 5, 6]. In general, a considerable amount of high-fidelity (HF) data is required to train an accurate single-fidelity surrogate model (SM) for complex physical phenomena. However, the cost of obtaining sufficient HF data is often exorbitant, and its availability is frequently limited; for example, experiments typically yield only limited data because of their prolonged nature and costly setups. Such limited data availability hinders a complete comprehension of the system when tackling problems involving uncertainty quantification [7, 8, 9] and reliability analysis [10, 11, 12]. On the other hand, low-fidelity (LF) data is generally available cheaply and in abundance. However, training an SM using LF data alone leads to sizeable inaccuracies.

To alleviate these issues, a potential solution lies in the utilization of multi-fidelity learning techniques, which involve the integration of HF and LF data. Several multi-fidelity learning approaches, applicable in a wide variety of situations, exist in the literature; some of the noteworthy ones are response surface models [13], Gaussian process regression [14, 15, 16], polynomial chaos expansions [17, 18], and deep learning [19, 3]. In particular, a significant amount of work has recently been done on the use of neural networks for multi-fidelity learning. This includes transfer-learning-based approaches [20, 21, 22, 23], approaches that concurrently train LF and HF neural networks (NNs) [19, 3], and approaches that train the different NNs in a successive fashion [24]. Furthermore, all of these approaches can be coupled with physics-informed learning [20, 19, 24]. Apart from that, it should also be noted that the LF and HF data can be obtained from a combination of different sources; for example, coarse-mesh and fine-mesh numerical simulations, numerical simulations and experiments, or numerical simulations and analytical solutions, to name a few.

One of the more recent developments in the field of scientific machine learning has been the introduction of a new framework, called neural operators, for learning operator mappings between two infinite-dimensional spaces by passing global integral operators, in a manner akin to NNs, through nonlinear and local activation functions [25, 26, 27, 28, 29, 30]. A typical use of neural operators, because of their discretization invariance, has been the construction of SMs for PDE solvers. However, training these neural operator frameworks to produce accurate outputs again requires a large amount of high-fidelity data, making the process computationally expensive. Nevertheless, this computational burden can be relieved by using multi-fidelity data for training. More recently, different studies have considered multi-fidelity data for training DeepONets [31, 32, 33]; however, no such effort has been made for other classes of neural operators.

Tripura and Chakraborty [28] found that the Wavelet Neural Operator (WNO), which they proposed, outperformed other neural operator frameworks in the majority of the complex nonlinear operator learning tasks they assessed. Furthermore, WNO is highly effective in learning patterns in images or signals, tackling both complex and smooth geometries, handling particularly nonlinear families of PDEs, and learning the frequency with which changes take place in a pattern within a solution domain. Therefore, to increase the learning efficiency and exploit the different available fidelities of data, in the present work we propose a methodology that enables multi-fidelity learning with WNO from small HF and large LF datasets by using input supplementation and residual-based learning.

The remainder of the paper is structured as follows: section 2 provides an overview of WNO and multi-fidelity learning, and describes the methodologies employed to develop a framework that enables WNO to learn from multi-fidelity data. In section 3, the developed framework is put to the test on several numerical problems, and the results indicating the performance of the proposed framework are presented. Finally, in section 4, the concluding remarks are provided.

2 Methodology

We begin with brief overviews of WNO and multi-fidelity learning. These overviews are followed by a delineation of the multi-fidelity scheme employed for WNO in the current work.

2.1 Wavelet Neural Operator

As mentioned in section 1, neural operators can learn the operator mappings between two infinite-dimensional function spaces. In general, for a given PDE, the mappings are learned from functions in the input function space such as source term, initial condition, or boundary condition to the PDE solution function in the output function space.

For proper comprehension, let us consider a $d$-dimensional fixed domain $D \subset \mathbb{R}^d$ with boundary $\partial D$. Let $a(x) \in \mathcal{A}$ and $u(x) \in \mathcal{U}$ be the input and output functions in the corresponding Banach spaces $\mathcal{A}$ and $\mathcal{U}$. Then the nonlinear differential operator to be learned is given by,

$\mathcal{D}: \mathcal{A} \rightarrow \mathcal{U}, \qquad u = \mathcal{D}(a) \qquad$ (1)

Considering that we have a dataset available in the form of point-discretized pairs of input and output functions $\{a_i, u_i\}_{i=1}^{N}$, the operator can be approximated using a NN as follows:

$\mathcal{D}_{NN}: \mathcal{A} \times \boldsymbol{\theta}_{NN} \rightarrow \mathcal{U} \qquad$ (2)

where $\boldsymbol{\theta}_{NN}$ denotes the NN's finite-dimensional parameter space. The desired neural network architecture for operator learning is obtained by first lifting the input to some high-dimensional space through a local transformation $v_0(x) = P(a(x))$; in the current case, this transformation is achieved with the help of a shallow fully connected NN (FNN). The lifted representation is denoted by $v_j(x)$ and takes values in $\mathbb{R}^{d_v}$. Furthermore, the total number of updates or steps required for attaining acceptable convergence is denoted by $J$. Thus, the updates applied on $v_j$ can be represented as $v_{j+1} = G(v_j)$, $j = 0, 1, \ldots, J-1$, where the function $G$ again takes values in $\mathbb{R}^{d_v}$. After the $J$ updates, another local transformation $Q$ is employed to transform the output from the high-dimensional space back to the solution space, $u(x) = Q(v_J(x))$. The step-wise updates are defined as follows:

$v_{j+1}(x) = \sigma\Big( \big(K(a;\phi) * v_j\big)(x) + W v_j(x) \Big), \qquad x \in D \qquad$ (3)

where the nonlinear activation function is denoted by $\sigma$, the linear transformation is represented by $W$, and the integral operator on $v_j(x)$ is denoted by $K$. Furthermore, as our framework is based on neural networks, $K$ is represented as a kernel integral operator parameterized by $\phi$, and $*$ here denotes the convolution operation. Additionally, with the employment of the degenerate kernel concept, Eq. (3) allows us to learn the mapping between any two infinite-dimensional spaces. Finally, to obtain the WNO framework, the kernel is learned by parameterizing it in the wavelet domain. A notably essential component of WNO is the wavelet kernel integral layer, a pictorial description of which can be found in Fig. 1.
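To make the wavelet kernel integral operation concrete, the following minimal sketch illustrates a single (non-trainable) forward pass in NumPy with the PyWavelets library: the lifted signal is decomposed with a multilevel wavelet transform, the coefficients of the coarsest sub-band are mixed by a weight matrix standing in for the learnable convolution in the wavelet domain, the signal is reconstructed, and the parallel local linear transform and activation are applied. All function and variable names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import pywt


def wavelet_kernel_layer(v, W_wavelet, W_local, wavelet="db4", level=3):
    """One illustrative forward pass of a wavelet kernel integral layer.

    v         : (channels, n) lifted input signal v_j(x)
    W_wavelet : (channels, channels) weights applied in the wavelet domain
    W_local   : (channels, channels) local linear transform W
    """
    channels, n = v.shape
    # multilevel wavelet decomposition of each channel
    coeffs = [pywt.wavedec(v[c], wavelet, level=level) for c in range(channels)]
    # parameterize the kernel on the coarsest (last-level) sub-band only
    cA = np.stack([ch[0] for ch in coeffs])      # (channels, n_coarse)
    cA = W_wavelet @ cA                          # channel mixing in the wavelet domain
    out = np.empty_like(v)
    for c in range(channels):
        coeffs[c][0] = cA[c]
        out[c] = pywt.waverec(coeffs[c], wavelet)[:n]   # inverse wavelet transform
    # add the parallel local linear transform and apply a (generic) activation
    return np.maximum(0.0, out + W_local @ v)
```

In the actual WNO, the wavelet-domain weights are trainable tensors and the layer is stacked several times; the sketch only conveys the decompose-convolve-reconstruct structure.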

2.2 Multi-fidelity modeling

The major theme in multi-fidelity learning is the exploitation of correlation between the HF data, which is quite accurate but available in a smaller amount, and the LF data, which is inaccurate but is available in a larger amount. A popular autoregressive strategy for multi-fidelity learning [34] can be expressed as follows:

$u_H(x) = \rho\, u_L(x) + \delta(x) \qquad$ (4)

where the LF and HF data are denoted by $u_L(x)$ and $u_H(x)$, respectively, $\rho$ is a multiplicative factor that determines the correlation between the two fidelities, and $\delta(x)$ quantifies the corresponding additive correlation. The issue with this strategy, however, is that it only captures the linear correlation between the LF and the HF data. In order to account for the nonlinear correlation, Meng and Karniadakis [19] proposed a generalized autoregressive scheme, which is as follows:

$u_H(x) = \mathcal{F}\big(x, u_L(x)\big) \qquad$ (5)

where $\mathcal{F}$ is a function, linear or nonlinear and not known a priori, that defines the mapping between the two fidelities. Further, Eq. (5) can also be expressed as,

$u_H(x) = \mathcal{F}\big(a(x), u_L(x)\big) \qquad$ (6)

2.3 Proposed Framework

In the current work, we aim to approximate the operator $\mathcal{F}$ in Eq. (6) using the WNO framework. To accomplish this, a small HF dataset, $\mathcal{D}_{HF}$, containing $N_H$ pairs of HF input and output functions, is generated along with a large LF dataset, $\mathcal{D}_{LF}$, containing $N_L$ pairs of LF input and output functions. Again, the reason for the small and large sizes is the computational cost of data generation. Mathematically, the datasets can be represented as follows:

$\mathcal{D}_{HF} = \big\{ a_i^H,\; u_i^H = \mathcal{D}_H(a_i^H) \big\}_{i=1}^{N_H}, \qquad \mathcal{D}_{LF} = \big\{ a_j^L,\; u_j^L = \mathcal{D}_L(a_j^L) \big\}_{j=1}^{N_L} \qquad$ (7)

where the HF operator we want to learn is denoted by $\mathcal{D}_H$, the LF operator is denoted by $\mathcal{D}_L$, and $N_H \ll N_L$.

Furthermore, to enable WNO for multi-fidelity learning, we follow a two-step approach, as shown in Fig. 1. Firstly, we train a WNO network (LF-WNO) using the LF data. Because of the vast supply of LF data, training this low-fidelity surrogate model (LFSM) to a high degree of accuracy is a straightforward task. However, in the second step, where we train a separate WNO network (HF-WNO) to learn the HF solution $u_H$, additional strategies, namely supplementation of the WNO inputs with LF outputs and residual operator learning, have to be used. These strategies are described in the sections below.

2.3.1 Supplementing HF-WNO inputs with low-fidelity solution

As established in the previous section, the main goal of multi-fidelity learning is uncovering and exploiting the correlation between the LF solution $u_L$ and the HF solution $u_H$. In order to enable the network to effectively learn, in a data-driven fashion, the unknown operator that maps $a$ and $u_L$ to $u_H$, we augment the inputs to the HF-WNO network with $u_L$.

2.3.2 Learning the residual operator

Instead of training HF-WNO to directly learn the HF operator $\mathcal{D}_H$, residual-based learning focuses on learning the residual operator. The residual is simply the difference between the HF and LF solutions, which can be expressed mathematically as,

$\mathcal{D}_R(a, u_L)(x) = u_H(x) - u_L(x) \qquad$ (8)

where $\mathcal{D}_R$ is the residual operator. Essentially, as a strong correlation exists between the HF and LF solutions, a feature similitude, although not exact, is expected between them. Simply put, the LF solution, despite being inaccurate, preserves the rudimentary feature structure of the HF solution. Therefore, it is a comparatively easier task to learn the residual rather than the HF solution itself.
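The two ingredients above reduce, in code, to concatenating the LF solution with the HF input along the channel dimension and shifting the training target from $u_H$ to $u_H - u_L$. The minimal PyTorch sketch below assumes a generic WNO-style module `hf_wno` and tensors shaped (batch, channels, grid); the names are illustrative, not from the authors' code.

```python
import torch


def residual_target(u_high: torch.Tensor, u_low: torch.Tensor) -> torch.Tensor:
    """Residual-learning target for HF-WNO: r(x) = u_H(x) - u_L(x)."""
    return u_high - u_low


def predict_high_fidelity(hf_wno, a_high: torch.Tensor, u_low: torch.Tensor) -> torch.Tensor:
    """Supplement the HF input with the LF solution, predict the residual, add u_L back."""
    x_in = torch.cat([a_high, u_low], dim=1)   # channel-wise input supplementation
    return u_low + hf_wno(x_in)                # u_H ~ u_L + D_R(a, u_L)
```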

2.3.3 Complete Framework

To finally arrive at the complete framework for multi-fidelity WNO, we combine the two-step SM approach with input supplementation and residual operator learning. A pictorial representation of the complete framework is provided in Fig. 1. In practice, if we have access to an efficient LF solver, we can directly replace the LF-WNO block with an LF-solver block. In the absence of an efficient LF solver, the LF-WNO is first trained using the dataset $\mathcal{D}_{LF}$. As the size of the low-fidelity dataset, $N_L$, is very large, it is possible to train LF-WNO to a high level of accuracy. In the second step, we train the HF-WNO by making use of both $\mathcal{D}_{HF}$ and $\mathcal{D}_{LF}$. As shown in the figure, an additional input, the LF solution $u_L$, is supplied to the network along with the HF input. Using a shallow FNN, these inputs are lifted to a higher dimension. The high-dimensional output from the FNN is then passed through several wavelet kernel integral layers. In each kernel integral layer, to obtain the wavelet coefficients, the inputs first undergo a multilevel wavelet decomposition. The wavelet coefficients in the sub-band of the last level are then convolved with the neural network weights, and this is followed by an inverse wavelet transform on the convolved output in order to bring the dimensions back to those of the lifted inputs. Simultaneously, in parallel to the kernel integral layer, a local linear transform $W$ is applied to the lifted inputs. The output of the local linear transform is then added to that of the kernel integral layer, and a suitable activation function is applied. After repeating the same process through the rest of the wavelet kernel integral layers, the output is passed through another FNN, which transforms it back to the intended target output, i.e., the residual $u_H - u_L$. Finally, the LF solution is added back to the residual to obtain the desired high-fidelity solution $u_H$.
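A hedged end-to-end sketch of the two training steps is given below in PyTorch. It assumes `lf_wno` and `hf_wno` are WNO-style modules, that the data loaders yield (input, output) tensor pairs of shape (batch, channels, grid), and that the LF surrogate can be evaluated on (or interpolated to) the HF discretization; all names and hyperparameters are placeholders rather than the authors' settings.

```python
import torch


def train_mf_wno(lf_wno, hf_wno, lf_loader, hf_loader, epochs=100, lr=1e-3):
    """Two-step multi-fidelity training sketch for MF-WNO."""
    mse = torch.nn.MSELoss()

    # Step 1: fit LF-WNO on the large, inexpensive LF dataset
    opt = torch.optim.Adam(lf_wno.parameters(), lr=lr)
    for _ in range(epochs):
        for a_lf, u_lf in lf_loader:
            opt.zero_grad()
            loss = mse(lf_wno(a_lf), u_lf)
            loss.backward()
            opt.step()

    # Step 2: fit HF-WNO on the small HF dataset with input supplementation
    #         and a residual (u_H - u_L) training target
    opt = torch.optim.Adam(hf_wno.parameters(), lr=lr)
    for _ in range(epochs):
        for a_hf, u_hf in hf_loader:
            with torch.no_grad():
                u_lf = lf_wno(a_hf)                  # LF solution for the HF inputs
            x_in = torch.cat([a_hf, u_lf], dim=1)    # supplement inputs with u_L
            opt.zero_grad()
            loss = mse(hf_wno(x_in), u_hf - u_lf)    # learn the residual operator
            loss.backward()
            opt.step()

    return lf_wno, hf_wno
```

At prediction time, the HF solution is recovered as `u_lf + hf_wno(torch.cat([a, u_lf], dim=1))`, mirroring Eq. (8).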

Furthermore, given that a suitable loss function $\mathcal{L}$ is selected, the training of the network can be represented as the following minimization problem:

$\boldsymbol{\theta}^{*} = \underset{\boldsymbol{\theta}}{\arg\min} \; \sum_{i=1}^{N_H} \mathcal{L}\Big( \mathcal{D}_R\big(a_i^H, u_{L,i};\, \boldsymbol{\theta}\big),\; u_{H,i} - u_{L,i} \Big) \qquad$ (9)
Figure 1: The proposed framework for multi-fidelity wavelet neural operator (MF-WNO).

3 Numerical Implementation and Results

In this section, we analyze the performance of the proposed approach on five different example problems. The first two are artificial benchmark problems, the third problem is a stochastic ordinary differential equation, the fourth problem is a stochastic partial differential equation (SPDE) on an irregular domain, and the final problem is again an SPDE, but we also demonstrate the uncertainty quantification capabilities of the proposed framework on this problem. In addition, for evaluating the accuracy of the proposed approach, we use three metrics for error quantification, specifically, the mean-squared error, the absolute error, and the coefficient of determination ($R^2$-score). Firstly, the mean-squared error is evaluated as,

$\mathrm{MSE} = \frac{1}{N_{test}} \sum_{i=1}^{N_{test}} \big\| \hat{u}_i - u_{H,i} \big\|_2^2 \qquad$ (10)

where $\hat{u}_i$ is the solution predicted by the multi-fidelity surrogate model (MFSM) and $u_{H,i}$ is the corresponding high-fidelity solution. Secondly, the $R^2$-score is computed as follows:

$R^2 = 1 - \frac{\sum_{i=1}^{N_{test}} \big\| u_{H,i} - \hat{u}_i \big\|_2^2}{\sum_{i=1}^{N_{test}} \big\| u_{H,i} - \bar{u}_H \big\|_2^2} \qquad$ (11)

where $\bar{u}_H$ denotes the mean of the HF solutions over the test set.
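For completeness, the two metrics in Eqs. (10) and (11) reduce to a few lines of NumPy; the implementation below is a straightforward sketch, not the authors' evaluation script.

```python
import numpy as np


def mse(u_pred: np.ndarray, u_true: np.ndarray) -> float:
    """Mean-squared error between predicted and exact HF solutions, Eq. (10)."""
    return float(np.mean((u_pred - u_true) ** 2))


def r2_score(u_pred: np.ndarray, u_true: np.ndarray) -> float:
    """Coefficient of determination (R^2-score), Eq. (11)."""
    ss_res = np.sum((u_true - u_pred) ** 2)
    ss_tot = np.sum((u_true - u_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```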

3.1 Problem Set I: Artificial Benchmarks

3.1.1 Artificial Benchmark I: 1-dimensional problem, correlation with input

To demonstrate the multi-fidelity learning abilities of the proposed framework, we consider a problem where the correlation between the HF and LF solutions is contingent upon the input function:

(12)

where and . Furthermore, we can represent the equation for HF solution as follows:

(13)

Also, we have presented the actual LF and HF solutions in Fig. 2. Furthermore, an MFSM is trained using HF-WNO on multi-fidelity datasets (where the LF output solution is concatenated with the HF inputs) of different sizes, which, in this case, are , , , and . For comparison, we also train a high-fidelity surrogate model (HFSM) using vanilla WNO on HF datasets of the same sizes. In addition, in Table 1, we provide the MSE on test datasets for models trained using the different training dataset sizes, along with the MSE between the HF and LF solutions. Apart from that, we also present, in Fig. 2, the absolute error plots between the exact HF solution and the solutions predicted by HFSM and MFSM, along with the absolute errors between the exact HF solution and the LF solution.

Figure 2: Absolute error between the exact HF solution and the predictions from MFSM and HFSM on an unseen test set, along with the absolute error between the exact HF and LF solutions, for training dataset sizes of (a) , (b) , (c) , and (d) training samples. The first column shows the absolute error between the exact HF solution and the predictions from MFSM and HFSM, and the second column shows the absolute error between the LF solution and the exact HF solution. The y-axis has a logarithmic scale. (e) Plots of the exact HF solution and LF solution for the 1-dimensional problem having a correlation with the input.
Size | MFSM MSE | HFSM MSE
8.8132 2.2164
8.9857 3.5422
5.9807 1.9693
4.8607 3.4449
3.4780
Table 1: MSE error between exact HF solution and predictions from MFSM and HFSM on unseen test dataset for different training dataset sizes.

Clearly, as evident from Table 1, the MFSM, while predicting on the test dataset, is able to outperform the HFSM by two orders of magnitude, while also providing a significant improvement over the LF output. Furthermore, Fig. 2 shows the consistently superior performance of the MFSM compared with the HFSM on a single test case for different training dataset sizes. Therefore, it can be confidently stated that the MFSM does an excellent job of capturing the desired correlation.

3.1.2 Artificial Benchmark II: 2-dimensional problem with a nonlinear correlation

As the next benchmark problem, we subject our network to the learning of complicated nonlinear correlation between HF and LF solutions given by the following equations,

(14)

where and . Again, we perform MFSM and HFSM construction for this case as well. Different models are trained on datasets with sizes , , , and , respectively, and the results are presented in Fig. 3. In addition, the exact HF solution and LF solution are presented in the accompanying figure. Further, the MSE values for the MFSM and HFSM predictions, along with the MSE between the HF and LF solutions, are presented in Table 2.

Figure 3: Absolute error between the exact HF solution and the predictions from MFSM and HFSM on an unseen test set, along with the absolute error between the exact HF and LF solutions, for training dataset sizes of (a) , (b) , (c) , and (d) . The first column shows the absolute error between the exact HF solution and the MFSM prediction, the second column shows the absolute error between the exact HF solution and the HFSM prediction, and the third column shows the absolute error between the LF solution and the exact HF solution.

Figure: Exact HF solution and LF solution for the 2-dimensional problem with nonlinear correlation.

It can be seen from Table 2 that the error for the MFSM predictions is three orders of magnitude lower than that for the HFSM predictions for training dataset sizes and , while for sizes and it is two orders of magnitude lower. Furthermore, the results in Fig. 3 demonstrate the excellent performance of MFSM with respect to HFSM for all training dataset sizes. These results show that our model has been successful in discovering the complex nonlinear correlation between the LF and HF solutions and is capable of predicting with a high degree of accuracy.

Size | MFSM MSE | HFSM MSE
1.2043 1.7359
1.3134 7.7837
3.5792 3.3641
3.3927 2.3504
3.2892
Table 2: MSE error between exact HF solution and predictions from MFSM and HFSM on unseen test dataset for different training dataset sizes.

3.2 Problem Set II: Stochastic Poisson’s equation and Darcy flow in an irregular domain

3.2.1 Stochastic Poisson’s equation

The following ordinary differential equation (ODE) is known as the 1-dimensional Poisson’s equation,

$\frac{d^2 u(x)}{dx^2} = f(x) \qquad$ (15)

where $f(x)$ is the spatially varying forcing term. In the current example, we model the forcing term as a Gaussian random field (GRF):

$f(x) \sim \mathcal{GP}\big(0,\; k(x, x')\big) \qquad$ (16)

where the mean is zero and the covariance function is modeled using a Gaussian kernel, $k(x, x') = \exp\!\big( -\tfrac{(x - x')^2}{2 l^2} \big)$, with lengthscale parameter $l$. The goal of this problem is to learn the operator mapping from the stochastic forcing term to the solution of the ODE:

$\mathcal{D}: f(x) \rightarrow u(x) \qquad$ (17)

A randomly sampled instance of $f(x)$ is provided in Fig. 4. The solver for Poisson's equation is based on the finite difference method. The HF and LF datasets, in this case, are generated using a finer and a coarser grid, respectively; the mesh size for the fine grid is , while for the coarse grid it is . The finite difference solver used for data generation is based on the code provided by Lu et al. [32]. For this example as well, we train different MFSMs and HFSMs using training datasets with sizes of , , , and . Furthermore, we also present the MSE values for the MFSM and HFSM predictions for all training dataset sizes in Table 3.
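For reference, zero-mean GRF forcing terms of this type can be sampled with a Cholesky factorization of the covariance matrix, as in the NumPy sketch below. The grid size and lengthscale are placeholder values, since the paper's exact settings are not recoverable here.

```python
import numpy as np


def sample_forcing_grf(n_points=100, length_scale=0.1, n_samples=1, seed=0):
    """Sample zero-mean GRF realizations f(x) with a squared-exponential kernel."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    # covariance matrix k(x, x') = exp(-(x - x')^2 / (2 l^2))
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / length_scale**2)
    L = np.linalg.cholesky(K + 1e-6 * np.eye(n_points))     # jitter for numerical stability
    f = (L @ rng.standard_normal((n_points, n_samples))).T  # shape (n_samples, n_points)
    return x, f
```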

Size | MFSM MSE | HFSM MSE
3.2329 2.5602
5.1759 6.3312
7.8373 1.8202
1.4672 1.3396
2.0642
Table 3: MSE error between exact HF solution and predictions from MFSM and HFSM on unseen test dataset for different training dataset sizes.

It is clear from Table 3 that the MSE values for the MFSM test predictions are two orders of magnitude lower than those for the HFSM predictions for training dataset sizes of and , while a decrease in error by one order of magnitude is found for sizes and . Furthermore, in general, the MSE for the MFSM predictions is better than that of the output from the LF solver by an order of magnitude. Also, the excellent performance for a single test case can be assessed from Fig. 4.

Figure 4: (a) A randomly sampled instance of the stochastic forcing term $f(x)$. (b) Comparison of the exact HF solution with the solutions predicted by the MFSM, the HFSM, and the low-fidelity model (LFM) (or solver, in the current case) for a single test instance for models trained with dataset size .

3.3 2-dimensional Darcy flow in a triangular domain with a notch

For this example, we consider a 2-dimensional nonlinear elliptic Darcy flow PDE given as,

$-\nabla \cdot \big( K(x, y)\, \nabla p(x, y) \big) = f(x, y), \qquad (x, y) \in \Omega \qquad$ (18)

where $K(x, y)$ is the permeability field, $p(x, y)$ is the pressure, and $f(x, y)$ is the source term. The Darcy flow equation has been widely used to model flow through porous media, and it finds substantial applications in fields such as geotechnical, civil, and petroleum engineering. For the current example, a triangular domain with a rectangular notch is considered. The boundary conditions for this problem are generated using a Gaussian process. Furthermore, the covariance kernel is modeled as follows:

(19)

Here, the kernel lengthscale parameter and the remaining kernel hyperparameters are set to prescribed values. Our objective with this problem is to learn the operator mapping from the boundary conditions to the pressure field in the whole domain. This can be mathematically expressed as,

$\mathcal{D}: p(x, y)\big|_{\partial \Omega} \rightarrow p(x, y) \qquad$ (20)

The training data is generated using the MATLAB PDE Toolbox, by modifying the code provided by Lu et al. [30]. The LF and HF datasets are generated using coarse and fine grids, respectively. A pictorial representation of the coarse and fine grids is provided in Fig. 5. Furthermore, HFSMs and MFSMs are trained using datasets with sizes , , , and . The MSE values for models trained using the different dataset sizes are provided in Table 4.

Figure 5: Comparison of grids used for obtaining LF and HF dataset. (a) Fine grid (b) Coarse grid
Figure 6: Comparison of the exact HF solution with the solutions predicted by MFSM and HFSM, along with the respective absolute errors and a pictorial representation of the boundary conditions, for a single test instance for models trained with dataset size .
Size | MFSM MSE | HFSM MSE
7.8612 4.7026
1.2727 8.4175
2.7316 1.9218
5.6001 6.2372
2.6884
Table 4: MSE error between exact HF solution and predictions from MFSM and HFSM on unseen test dataset for different training dataset sizes.

Table 4 reveals that the predictions made by the MFSM, for all training dataset sizes, in general have an MSE that is a couple of orders of magnitude lower than that of the predictions from the HFSM. It is also revealed by Table 4 that the MFSMs provide an improvement in error of two orders of magnitude for one of the training dataset sizes and of one order of magnitude for the rest. Furthermore, the outstanding prediction capabilities are evident from the pictorial comparison of the MFSM and HFSM predictions with the exact HF solution, or ground truth, in Fig. 6.

3.4 Problem Set III: 2-dimensional stochastic steady state heat equation with application to uncertainty quantification

As our final problem, we consider the following 2-dimensional elliptic PDE given as,

$\nabla \cdot \big( a(x, y)\, \nabla u(x, y) \big) = 0, \qquad (x, y) \in [0, 1]^2 \qquad$ (21)

where $a(x, y)$ is the diffusion coefficient. The equation is defined on a unit square domain and is subjected to the following boundary conditions:

(22)

Furthermore, to capture the associated uncertainty, the diffusion coefficient is modeled as a log-normal random field, which can be expressed as follows:

$a(x, y) = \exp\big( g(x, y) \big), \qquad g(x, y) \sim \mathcal{GP}\Big( m(x, y),\; k\big( (x, y), (x', y') \big) \Big) \qquad$ (23)

where $m(x, y)$ is the mean function and $k(\cdot, \cdot)$ is the covariance function of the underlying GRF $g(x, y)$. The covariance kernel can be represented as follows:

(24)

The covariance kernel's lengthscale parameters are set to prescribed equal values, and the resulting dimensionality of the current problem is high. Our aim with this problem is twofold. Firstly, we want to learn the operator mapping from the logarithm of the diffusion coefficient to the solution of the PDE using multi-fidelity data, which can be mathematically written as follows:

$\mathcal{D}: \log a(x, y) \rightarrow u(x, y) \qquad$ (25)

Secondly, we want to use the surrogate constructed using the multi-fidelity learning approach for solving an uncertainty propagation (UP) problem. Furthermore, to generate the LF and HF data, a finite volume (FV) solver provided by Tripathy and Bilionis [35] is modified, and the stochastic steady state equation is solved on grids of sizes and for LF and HF cases, respectively.
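As an illustration of how such training inputs can be generated, the sketch below samples a log-normal diffusion field on a small uniform grid via a Cholesky factorization of a squared-exponential covariance matrix. The grid resolution, lengthscale, and kernel form are assumptions for illustration only, not the settings used with the FV solver of Tripathy and Bilionis [35].

```python
import numpy as np


def sample_log_normal_field(n=32, length_scale=0.1, seed=0):
    """Sample a 2-D log-normal diffusion field a(x, y) = exp(g(x, y)), g ~ GRF."""
    rng = np.random.default_rng(seed)
    xs = np.linspace(0.0, 1.0, n)
    X, Y = np.meshgrid(xs, xs, indexing="ij")
    pts = np.column_stack([X.ravel(), Y.ravel()])                  # (n*n, 2) grid points
    sq_dists = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    K = np.exp(-0.5 * sq_dists / length_scale**2)                  # squared-exponential kernel
    L = np.linalg.cholesky(K + 1e-6 * np.eye(n * n))               # jitter for stability
    g = (L @ rng.standard_normal(n * n)).reshape(n, n)             # zero-mean GRF sample
    return np.exp(g)                                               # log-normal field
```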

3.4.1 Surrogate construction using MF-WNO

We train MFSMs using MF-WNO on multi-fidelity data and HFSMs using vanilla WNO on HF data only, for dataset sizes of , , , and . In order to visually compare the outputs from the MFSM and HFSM for different training dataset sizes, a pictorial presentation of the predicted solutions, errors, actual HF solutions (ground truth), and the associated input field, i.e., the logarithm of the diffusion coefficient, is provided in Fig. 7. Furthermore, for all the different dataset sizes, the MSE values for the predictions of the MFSM and HFSM on unseen test datasets are given in Table 5.

Size | MFSM MSE | HFSM MSE
1.4001 3.0188
2.7293 5.5890
4.5502 1.3692
7.2992 2.5784
1.3793
Table 5: MSE error between exact HF solution and predictions from MFSM and HFSM on unseen test dataset for different training dataset sizes.

From Table 5, it becomes obvious that the MFSM outperforms the HFSM for all training dataset sizes. Furthermore, it can be noticed that there is a decrease in MSE of two orders of magnitude for the MFSM as compared to the HFSM for training dataset sizes and , while there is a decrease of one order of magnitude for sizes and . Moreover, Fig. 7 clearly shows the superiority of the MFSM when compared to the HFSM for all sizes of datasets used for model training.

Figure 7: Comparison of the exact HF solution with the solutions predicted by MFSM and HFSM, along with the respective absolute errors and a pictorial representation of the logarithm of the diffusion coefficient, for training dataset sizes of (a) , (b) , (c) , and (d) .

3.4.2 Solving uncertainty propagation problem with MF-WNO

After successfully constructing an SM using multi-fidelity learning, we move on to our second goal, the application of the developed framework to the solution of a UP problem. To this end, we draw samples from an input distribution whose lengthscales are similar to those used for training the different models. Here, the aim is to estimate the following statistics for the predictions obtained using the MFSM and HFSM (a Monte Carlo sketch is given after the list below):

  • Mean of MFSM and HFSM predictions.

  • Variance of MFSM and HFSM predictions.

  • Probability density of the solutions predicted by MFSM and HFSM at and .
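The statistics above amount to plain Monte Carlo post-processing of the surrogate predictions; a minimal sketch is given below, assuming a generic callable `surrogate` that maps a batch of sampled input fields to predicted solution fields (the probe indices are placeholders for the two grid points used in the paper).

```python
import numpy as np
from scipy.stats import gaussian_kde


def propagate_uncertainty(surrogate, input_samples, probe_idx=(12, 12)):
    """Monte Carlo uncertainty propagation through a trained surrogate.

    surrogate     : callable mapping (n_samples, nx, ny) inputs to predicted solutions
    input_samples : (n_samples, nx, ny) array of sampled input fields
    probe_idx     : grid indices at which the solution density is estimated
    """
    preds = surrogate(input_samples)                     # (n_samples, nx, ny)
    mean_field = preds.mean(axis=0)                      # pointwise mean of the solution
    var_field = preds.var(axis=0)                        # pointwise variance of the solution
    density = gaussian_kde(preds[:, probe_idx[0], probe_idx[1]])  # PDF at a probe point
    return mean_field, var_field, density
```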

Figure 8: Mean and variance comparison of predicted PDE solutions obtained using the MFSM and HFSM with exact HF solution (a) , (b) , (c) , and (d) training samples. The first row shows the mean, while the second row depicts the variance.

Furthermore, we compare the obtained mean and variance with the ground truth, i.e., the actual PDE solution, for models trained with dataset sizes , , , and . The results for the mean and variance are presented in Fig. 8. Finally, the $R^2$-score and the MSE for each case are presented in Table 6 and Table 7.

Size | MFSM MSE | MFSM $R^2$-score | HFSM MSE | HFSM $R^2$-score
1.6431 0.9998 1.2397 0.9984
2.7280 0.9996 3.4613 0.9956
4.1129 0.9995 1.8670 0.9767
7.6062 0.9990 5.1008 0.9365
Table 6: MSE and $R^2$-score for the mean of the PDE solution predicted by MFSM and HFSM.
Size | MFSM MSE | MFSM $R^2$-score | HFSM MSE | HFSM $R^2$-score
5.2626 0.9950 4.8505 0.5439
9.0885 0.9914 6.6926 0.3707
2.7682 0.9739 2.1539 -1.0251
3.1166 0.9706 2.2706 -1.1348
Table 7: MSE and $R^2$-score for the variance of the PDE solution predicted by MFSM and HFSM.

Fig. 8 and Tables 6 and 7 reveal that the MFSM consistently captures the mean and variance to a high degree of accuracy, with $R^2$-scores for the mean and variance greater than 0.99 and 0.97, respectively, for all models trained on datasets of different sizes. Furthermore, it is also clear that the MFSM captures the mean and variance more accurately than the HFSM. It should also be noted that, although the disparity between MFSM and HFSM in capturing the mean is not very large, the performance of MFSM in capturing the variance is substantially better than that of HFSM. Moreover, we also compare the probability density function (PDF) of the solution predicted by the MFSM and HFSM with the PDF of the exact HF solution obtained from the FV solver at two spatial grid points. The results are presented in Fig. 9, and they indicate a considerably superior match between the PDF of the predicted solution from the MFSM and the PDF of the exact HF solution, compared with the predicted solution from the HFSM.

Figure 9: Comparison of PDE solution density at spatial grid points (a) and (b) between MFSM predictions, HFSM predictions, and the exact HF solution for models trained with dataset size .

3.4.3 Comparison of LF-WNO and LF solver for the supply of low-fidelity data

It is possible that, although LF data is available, we might not have access to the LF solver. In such cases, it would be challenging to generate the new LF solutions that are required for predicting HF solutions with HF-WNO for previously unseen input functions. However, in these cases, we can always train a surrogate using LF-WNO and the available data. Therefore, to establish that LF-WNO can easily replace an LF solver, in this section we compare the utilization of a low-fidelity surrogate model (LFSM), trained using LF-WNO on a training dataset of size , with the LF solver as the source of low-fidelity solution data, which is then used for residual learning and input supplementation in HF-WNO. To evaluate the accuracy of the LFSM, we first compute the MSE between the LFSM predictions and the actual LF solutions for a given test set; this MSE is found to be . Furthermore, we evaluate the MSE between the LF solutions predicted by the LFSM for an unseen test set and the corresponding exact HF solutions, and draw a comparison by performing a similar evaluation for the corresponding LF solutions from the LF solver. The MSE values obtained for the LFSM predictions and for the solutions from the LF solver, with respect to the exact HF solution, are and , respectively. Clearly, even for a training dataset of size , the MSE values on the test set indicate that the surrogate-predicted and actual LF solutions match each other very closely. The success of the LFSM in accurately mimicking the LF solution is further illustrated by the PDF plots at points and for the LFSM and the actual solution from the LF solver in Fig. 10.

Figure 10: Comparison of PDE solution density at spatial grid points (a) and (b) between LFSM predictions and the actual solution from LF-FV solver.

Furthermore, we also develop an MFSM on a training dataset of size similar to that used in section 3.4.1. However, unlike the MFSM in section 3.4.1, where the LF solver was used directly to supply the LF solution to HF-WNO, here we use the LF solution predictions from LF-WNO as inputs to HF-WNO. The resulting MSE obtained for this MFSM on a test set is comparable to the MSE value for the MFSM in section 3.4.1. This again indicates that, owing to the ability of the LFSM to become highly accurate given sufficient training data, the result obtained from an MFSM trained using LF data from the LFSM is similar to that from an MFSM trained using LF data directly from the LF solver. It should be noted that the accuracy of the LFSM would keep increasing with a further increase in the training dataset size. Finally, as another comparison, we perform the UP study with the MFSM developed using LF-WNO, in a fashion similar to section 3.4.2. For contrast in solving the UP problem, we present the PDFs obtained using the LF-WNO-based MFSM, the LF-solver-based MFSM, and the actual HF-FV solver in Fig. 11 for two spatial points and .

Figure 11: Comparison of PDE solution density at spatial grid points (a) and (b) among the LF-WNO based MFSM’s predictions, LF solver based MFSM’s predictions, and the actual solution from HF-FV solver.

In view of all the results presented in this section, it can be safely stated that an LF-WNO can accurately and efficiently replace an LF solver.

4 Conclusions

In this work, we proposed a novel framework for WNO that allows accurate and efficient learning from a multi-fidelity dataset. The framework was developed by coupling two separate WNO networks with the help of residual operator learning and input supplementation. The proposed approach is based on utilizing a large, inexpensive-to-generate LF dataset together with a small, expensive-to-generate HF dataset.

The performance of the proposed framework was assessed on several problems, including artificial benchmarks and complex PDEs, and the results consistently showed that the surrogate model constructed using MF-WNO provides predictions that are at least one order of magnitude, and in many cases multiple orders of magnitude, more accurate than both the LF solution and the predictions made by vanilla WNO trained on only the HF dataset. In addition, the proposed approach does extremely well in uncovering the correlations between the LF and HF data, even when they are nonlinear and complex. Furthermore, the framework successfully tackles problems with an irregular domain. Also, from the results obtained for the stochastic heat equation problem, it is observed that our framework is robust and accurate for surrogate modeling of high-dimensional PDEs in the low-data limit. Moreover, the versatility of the proposed model was evaluated by using the developed surrogates for uncertainty quantification, and the obtained results strongly indicate the model's success in completing the intended objective. Even more noteworthy is that the higher accuracy was consistent for all the MFSMs trained using different dataset sizes, and that the predicted probability density functions at different spatial locations provided an almost exact match with the PDFs obtained from the HF solution, despite the framework being frequentist and trained on a very small amount of HF data. A possible direction for future work is the extension of the proposed framework by incorporating physics.

Acknowledgements

T. Tripura acknowledges the financial support received from the Ministry of Human Resource Development (MHRD), India, in the form of the Prime Minister's Research Fellows (PMRF) scholarship. S. Chakraborty acknowledges the financial support received from the Science and Engineering Research Board (SERB) via grant no. SRG/2021/000467 and the seed grant received from IIT Delhi.

Data availability

Upon acceptance, all datasets used in this study will be made publicly available on GitHub by the corresponding author.

Code availability

Upon acceptance, all source codes needed to reproduce the results of this study will be made publicly available on GitHub by the corresponding author.

References