1 Introduction
Recent advances in computational methods and scientific machine learning have not only enabled accurate modeling of complex physical systems but have also spurred rapid development of multifidelity modeling methods for these systems [1, 2, 3, 4, 5, 6]. In general, a considerable amount of high-fidelity (HF) data is required to train an accurate single-fidelity surrogate model (SM) for complex physical phenomena. However, obtaining sufficient HF data is often exorbitantly expensive, and its availability is frequently limited; experiments, for example, typically yield little data because of their prolonged nature and costly setup. Such limited data availability also hinders a complete comprehension of the system when tackling problems involving uncertainty quantification [7, 8, 9] and reliability analysis [10, 11, 12]. Low-fidelity (LF) data, on the other hand, is generally cheap and abundantly available; however, training an SM using LF data alone leads to sizeable inaccuracies.
To alleviate these issues, a potential solution lies in multifidelity learning techniques, which integrate HF and LF data. Several different multifidelity learning approaches, applicable in a wide variety of situations, exist in the literature; among the most noteworthy are response surface models [13], Gaussian process regression [14, 15, 16], polynomial chaos expansions [17, 18], and deep learning [19, 3]. In particular, a significant amount of recent work has used neural networks for multifidelity learning. This includes transfer learning based approaches [20, 21, 22, 23], approaches that concurrently train LF and HF neural networks (NNs) [19, 3], and approaches that train the different NNs in a successive fashion [24]. Furthermore, all of these approaches can be coupled with physics-informed learning [20, 19, 24]. It should also be noted that the LF and HF data can be obtained from a combination of different sources: coarse-mesh and fine-mesh numerical simulations, numerical simulations and experiments, or numerical simulations and analytical solutions, to name a few.

One of the more recent developments in scientific machine learning has been the introduction of a new framework, called neural operators, for learning mappings between two infinite-dimensional function spaces by composing global integral operators with local, nonlinear activation functions, in a manner akin to NNs [25, 26, 27, 28, 29, 30]. An archetypal use of neural operators, owing to their discretization invariance, has been the construction of SMs for PDE solvers. However, training these neural operator frameworks to high accuracy again requires a large amount of HF data, making the process computationally expensive. This computational burden could be relieved by training on multifidelity data. Recently, several studies have considered multifidelity data for training DeepONets [31, 32, 33]; however, no such effort has been made for other classes of neural operators.

Tripura and Chakraborty [28] found that the Wavelet Neural Operator (WNO), which they proposed, outperformed other neural operator frameworks on the majority of the complex nonlinear operator learning tasks they assessed. Furthermore, WNO is highly effective in learning patterns in images or signals, tackling both complex and smooth geometries, handling particularly nonlinear families of PDEs, and learning the frequency with which changes take place in a pattern within a solution domain. Therefore, to increase learning efficiency and exploit the different available fidelities of data, in the present work we propose a methodology that enables multifidelity learning with WNO from small HF and large LF datasets by using input supplementation and residual-based learning.
The remainder of the paper is structured as follows: section 2 provides an overview of WNO and multifidelity learning and describes the methodologies employed to develop a framework that enables WNO to learn from multifidelity data. In section 3, the developed framework is put to the test on several numerical problems, and results indicating the performance of the proposed framework are presented. Finally, section 4 provides the concluding remarks.
2 Methodology
We begin with brief overviews of WNO and multifidelity learning, followed by a delineation of the multifidelity scheme employed for WNO in the current work.
2.1 Wavelet Neural Operator
As mentioned in section 1, neural operators can learn operator mappings between two infinite-dimensional function spaces. In general, for a given PDE, the mapping is learned from a function in the input function space, such as the source term, initial condition, or boundary condition, to the PDE solution in the output function space.
For proper comprehension, let us consider a fixed $d$-dimensional domain $\Omega$ with boundary $\partial \Omega$. Let $a(x)$ and $u(x)$ be the input and output functions in the corresponding Banach spaces $\mathcal{A}$ and $\mathcal{U}$. Then the nonlinear differential operator to be learned is given by,

$$\mathcal{D}: \mathcal{A} \to \mathcal{U}, \qquad \mathcal{D}(a) = u \tag{1}$$
Considering we have a dataset available in the form of point-discretized pairs of input and output functions $\{a_j, u_j\}_{j=1}^{N}$, the operator can be approximated using a NN as follows:

$$\mathcal{D}: \mathcal{A} \times \Theta \to \mathcal{U} \tag{2}$$
where $\Theta$ denotes the NN's finite-dimensional parameter space. The desired neural network architecture for operator learning is achieved by first lifting the input to a high-dimensional representation $v_0(x)$ through a local transformation $P: a(x) \mapsto v_0(x)$; in the current case, this transformation is achieved with the help of a shallow fully connected NN (FNN). The number of total updates, or steps, required for attaining acceptable convergence is denoted by $T$, and the updates applied on $v_0$ can be represented as $v_{j+1} = G(v_j)$ for $j = 0, 1, \ldots, T-1$. After the updates, another local transformation, $Q: v_T(x) \mapsto u(x)$, is employed to transform the high-dimensional output back to the solution space $\mathcal{U}$. The step-wise updates are defined as follows:

$$v_{j+1}(x) = G(v_j)(x) = \sigma\Big( \big( K(a; \phi) * v_j \big)(x) + W v_j(x) \Big) \tag{3}$$
where $\sigma$ denotes the nonlinear activation function, $W$ represents the linear transformation, and $K$ denotes the integral operator on $v_j$. Furthermore, as our framework is based on neural networks, $K$ is represented as a kernel integral operator parameterized by $\phi$; accordingly, $*$ here is the convolution operator. Additionally, with the employment of the degenerate kernel concept, Eq. (3) allows us to learn the mapping between any two infinite-dimensional spaces. Finally, to obtain the WNO framework, the kernel is learned by parameterizing it in the wavelet domain. A notably essential component of WNO is the wavelet kernel integral layer, a pictorial description of which can be found in Fig. 1.

2.2 Multifidelity modeling
The major theme in multifidelity learning is the exploitation of correlation between the HF data, which is quite accurate but available in a smaller amount, and the LF data, which is inaccurate but is available in a larger amount. A popular autoregressive strategy for multifidelity learning [34] can be expressed as follows:
$$u_H(x) = \rho \, u_L(x) + \delta(x) \tag{4}$$
where the LF and HF data are denoted by $u_L(x)$ and $u_H(x)$, respectively, $\rho$ is a multiplicative factor that determines the correlation between the two fidelities, and $\delta(x)$ quantifies the corresponding additive correlation. However, the issue with this strategy is that it captures only the linear correlation between the LF and the HF data. In order to account for nonlinear correlation, Meng and Karniadakis [19] proposed a generic autoregressive scheme, which is as follows:
$$u_H(x) = \mathcal{F}\big(x, u_L(x)\big) \tag{5}$$
where $\mathcal{F}$ is a function, linear or nonlinear and not known a priori, that defines the mapping between the two fidelities. Further, Eq. (5) can also be expressed as,
$$u_H(x) = \mathcal{F}_l\big(x, u_L(x)\big) + \mathcal{F}_{nl}\big(x, u_L(x)\big) \tag{6}$$
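As a concrete illustration of the linear scheme in Eq. (4), the factors $\rho$ and $\delta$ can be recovered from paired LF/HF samples by least squares. The data below is synthetic and purely hypothetical (a minimal sketch with a constant additive term, not the paper's setup):

```python
import numpy as np

# Hypothetical paired low- and high-fidelity samples on a common grid
x = np.linspace(0.0, 1.0, 50)
u_L = np.sin(8.0 * x)            # stand-in low-fidelity solution
u_H = 2.0 * u_L + 0.3            # high-fidelity: rho = 2.0, delta = 0.3 (constant)

# Least-squares fit of u_H = rho * u_L + delta
A = np.column_stack([u_L, np.ones_like(u_L)])
(rho, delta), *_ = np.linalg.lstsq(A, u_H, rcond=None)
print(rho, delta)                # recovers roughly 2.0 and 0.3
```

Note that the general scheme allows a spatially varying $\delta(x)$; a constant is used here only to keep the fit two-dimensional.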
2.3 Proposed Framework
In the current work, we aim to approximate the operator in Eq. (6) using the WNO framework. To accomplish this, a small HF dataset, $\mathcal{T}_H$, containing $N_H$ pairs of HF input and output functions, is generated along with a large LF dataset, $\mathcal{T}_L$, containing $N_L$ pairs of LF input and output functions. Again, the reason for the small and large sizes is computational cost. Mathematically, the datasets can be represented as follows:

$$\mathcal{T}_H = \big\{ a_j,\; \mathcal{D}_H(a_j) \big\}_{j=1}^{N_H}, \qquad \mathcal{T}_L = \big\{ a_j,\; \mathcal{D}_L(a_j) \big\}_{j=1}^{N_L} \tag{7}$$

where the HF operator we want to learn is denoted by $\mathcal{D}_H$, the LF operator is denoted by $\mathcal{D}_L$, and $N_L \gg N_H$.
Furthermore, to enable WNO for multifidelity learning, we follow a two-step approach, as shown in Fig. 1. First, we train a WNO network (LFWNO) using the LF data. Because of the vast supply of LF data, training this low-fidelity surrogate model (LFSM) to a high degree of accuracy is a straightforward task. However, in the second step, where we train a separate WNO network (HFWNO) to learn the HF solution, additional strategies, namely supplementation of the WNO inputs with LF outputs and residual operator learning, have to be used. These approaches are described in the sections below.
2.3.1 Supplementing HFWNO inputs with the low-fidelity solution
As established in the previous section, the main goal of multifidelity learning is uncovering and exploiting the correlation between the LF solution $u_L$ and the HF solution $u_H$. In order to enable the network to effectively learn, in a data-driven fashion, the unknown operator which maps the input $a$ and $u_L$ to $u_H$, we augment the inputs to the HFWNO network with $u_L$.
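In implementation terms, the supplementation amounts to concatenating the LF solution to the HF input as an extra channel before the lifting FNN. A minimal sketch, with hypothetical array shapes:

```python
import numpy as np

n_samples, n_grid = 4, 16
a_H = np.random.rand(n_samples, n_grid, 1)   # HF input functions, one channel
u_L = np.random.rand(n_samples, n_grid, 1)   # LF solutions on the same grid

# Augmented HF-WNO input: (input function, LF solution) at every grid point
inputs = np.concatenate([a_H, u_L], axis=-1)
print(inputs.shape)                          # (4, 16, 2)
```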
2.3.2 Learning the residual operator
Instead of training HFWNO to directly learn the HF operator, residual-based learning focuses on learning the residual operator. The residual is simply the difference between the HF and LF solutions; mathematically, this can be expressed as,

$$\mathcal{R}(a, u_L) = u_H - u_L \tag{8}$$

where $\mathcal{R}$ is the residual operator. Essentially, as a strong correlation exists between the HF and LF solutions, a feature similitude, although not exact, is expected between them. Simply put, the LF solution, despite being inaccurate, preserves the rudimentary feature structure of the HF solution. Therefore, learning the residual is a comparatively easier task than learning the HF solution itself.
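A toy numerical illustration of why the residual is the easier target; the LF and HF solutions below are hypothetical closed-form stand-ins, not outputs of any solver used in the paper:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 64)
u_L = np.sin(2.0 * np.pi * x)                  # stand-in LF solution
u_H = np.sin(2.0 * np.pi * x) + 0.1 * x**2     # HF: same features plus a small discrepancy

r = u_H - u_L                                  # residual: the network's training target
u_H_rec = u_L + r                              # adding the residual back recovers u_H

# The residual has a much smaller range than the solution itself
print(np.max(np.abs(r)), np.max(np.abs(u_H)))
```

Because the residual is small and smooth relative to the full solution, a network fitting it needs to represent far less structure than one fitting $u_H$ directly.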
2.3.3 Complete Framework
To finally arrive at the complete framework for multifidelity WNO, we combine the two-step SM approach with input supplementation and residual operator learning. A pictorial representation of the complete framework is provided in Fig. 1. In practice, if we have access to an efficient LF solver, we can directly replace the LFWNO block with an LF-solver block. In the absence of an efficient LF solver, the LFWNO is first trained using the dataset $\mathcal{T}_L$. As the size of the low-fidelity dataset, $N_L$, is very large, it is possible to train LFWNO to a high level of accuracy. In the second step, we train the HFWNO by making use of both $\mathcal{T}_H$ and the LF solutions. As shown in the figure, an additional input, the LF solution $u_L$, is supplied to the network along with the input function $a$. Using a shallow FNN, these inputs are lifted to a higher dimension. The high-dimensional output from the FNN is then passed through several wavelet kernel integral layers. In each kernel integral layer, to obtain the wavelet coefficients, the inputs first undergo multilevel wavelet decomposition. The wavelet coefficients in the subband of the last level are then convolved with the neural network weights, and an inverse wavelet transform is applied to the convolved output in order to bring the dimensions back to those of the lifted inputs. Simultaneously, in parallel to the kernel integral layer, a local linear transform $W$ is applied to the lifted inputs. The output from the local linear transform is added to that from the kernel integral layer, and a suitable activation function is applied. After repetition of the same process through the rest of the wavelet kernel integral layers, the output is passed through another FNN, which transforms it back to the intended target output, the residual $r$. Finally, the LF solution is added back to the residual to obtain the desired high-fidelity solution $u_H$.

Furthermore, given that a suitable loss function $\mathcal{L}$ is selected, the training of the network can be represented as the following minimization problem:

$$\theta^{*} = \operatorname*{arg\,min}_{\theta} \; \mathcal{L}\Big( u_H,\; u_L + \mathcal{R}_{\theta}\big(a, u_L\big) \Big) \tag{9}$$

where $\mathcal{R}_{\theta}$ denotes the HFWNO approximation of the residual operator with parameters $\theta$.
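The wavelet kernel integral layer described above can be sketched in a few lines. The sketch below uses a hand-rolled 1-D Haar transform and a scalar local transform; the actual WNO uses higher-order wavelets and learned convolution weights per subband, so this is an illustrative simplification under those stated assumptions, not the paper's implementation:

```python
import numpy as np

def haar_dwt(x):
    """One level of the 1-D Haar wavelet transform (length must be even)."""
    a = (x[..., 0::2] + x[..., 1::2]) / np.sqrt(2.0)  # approximation coefficients
    d = (x[..., 0::2] - x[..., 1::2]) / np.sqrt(2.0)  # detail coefficients
    return a, d

def haar_idwt(a, d):
    """Inverse of one Haar level; exact reconstruction."""
    x = np.empty(a.shape[:-1] + (2 * a.shape[-1],))
    x[..., 0::2] = (a + d) / np.sqrt(2.0)
    x[..., 1::2] = (a - d) / np.sqrt(2.0)
    return x

def wavelet_kernel_layer(v, weights_a, W, levels=2):
    """One layer: decompose, weight the last-level approximation band,
    reconstruct, add a local linear transform, then activate.
    Assumes len(v) is divisible by 2**levels."""
    details, a = [], v
    for _ in range(levels):
        a, d = haar_dwt(a)
        details.append(d)
    a = a * weights_a                      # 'convolution' with learnable weights
    for d in reversed(details):
        a = haar_idwt(a, d)
    z = a + W * v                          # local linear (here scalar) transform
    return np.tanh(z)                      # nonlinear activation
```

With identity weights and `W = 0`, the layer reduces to `tanh` of its input, which makes the exactness of the Haar round trip easy to check.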
3 Numerical Implementation and Results
In this section, we analyze the performance of the proposed approach on five different example problems. The first two are artificial benchmark problems, the third is a stochastic ordinary differential equation, the fourth is a stochastic partial differential equation (SPDE) on an irregular domain, and the final problem is again an SPDE, on which we also demonstrate the uncertainty quantification capabilities of the proposed framework. To evaluate the accuracy of the proposed approach, we use three metrics for error quantification: mean-squared error (MSE), absolute error, and the coefficient of determination ($R^2$ score). First, the mean-squared error is evaluated as,

$$\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \big( \hat{u}^{(i)} - u_H^{(i)} \big)^2 \tag{10}$$

where $\hat{u}$ is the solution predicted by the multifidelity surrogate model (MFSM) and $u_H$ is the high-fidelity solution. Second, the $R^2$ score is computed as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \big( u_H^{(i)} - \hat{u}^{(i)} \big)^2}{\sum_{i=1}^{N} \big( u_H^{(i)} - \bar{u}_H \big)^2} \tag{11}$$

where $\bar{u}_H$ denotes the mean of the high-fidelity solution.
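For reference, the two quantitative metrics can be computed directly as follows (a plain transcription of Eqs. (10) and (11)):

```python
import numpy as np

def mse(u_pred, u_true):
    """Mean-squared error between predicted and reference solutions."""
    return np.mean((u_pred - u_true) ** 2)

def r2_score(u_pred, u_true):
    """Coefficient of determination (R^2 score)."""
    ss_res = np.sum((u_true - u_pred) ** 2)
    ss_tot = np.sum((u_true - np.mean(u_true)) ** 2)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives MSE 0 and $R^2$ 1, while predicting the mean of the reference data gives $R^2$ 0.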
3.1 Problem Set I: Artificial Benchmarks
3.1.1 Artificial Benchmark I: 1-dimensional problem, correlation with input
To demonstrate the multifidelity learning abilities of the proposed framework, we consider a problem where the correlation between the HF and LF solutions is contingent upon the input function:
(12)  
where and . Furthermore, we can represent the equation for HF solution as follows:
(13)  
The actual LF and HF solutions are presented in Fig. 2. An MFSM is trained using HFWNO on multifidelity datasets (where the LF output solution is concatenated with the HF inputs) of four different sizes. For comparison, we also train a high-fidelity surrogate model (HFSM) using vanilla WNO on HF datasets of the same sizes. In Table 1, we provide the MSE on test datasets for models trained using the different training dataset sizes, along with the MSE between the HF and LF solutions. We also present, in Fig. 2, the absolute error plots between the exact HF solution and the predicted solutions from HFSM and MFSM, along with the absolute error between the exact HF solution and the LF solution.
Size  MSE  

MFSM  HFSM  
8.8132  2.2164  
8.9857  3.5422  
5.9807  1.9693  
4.8607  3.4449  
3.4780 
Clearly, as evident from Table 1, the MFSM, while predicting on the test dataset, outperforms the HFSM by two orders of magnitude, while also providing a significant improvement over the LF output. Furthermore, Fig. 2 shows the consistently outstanding performance of MFSM against HFSM on a single test case for the different training dataset sizes. It can therefore be confidently stated that the MFSM does an excellent job of capturing the desired correlation.
3.1.2 Artificial Benchmark II: 2-dimensional problem with a nonlinear correlation
As the next benchmark problem, we subject our network to learning the complicated nonlinear correlation between the HF and LF solutions given by the following equations,
(14)  
where and . Again, we construct an MFSM and an HFSM for this case as well. Different models are trained on datasets of four different sizes, and the results are presented in Fig. 3. In addition, the exact HF and LF solutions are presented in Fig. 3.1.2. Further, the MSE values for the MFSM and HFSM predictions, along with the MSE between the HF and LF solutions, are presented in Table 2.
It can be seen from Table 2 that the error for MFSM predictions is three orders of magnitude lower than for HFSM predictions for training dataset sizes and , while for sizes and it is two orders of magnitude lower. Furthermore, the results in Fig. 3 demonstrate the excellent performance of MFSM with respect to HFSM for all training dataset sizes. These results show that our model has successfully discovered the complex nonlinear correlation between the LF and HF solutions and is capable of predicting with a high degree of accuracy.
Size  MSE  

MFSM  HFSM  
1.2043  1.7359  
1.3134  7.7837  
3.5792  3.3641  
3.3927  2.3504  
3.2892 
3.2 Problem Set II: Stochastic Poisson’s equation and Darcy flow in an irregular domain
3.2.1 Stochastic Poisson’s equation
The following ordinary differential equation (ODE) is known as the 1-dimensional Poisson's equation,

$$\frac{\mathrm{d}^2 u(x)}{\mathrm{d} x^2} = f(x) \tag{15}$$
where $f(x)$ is the spatially varying forcing term. In the current example, we model the forcing term as a Gaussian random field (GRF),

$$f(x) \sim \mathcal{GP}\big(0,\; k(x, x')\big) \tag{16}$$

where the mean is zero and the covariance function is modeled using a Gaussian kernel, $k(x, x') = \exp\big( -(x - x')^2 / 2l^2 \big)$, with length-scale parameter $l$. The goal of this problem is to learn the operator mapping from the stochastic forcing term to the solution of the ODE:

$$\mathcal{D}: f(x) \mapsto u(x) \tag{17}$$

A randomly sampled instance is provided in Fig. 4. The solver for Poisson's equation is based on the finite difference method. The HF and LF datasets are generated using a finer and a coarser grid, respectively. The finite difference solver used for data generation is based on the code provided by Lu et al. [32]. For this example as well, we train different MFSMs and HFSMs using training datasets of four different sizes. The MSE values for the MFSM and HFSM predictions for all training dataset sizes are presented in Table 3.
Size  MSE  

MFSM  HFSM  
3.2329  2.5602  
5.1759  6.3312  
7.8373  1.8202  
1.4672  1.3396  
2.0642 
It is clear from Table 3 that the MSE values for MFSM test predictions are two orders of magnitude lower than those for HFSM's predictions for training dataset sizes of and , while a decrease in error by one order of magnitude is found for sizes and . Furthermore, in general, the MSE value for the MFSM prediction is better than the output from the LF solver by an order of magnitude. The excellent performance on a single test case can also be assessed from Fig. 4.
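The fine-grid/coarse-grid data generation for this example can be sketched as follows. The grid sizes and length scale below are hypothetical, homogeneous Dirichlet boundary conditions are assumed, and the ODE is taken in the form $\mathrm{d}^2u/\mathrm{d}x^2 = f(x)$ on $[0, 1]$ (assumptions, since the paper's exact settings are not reproduced here):

```python
import numpy as np

def sample_grf(x, length_scale=0.1, seed=0):
    """Zero-mean GRF sample with a Gaussian (RBF) covariance kernel."""
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / length_scale**2)
    L = np.linalg.cholesky(K + 1e-6 * np.eye(len(x)))  # jitter for stability
    return L @ np.random.default_rng(seed).standard_normal(len(x))

def solve_poisson(f, x):
    """Finite-difference solve of d2u/dx2 = f(x), with u = 0 at both ends (assumed BCs)."""
    n, h = len(x), x[1] - x[0]
    A = (np.diag(-2.0 * np.ones(n - 2))
         + np.diag(np.ones(n - 3), 1)
         + np.diag(np.ones(n - 3), -1)) / h**2
    u = np.zeros(n)
    u[1:-1] = np.linalg.solve(A, f[1:-1])
    return u

# High-fidelity data on a fine grid, low-fidelity data on a coarse subgrid
x_fine = np.linspace(0.0, 1.0, 99)       # hypothetical fine mesh
x_coarse = x_fine[::7]                   # hypothetical coarse mesh
f = sample_grf(x_fine)
u_fine = solve_poisson(f, x_fine)
u_coarse = solve_poisson(f[::7], x_coarse)
```

Solving the same sampled forcing on both grids yields one paired (LF, HF) training example; repeating with fresh GRF draws builds the two datasets.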
3.3 2-dimensional Darcy flow in a triangular domain with a notch
For this example, we consider a 2-dimensional nonlinear elliptic Darcy flow PDE given as,

$$-\nabla \cdot \big( K(x, y)\, \nabla p(x, y) \big) = f(x, y) \tag{18}$$
where $K$ is the permeability field, $p$ is the pressure, and $f$ is the source term. The Darcy flow equation has been widely used to model flow through porous media, and it finds substantial applications in fields such as geotechnical, civil, and petroleum engineering. For the current example, a triangular domain with a rectangular notch is considered. The boundary conditions for this problem are generated using a Gaussian process. Furthermore, the covariance kernel is modeled as follows:

$$\mathcal{K}(x, x') = \exp\left( -\frac{\| x - x' \|^2}{2 l^2} \right) \tag{19}$$

Here, $l$ is the kernel length-scale parameter. Our objective with this problem is to learn the operator mapping from the boundary conditions to the pressure field in the whole domain. This can be expressed mathematically as,

$$\mathcal{D}: p\big|_{\partial \Omega} \mapsto p(x, y) \tag{20}$$
The training data is generated using the MATLAB PDE Toolbox by modifying the code provided by Lu et al. [30]. The LF and HF datasets are generated using coarse and fine grids, respectively; a pictorial representation of the two grids is provided in Fig. 5. Furthermore, HFSMs and MFSMs are trained using datasets of four different sizes. The MSE values for the models trained using the different dataset sizes are provided in Table 4.
Size  MSE  

MFSM  HFSM  
7.8612  4.7026  
1.2727  8.4175  
2.7316  1.9218  
5.6001  6.2372  
2.6884 
Table 4 reveals that the predictions made by MFSM, in general, have MSE values a couple of orders of magnitude lower than the predictions from HFSM for all training dataset sizes. It also reveals that the MFSMs provide an improvement in error values by two orders of magnitude for training dataset size and a one-order improvement for the rest of the dataset sizes. Furthermore, the outstanding prediction capabilities are made evident by the pictorial comparison of the MFSM and HFSM predictions with the exact HF solution, or ground truth, in Fig. 6.
3.4 Problem Set III: 2-dimensional stochastic steady-state heat equation with application to uncertainty quantification
As our final problem, we consider the following 2-dimensional elliptic PDE,

$$-\nabla \cdot \big( D(x, y)\, \nabla u(x, y) \big) = 0 \tag{21}$$
where $D(x, y)$ is the diffusion coefficient and $u(x, y)$ is the solution field. The equation is defined on a unit square domain and is subjected to the following boundary conditions:
(22) 
Furthermore, to capture the associated uncertainty, the diffusion coefficient is modeled as a lognormal random field, which can be expressed as follows:

$$\log D(x, y) = z(x, y), \qquad z \sim \mathcal{GP}\big( m(\cdot),\; k(\cdot, \cdot) \big) \tag{23}$$

where $m$ denotes the mean function and $k$ the covariance function; that is, the logarithm of the diffusion coefficient is modeled as a GRF. The covariance kernel can be represented as follows:
(24) 
The covariance kernel's parameters and are set equal to . Furthermore, the dimensionality of the current problem is . Our aim with this problem is twofold. Firstly, we want to learn the operator mapping from the logarithm of the diffusion coefficient to the solution of the PDE using multifidelity data, which can be mathematically written as follows:
$$\mathcal{D}: \log D(x, y) \mapsto u(x, y) \tag{25}$$
Secondly, we want to use the surrogate constructed using the multifidelity learning approach for solving an uncertainty propagation (UP) problem. Furthermore, to generate the LF and HF data, a finite volume (FV) solver provided by Tripathy and Bilionis [35] is modified, and the stochastic steady-state equation is solved on a coarser and a finer grid for the LF and HF cases, respectively.
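Sampling a lognormal diffusion field of this kind can be sketched as below. The grid resolution and length scale are hypothetical, and a squared-exponential kernel is used purely for illustration (the paper's exact kernel and parameters are not reproduced here):

```python
import numpy as np

n = 16
xs, ys = np.meshgrid(np.linspace(0.0, 1.0, n), np.linspace(0.0, 1.0, n))
pts = np.column_stack([xs.ravel(), ys.ravel()])

# Squared-exponential covariance over all grid points (hypothetical length scale 0.2)
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * d2 / 0.2**2)

L = np.linalg.cholesky(K + 1e-6 * np.eye(n * n))          # jitter for stability
z = L @ np.random.default_rng(3).standard_normal(n * n)   # zero-mean Gaussian field
D = np.exp(z).reshape(n, n)                               # lognormal diffusion coefficient
```

Exponentiating the Gaussian field guarantees a strictly positive diffusion coefficient, which is what makes the lognormal model physically admissible here.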
3.4.1 Surrogate construction using MFWNO
We train MFSMs using multifidelity data with MFWNO and HFSMs using only HF data with vanilla WNO, for several training dataset sizes. In order to visually compare the outputs from MFSM and HFSM for the different training dataset sizes, a pictorial presentation of the predicted solutions, errors, actual HF solutions (ground truth), and the associated input field, i.e., the logarithm of the diffusion coefficient, is provided in Fig. 7. Furthermore, for all the different dataset sizes, the MSE values for the predictions of MFSM and HFSM on unseen test datasets are provided in Table 5.
Size  MSE  

MFSM  HFSM  
1.4001  3.0188  
2.7293  5.5890  
4.5502  1.3692  
7.2992  2.5784  
1.3793 
From Table 5, it is obvious that MFSM outperforms HFSM for all training dataset sizes. Furthermore, there is a decrease in MSE by two orders of magnitude for MFSM as compared to HFSM for training dataset sizes and , while there is a decrease by one order of magnitude for sizes and . Moreover, Fig. 7 clearly shows the superiority of MFSM over HFSM for all sizes of datasets used for model training.
3.4.2 Solving uncertainty propagation problem with MFWNO
After successfully constructing an SM using multifidelity learning, we move on to our second goal: applying the developed framework to the solution of a UP problem. To this end, we draw samples from an input distribution whose length-scales are similar to the ones used for training the different models. Here, the aim is to estimate the following statistics for the predictions obtained using MFSM and HFSM:

Mean of MFSM and HFSM predictions.

Variance of MFSM and HFSM predictions.

Probability density of the solutions predicted by MFSM and HFSM at and .
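These statistics can be estimated by plain Monte Carlo over the surrogate's predictions. The sketch below uses a hypothetical closed-form stand-in for the trained surrogate at one spatial point, so the numbers are illustrative only:

```python
import numpy as np

def surrogate(z):
    """Hypothetical stand-in for the trained MFSM evaluated at one spatial point."""
    return 1.0 + 0.5 * z**2

rng = np.random.default_rng(42)
samples = rng.standard_normal(20_000)   # draws from the input distribution
preds = surrogate(samples)

mc_mean = preds.mean()                  # Monte Carlo estimate of the mean
mc_var = preds.var()                    # Monte Carlo estimate of the variance
```

For this stand-in, with a standard normal input, the exact mean is 1.5 and the exact variance is 0.5; the Monte Carlo estimates converge to these at the usual $1/\sqrt{N}$ rate. A histogram of `preds` likewise approximates the pointwise probability density.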
Furthermore, we compare the obtained mean and variance with the ground truth, i.e., the actual PDE solution, for models trained with the four different dataset sizes. The results for the mean and variance are presented in Fig. 8. Finally, the $R^2$ score and the MSE for each case are presented in Table 6 and Table 7.
Size  MFSM  HFSM  

MSE  score  MSE  score  
1.6431  0.9998  1.2397  0.9984  
2.7280  0.9996  3.4613  0.9956  
4.1129  0.9995  1.8670  0.9767  
7.6062  0.9990  5.1008  0.9365 
Size  MFSM  HFSM  

MSE  score  MSE  score  
5.2626  0.9950  4.8505  0.5439  
9.0885  0.9914  6.6926  0.3707  
2.7682  0.9739  2.1539  1.0251  
3.1166  0.9706  2.2706  1.1348 
Fig. 8 and Tables 6 and 7 reveal that MFSM consistently captures the mean and variance to a high degree of accuracy, with $R^2$ scores for the mean and variance greater than and , respectively, for all models trained on datasets of different sizes. Furthermore, it is also clear that MFSM captures the mean and variance more accurately than HFSM. It should be noted that although the disparity between MFSM and HFSM in capturing the mean is not very large, the performance of MFSM in capturing the variance is far superior. Moreover, we also compare the probability density function (PDF) of the solution predicted by MFSM and HFSM with the PDF of the exact HF solution obtained from the FV solver at two spatial grid points, and . The results are presented in Fig. 9, and they indicate a considerably better match with the PDF of the exact HF solution for the MFSM prediction than for the HFSM prediction.

3.4.3 Comparison of LFWNO and LF solver for the supply of low-fidelity data
It is possible that although LF data is available, we might not have access to the LF solver. In such cases, it would be challenging to generate the new LF solutions required for predicting HF solutions with HFWNO for previously unseen input functions. However, in these cases, we can always train a surrogate using LFWNO and the available data. Therefore, to establish that LFWNO can easily replace an LF solver, in this section we compare the utilization of a low-fidelity surrogate model (LFSM), trained using LFWNO on a training dataset of size , with the LF solver as the source of low-fidelity solution data, which is then used for residual learning and input supplementation in HFWNO. To evaluate the accuracy of the LFSM, we first compute the MSE between the LFSM predictions and the actual LF solutions for a given test set; this MSE is found to be . Furthermore, we evaluate the MSE between the LF solution predicted by the LFSM for an unseen test set and the corresponding exact HF solution, and draw a comparison by performing a similar evaluation for the corresponding LF solutions from the LF solver. The MSE values obtained for the LFSM predictions and the LF-solver solutions with respect to the exact HF solution are and , respectively. Clearly, even for this training dataset size, the MSE values on the test set indicate that the surrogate-predicted and actual LF solutions match each other very closely. The success of the LFSM in accurately mimicking the LF solution is further illustrated by the PDF plots at points and for the LFSM and the actual solution from the LF solver in Fig. 10.
Furthermore, we also develop an MFSM on a training dataset of a size similar to that in section 3.4.1. However, unlike the MFSM in section 3.4.1, where the LF solver was directly used to supply the LF solution to HFWNO, here we use the LF solution predictions from LFWNO as inputs to HFWNO. The resulting MSE obtained for this MFSM on a test set is , compared to an MSE value of for the MFSM in section 3.4.1. This again indicates that, owing to the ability of the LFSM to become highly accurate with sufficient training data, the result obtained from an MFSM trained using LF data from the LFSM is similar to that from an MFSM trained using LF data directly from the LF solver. It should be noted that the accuracy of the LFSM would continue to increase with a further increase in training dataset size. Finally, as another comparison, we perform the UP study with the MFSM developed using LFWNO in a fashion similar to section 3.4.2. For contrast in solving the UP problem, we present the PDFs obtained using the LFWNO-based MFSM, the LF-solver-based MFSM, and the actual HF FV solver in Fig. 11 for two spatial points, and .
In view of all the results presented in this section, it can be safely stated that an LFWNO can accurately and efficiently replace an LF solver.
4 Conclusions
In this work, we proposed a novel framework for WNO which allows accurate and efficient learning from multifidelity datasets. The framework was developed by coupling two separate WNO networks with the help of residual operator learning and input supplementation. The proposed approach rests on utilizing an inexpensive-to-generate LF dataset of large size together with an expensive-to-generate HF dataset of small size.
The performance of the proposed framework was assessed on several problems, including artificial benchmarks and complex PDEs, and the results consistently showed that the surrogate model constructed using MFWNO provides predictions that are at least one order of magnitude, and in many cases multiple orders of magnitude, more accurate than both the LF solution and the predictions made by vanilla WNO trained on only the HF dataset. In addition, the proposed approach does extremely well at uncovering the correlations between the LF and HF data, even when they are nonlinear and complex. Furthermore, the framework successfully tackles problems with irregular domains. From the results obtained for the stochastic heat equation problem, it is observed that our framework is robust and accurate for surrogate modeling of high-dimensional PDEs in the low-data limit. Moreover, the versatility of the proposed model was evaluated by using the developed surrogates for uncertainty quantification, and the obtained results strongly attest to the model's success in completing the intended objective. Even more noteworthy is that the higher accuracy was consistent across all the MFSMs trained using different dataset sizes, and that the predicted probability density functions at different spatial locations provided an almost exact match with the PDF obtained from the HF solution, despite the framework being frequentist and trained on a very small amount of HF data. A possible direction for future work could be the extension of the proposed framework by incorporating physics.
Acknowledgements
T. Tripura acknowledges the financial support received from the Ministry of Human Resource Development (MHRD), India in form of the Prime Minister’s Research Fellows (PMRF) scholarship. S. Chakraborty acknowledges the financial support received from Science and Engineering Research Board (SERB) via grant no. SRG/2021/000467 and seed grant received from IIT Delhi.
Data availability
On acceptance, all the datasets used in this study will be made public on GitHub by the corresponding author.
Code availability
On acceptance, all the source codes needed to reproduce the results in this study will be made publicly available on GitHub by the corresponding author.
References
 [1] H Babaee, C Bastidas, M DeFilippo, C Chryssostomidis, and GE Karniadakis. A multifidelity framework and uncertainty quantification for sea surface temperature in the massachusetts and cape cod bays. Earth and Space Science, 7(2):e2019EA000954, 2020.
 [2] Yuichi Kuya, Kenji Takeda, Xin Zhang, and Alexander IJ Forrester. Multifidelity surrogate modeling of experimental and computational aerodynamic data sets. AIAA Journal, 49(2):289–298, 2011.
 [3] Xinshuai Zhang, Fangfang Xie, Tingwei Ji, Zaoxu Zhu, and Yao Zheng. Multifidelity deep neural network surrogate model for aerodynamic shape optimization. Computer Methods in Applied Mechanics and Engineering, 373:113485, 2021.
 [4] Mohammadamin Mahmoudabadbozchelou, Marco Caggioni, Setareh Shahsavari, William H Hartt, George Em Karniadakis, and Safa Jamali. Datadriven physicsinformed constitutive metamodeling of complex fluids: A multifidelity neural network (mfnn) framework. Journal of Rheology, 65(2):179–198, 2021.
 [5] Casey M Fleeter, Gianluca Geraci, Daniele E Schiavazzi, Andrew M Kahn, and Alison L Marsden. Multilevel and multifidelity uncertainty quantification for cardiovascular hemodynamics. Computer Methods in Applied Mechanics and Engineering, 365:113030, 2020.
 [6] Souvik Chakraborty, Tanmoy Chatterjee, Rajib Chowdhury, and Sondipon Adhikari. A surrogate based multifidelity approach for robust design optimization. Applied Mathematical Modelling, 47:726–744, 2017.
 [7] Souvik Chakraborty and Rajib Chowdhury. An efficient algorithm for building locally refined hp-adaptive H-PCFE: Application to uncertainty quantification. Journal of Computational Physics, 351:59–79, 2017.
 [8] Sandip Dey, Souvik Chakraborty, and Solomon Tesfamariam. Multifidelity approach for uncertainty quantification of buried pipeline response undergoing fault rupture displacements in sand. Computers and Geotechnics, 136:104197, 2021.
 [9] Akshay Thakur and Souvik Chakraborty. Deep capsule encoder-decoder network for surrogate modeling and uncertainty quantification. arXiv preprint arXiv:2201.07753, 2022.
 [10] Souvik Chakraborty and Rajib Chowdhury. Corrigendum to "A semi-analytical framework for structural reliability analysis" [Comput. Methods Appl. Mech. Engrg. 289 (2015) 475–497]. Computer Methods in Applied Mechanics and Engineering, 298:576, 2016.
 [11] Souvik Chakraborty and Dipaloke Majumder. Hybrid reliability analysis framework for reliability analysis of tunnels. Journal of Computing in Civil Engineering, 32(4):04018018, 2018.
 [12] Somdatta Goswami, Souvik Chakraborty, Rajib Chowdhury, and Timon Rabczuk. Threshold shift method for reliability-based design optimization. Structural and Multidisciplinary Optimization, 60(5):2053–2072, 2019.
 [13] Roberto Vitali, Raphael T Haftka, and Bhavani V Sankar. Multifidelity design of stiffened composite panel with a crack. Structural and Multidisciplinary Optimization, 23(5):347–356, 2002.
 [14] Alexander IJ Forrester, András Sóbester, and Andy J Keane. Multifidelity optimization via surrogate modelling. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 463(2088):3251–3269, 2007.
 [15] Paris Perdikaris, Maziar Raissi, Andreas Damianou, Neil D Lawrence, and George Em Karniadakis. Nonlinear information fusion algorithms for data-efficient multifidelity modelling. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 473(2198):20160751, 2017.
 [16] Maziar Raissi, Paris Perdikaris, and George Em Karniadakis. Inferring solutions of differential equations using noisy multifidelity data. Journal of Computational Physics, 335:736–746, 2017.
 [17] Andres S Padron, Juan J Alonso, and Michael S Eldred. Multifidelity methods in aerodynamic robust optimization. In 18th AIAA Non-Deterministic Approaches Conference, page 0680, 2016.
 [18] Liang Yan and Tao Zhou. Adaptive multifidelity polynomial chaos approach to Bayesian inference in inverse problems. Journal of Computational Physics, 381:110–128, 2019.
 [19] Xuhui Meng and George Em Karniadakis. A composite neural network that learns from multifidelity data: Application to function approximation and inverse PDE problems. Journal of Computational Physics, 401:109020, 2020.
 [20] Souvik Chakraborty. Transfer learning based multifidelity physics informed deep neural network. Journal of Computational Physics, 426:109942, 2021.
 [21] Subhayan De, Jolene Britton, Matthew Reynolds, Ryan Skinner, Kenneth Jansen, and Alireza Doostan. On transfer learning of neural networks using bi-fidelity data for uncertainty propagation. International Journal for Uncertainty Quantification, 10(6), 2020.
 [22] Subhayan De and Alireza Doostan. Neural network training using ℓ1-regularization and bi-fidelity data. Journal of Computational Physics, 458:111010, 2022.
 [23] Dong H Song and Daniel M Tartakovsky. Transfer learning on multifidelity data. arXiv preprint arXiv:2105.00856, 2021.
 [24] Dehao Liu and Yan Wang. Multifidelity physics-constrained neural network and its application in materials modeling. Journal of Mechanical Design, 141(12), 2019.
 [25] Lu Lu, Pengzhan Jin, and George Em Karniadakis. DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193, 2019.
 [26] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020.
 [27] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Neural operator: Graph kernel network for partial differential equations. arXiv preprint arXiv:2003.03485, 2020.
 [28] Tapas Tripura and Souvik Chakraborty. Wavelet neural operator: a neural operator for parametric partial differential equations. arXiv preprint arXiv:2205.02191, 2022.
 [29] Huaiqian You, Yue Yu, Marta D'Elia, Tian Gao, and Stewart Silling. Nonlocal kernel network (NKN): a stable and resolution-independent deep neural network. arXiv preprint arXiv:2201.02217, 2022.
 [30] Lu Lu, Xuhui Meng, Shengze Cai, Zhiping Mao, Somdatta Goswami, Zhongqiang Zhang, and George Em Karniadakis. A comprehensive and fair comparison of two neural operators (with practical extensions) based on FAIR data. Computer Methods in Applied Mechanics and Engineering, 393:114778, 2022.
 [31] Subhayan De, Malik Hassanaly, Matthew Reynolds, Ryan N King, and Alireza Doostan. Bi-fidelity modeling of uncertain and partially unknown systems using DeepONets. arXiv preprint arXiv:2204.00997, 2022.
 [32] Lu Lu, Raphaël Pestourie, Steven G Johnson, and Giuseppe Romano. Multifidelity deep neural operators for efficient learning of partial differential equations with application to fast inverse design of nanoscale heat transport. arXiv preprint arXiv:2204.06684, 2022.
 [33] Amanda A Howard, Mauro Perego, George E Karniadakis, and Panos Stinis. Multifidelity deep operator networks. arXiv preprint arXiv:2204.09157, 2022.
 [34] MC Kennedy and A O'Hagan. Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87(1):1–13, 2000.
 [35] Rohit K Tripathy and Ilias Bilionis. Deep UQ: Learning deep neural network surrogate models for high dimensional uncertainty quantification. Journal of Computational Physics, 375:565–588, 2018.