In machine learning, models are unknown and must be constructed from data, while in computational fluid dynamics (CFD), models are known and well-defined by a set of partial differential equations. However, the solution—the distribution in space and time of fluid state variables such as density, velocity, momentum, and energy—is unknown until a simulation of the model has converged, which can take days or weeks on a supercomputer for a single design. Therefore, automatically iterating through new designs for optimization is often infeasible. What if engineers could predict solutions in seconds, given the information they already have from the simulations they have already run?
Reduced order models (ROMs) are the state-of-the-art model-based methods that address this problem in industry. ROMs solve the governing equations after projecting them onto a smaller manifold and performing some approximations, using previous simulations to construct a basis. However, ROMs have the following issues: they require full knowledge of the model used to generate the dataset, and, for highly nonlinear problems, they may not improve computation time. Most importantly, the main roadblock for the application of ROMs in industry is intrusiveness. The engineer must perform a non-trivial integration of the ROM with the full order simulation software—often a decades-old codebase—which requires high-level expertise in both the particular design application and in ROMs.
The problem has also been addressed in industry using (1) surrogate models of the simulation dynamics, which are often nearly as computationally expensive as the original simulation and (2) approximations for the scalar quantities of interest using surface fitting techniques, which are fast but often highly inaccurate. All current approaches have also been investigated in combination with neural networks with the aim of improving speed or accuracy, but such approaches usually have the same drawbacks as the original approach with which they were hybridized.
Here we address the problem by training stand-alone neural networks to learn the functions that approximate full solutions. On two different CFD test cases—a 1D Burgers’ equation and a 2D shock bubble—we reformulate the problems to make more efficient use of the data, and we show that fully connected feed-forward networks perform reasonably well, with some room for improvement. Then we introduce a new neural network architecture, the cluster network, with paired function and context networks, which addresses some limitations of the fully connected network. The cluster network has a stronger inductive bias: that the solutions we are approximating are made up of a small number of local, simple functions. We show that the cluster network competes with state-of-the-art ROMs in accuracy while running orders of magnitude faster. We conclude that the high accuracy needed for industry adoption of the method is within reach, and that future work could lead to a full process for aerodynamic design optimization.
2 Background and Related Work
Research at the intersection of CFD and machine learning has accelerated in the past few years as new methods based mainly on computation graphs have been investigated with the goal of obtaining fluid dynamics solutions more quickly and accurately. Fluid dynamics introduces a variety of its own unique datasets and problems where machine learning can potentially apply. In the following sections, we map out the areas of intersection between CFD and machine learning and we delineate where our proposed method fits. Our approach lies in a potentially high-impact area of intersection that is not yet well explored, and where our model-free method competes with current model-based technology.
2.1 Model-Based Methods
2.1.1 Full Order Models
CFD Solvers: Industry CFD solvers simulate the Navier-Stokes equations—a set of partial differential equations governing fluid motion. In this field, the umbrella term fluid refers to both liquids and gases. Today’s high fidelity CFD solvers allow for deeper analysis and better aerodynamic design performance by avoiding the month-to-year long costs of performing physical experiments. Methods for improving the approximations made in the physical models and optimizing the codes have been investigated for decades with the goal of accelerating simulations. However, due to the high dimensionality required to resolve small scale physics, even the most highly optimized solver may take weeks on a supercomputer, which hinders investigations in design and engineering applications. Therefore, machine learning methods for dramatically reducing the computational cost of obtaining solutions while preserving sufficient accuracy are an active and high impact area of research within the aerospace field.
NN Solvers: Neural network (NN) architectures can be constructed to perform steps that are similar to a CFD solver after training on simulation data. Early work using neural networks to solve partial differential equations (PDEs) with a focus on fluid dynamics has expanded over time (Lagaris et al., 1998; Parisi et al., 2003; Tsoulos et al., 2009; Mall & Chakraverty, 2013). Eventually, these types of solvers matured into physics-informed surrogate deep learning models (Raissi & Karniadakis, 2018; Raissi et al., 2019). These iterative neural network solvers do not gain much in terms of speed, with simulation times on the same order of magnitude as the original solver. They are mainly a proof of concept that neural networks can approximate CFD solver computations, sacrificing very little in terms of accuracy. However, these methods struggle to compete with the more than 50 years of optimization behind standard PDE solvers. Instead, methods in PDE solving may be more useful to the machine learning community for faster neural network training (Chen et al., 2018).
CFD Solver / NN Solver Hybrids: In CFD and NN hybrid technology, the neural network learns terms that are hard for the CFD solver or unknown, like constitutive material models (Javadi et al., 2003) or turbulence closure terms (Ling et al., 2016). These model additions can improve the accuracy of simulation, while incurring some computational cost.
CNN Solvers: Convolutional neural network (CNN) based surrogate models are a clearer win for neural networks—methods in (Guo et al., 2016) and (Stoecklein et al., 2017) were found to dramatically improve solver CPU time, while sacrificing very little accuracy. The CNN surrogate models tend to be a couple orders of magnitude faster than a standard solver. However, the drawback of these methods is that they can only be applied to spatially uniform grids—a very small, almost nonexistent subset of aerospace industry problems, as visualized in Figure 2. One field where these CNN based methods are usefully applied is the acceleration of computer graphics simulations for Eulerian fluids (Tompson et al., 2016).
2.1.2 Reduced Order Models
ROMs: ROMs bring together techniques from machine learning and PDE solving to create a strategy for reducing the computational cost of CFD while retaining high accuracy (Amsallem & Farhat, 2008). ROMs simulate the original governing equations after projecting them into a low dimensional solution manifold, a basis, and by introducing other approximations into the governing equations, often reducing computation cost by an order of magnitude or more. ROMs use knowledge gained from previous simulations to construct the basis, and can begin to approximate solutions well starting from a handful of examples—hence, they are often useful for industry problems where getting extremely large CFD datasets for training is infeasible (Carlberg et al., 2011; Zahr et al., 2018). ROMs also utilize a greedy point selection method for choosing more design parameters at which to run full order simulations in order to explore the design parameter space, which allows them to be used for optimization problems. But ROMs have serious issues: for highly nonlinear problems, ROMs struggle with accuracy and prohibitively high computational costs on the order of the full order model. More importantly, they are intrusive. The ROM must be integrated into the governing PDE within an existing codebase, so even for problems where they may apply well, they are often not attempted.
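To make the projection step concrete, the low-dimensional basis a ROM projects onto is typically built by proper orthogonal decomposition (POD) of prior solution snapshots. The sketch below is an illustration of that generic construction, not code from any of the cited works; the snapshot data and rank are made up.

```python
import numpy as np

def pod_basis(snapshots, rank):
    """snapshots: (n_dof, n_snapshots) matrix, one prior solution per column."""
    # Left singular vectors give an orthonormal basis ordered by captured energy.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :rank]

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8))   # stand-in: 8 prior solutions, 100 DOFs each
V = pod_basis(X, 3)                 # reduced basis with 3 modes
# Reduced coordinates of a state q: V.T @ q ; reconstruction: V @ (V.T @ q)
```

The intrusive part, which this sketch omits, is substituting the reduced representation back into the governing equations inside the existing solver codebase.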
ROM / NN Hybrids: In hybrid models, combining ROMs and neural networks, the neural network is trained to model the terms that are hard for the ROM or unknown, resulting in more accurate ROMs with some computational overhead. Some methods may learn the projection errors (San & Maulik, 2018) or employ neural networks to accurately approximate the coefficients of the reduced model (Hesthaven & Ubbiali, 2018). However, these hybrid methods still suffer the most critical shortcomings of ROMs.
2.2 Model-Free Methods
2.2.1 Quantity of Interest Approximation
Surface Fitting: Surface fitting on a quantity of interest (QoI) simply maps design parameters directly to QoIs using linear or non-linear fitting. These methods are non-intrusive and simple, but inaccurate (Chilenski et al., 2015). Given large datasets, neural networks have been investigated as methods for mapping, and under certain conditions where large enough datasets are available, they improve upon more conventional curve fitting methods (Trizila et al., 2011). Along with running as many full order models as computationally feasible, some form of surface fitting is one of the most common methods employed for design optimization in industry. Surface fitting is employed where state-of-the-art methods like ROMs are not expected to improve simulation time, or where the intrusiveness is too prohibitive.
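A minimal sketch of the surface-fitting baseline described above: fit a low-order polynomial mapping a design parameter directly to a scalar QoI, then evaluate it at an unseen parameter. The parameter and QoI values here are invented placeholders, not data from the paper.

```python
import numpy as np

params = np.array([1.0, 3.0, 4.0])       # design parameters from training runs
qoi    = np.array([0.52, 0.31, 0.27])    # QoI measured from each full solution

coeffs  = np.polyfit(params, qoi, deg=2) # a "QoI Poly Fit 2"-style baseline
predict = np.poly1d(coeffs)
estimate = predict(2.0)                  # QoI estimate at an unseen parameter
```

This is fast (one polynomial evaluation per query) but, as noted, can be highly inaccurate when the QoI varies nonlinearly across the design space.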
2.2.2 Full Order Solution Approximation
Solution approximation using neural networks is proposed in this paper for learning from few examples. A very large CFD dataset is not required—only a handful of example solutions. Rather than accelerating CFD simulations or learning an iterative method for simulating the physics, the method uses a simple network to learn the function that approximates the CFD solutions for a particular test case in space and time in a way that generalizes well from only a few examples. Using this method, solutions for a very large number of design parameters can be computed nearly instantaneously with one forward pass of a small network, beating ROMs in speed, and competing with ROMs in accuracy. This method is unlike surface fitting in that the network does not only learn a QoI for a given design parameter, but the entire solution, from which any QoI can be computed. Because the QoI is learned and computed in the context of the surrounding fluid behavior, our approach is predicated on the assumption that the full order solution estimation will likely have higher accuracy on QoIs than surface fitting for the complex, highly nonlinear problems that are of interest to industry.
Because industry CFD simulations are expensive to compute, our datasets are designed to evaluate the network on its ability to learn well from very few examples. A dataset consisting of 40 simulations, but with variations in 5-10 design parameters, would be a typical industry dataset. Therefore, we test each method on a scaled-down version of a similarly hard problem in order to facilitate faster network evaluation. The networks are given the challenge of learning from only three examples from each dataset.
In order to accomplish this task, we make a small data regime seem like a big data regime by learning the function that approximates the solution, for which there is a lot of training data within each example in space and time, as visualized in Figure 3. We learn a function that takes an individual point in time and space (of which there are hundreds of thousands to millions), as well as the design parameters, and returns the solution at that point.
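The reformulation above can be sketched as a dataset-construction step: flatten each solution snapshot into pointwise rows (x, t, μ) → u, so that three simulations yield tens of thousands of supervised examples. Grid sizes and the random stand-in solutions below are illustrative, not the paper's actual data.

```python
import numpy as np

def pointwise_dataset(solutions, xs, ts, mus):
    """solutions[i] has shape (len(ts), len(xs)) for design parameter mus[i]."""
    rows, targets = [], []
    for sol, mu in zip(solutions, mus):
        for i, t in enumerate(ts):
            for j, x in enumerate(xs):
                rows.append((x, t, mu))     # one input row per space-time point
                targets.append(sol[i, j])   # the solution value at that point
    return np.array(rows), np.array(targets)

xs  = np.linspace(0.0, 100.0, 256)   # spatial grid (illustrative)
ts  = np.linspace(0.0, 35.0, 100)    # time steps (illustrative)
mus = [1.0, 3.0, 4.0]                # three training design parameters
sols = [np.random.default_rng(k).random((len(ts), len(xs))) for k in range(3)]

X, y = pointwise_dataset(sols, xs, ts, mus)  # 3 * 100 * 256 = 76,800 rows
```

Three "examples" in the design-parameter sense thus become a large regression dataset in the pointwise sense, which is what makes the small data regime tractable.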
This paper focuses on a test case typically used to benchmark ROMs and evaluate their potential applicability to industry CFD problems—Burgers’ equation. A second shock bubble test case is also studied to show that the proposed methods scale up to high-dimensional problems; we expect the methods to compete in speed even more favorably at higher dimensions.
The industry CFD problems we ultimately intend to address involve expensive simulations on the order of days or weeks, while the toy CFD problems we study take only seconds to minutes to converge; the physical dynamics, however, are the same. Also, while the compressible Navier-Stokes equations used to model the shock bubble test case are more complex than Burgers’ equation, the solutions are far smoother than the Burgers’ equation solutions over most of the dataset—with a strong shock present only in the first few hundredths of a second of the simulation. While the simulation of this problem requires a more advanced solver, the solution is not a highly nonlinear function over most of the domain. Therefore, between the two test cases, Burgers’ solutions are more representative in important ways of the problem space the proposed methods are intended to address—the highly nonlinear solutions that take days or weeks for industry solvers to obtain.
3.1 Burgers’ Equation Test Case
The Burgers’ equation test case shown in Figure 4 is commonly used for evaluating ROMs. It is a one-dimensional application of an initial-boundary-value problem, which models the movement of a shockwave across a tube.
For this test case, the network will see three example solutions in the training set as a single design parameter, the fluid viscosity μ, is set to three different values (μ = 1.0, 3.0, 4.0). Once trained, the network will be asked to predict solutions at different unseen parameters for two examples in the test set (μ = 2.0, 5.0). Inputs and outputs are shown in Figure 5.
3.2 Shock Bubble Test Case
The shock bubble test case is a 2D application of the compressible Navier-Stokes equations, modeling a shockwave moving across a circular high density region representing a 2D bubble. In order to generate the dataset, parameters in the governing equations were modified within a CFD solver, including the Mach number (M), ratio of specific heats, and viscosity. Solutions were generated resolving 0.2 seconds in real time.
At the beginning of the simulation—0.000 seconds—the problem is defined by a high density “bubble” 5 times the density of the ambient fluid, as visualized in the left-most frames of Figure 6. A moving shockwave is initialized near the left side of the simulation, represented by a small region with high x-momentum, which will move to the right and cross over the bubble. A Mach number greater than 1.0 indicates that the shockwave itself moves at supersonic speed. The four state variables—density, x-momentum, y-momentum, and energy—are plotted at t = 0.000s through t = 0.195s in Figure 6 for a shockwave Mach number of 1.8.
The shockwave Mach number, a parameter initialized at the beginning of each simulation, which controls how fast the shockwave moves over the bubble, is varied to generate a training dataset with only three solutions (M = 1.4, 2.0, 5.0) and a test dataset with two solutions (M = 1.8, 3.0). A single snapshot of density at a random time point—0.180 seconds into the simulation—is shown in Figure 7 in order to visualize the variation in fluid behavior across the three points in the training set and the two points in the test set. Only the effect of Mach number on density is estimated and reported in these results.
The proposed architectures manage the fact that few examples of design parameters are given by learning the function that approximates the training set solutions. Variables are mapped from (x, t, μ) values, which are known, to a fluid state variable of interest, u, which is unknown. Once either network architecture is constructed, it is trained on the dataset. Predictions for unseen design parameters in the test set are then computed, and percent error is tabulated and compared to baselines.
4.1 Fully Connected Network
The first proposed architecture is the simple feed-forward network in Figure 8. The network performs a mapping directly from a batch of (x, t, μ) values to a batch of u values, where x represents space, t represents time, and u represents the output of the network. For the Burgers’ equation test case, u represents the velocity.
For the shock bubble test case, u may represent the density, x-momentum, y-momentum, or energy. Also, μ can represent any design variable or parameter in the governing equation that can be easily modified from one simulation to the next. For the Burgers’ equation test case, μ represents viscosity. For the shock bubble test case, the Mach number M is varied instead of μ. Also, the shock bubble test case is 2D, so the mapped variables are (x, y, t, M).
Because x and t can be any scalar value, this method also makes the architecture easy to apply to any time stepping procedure, grid type, or coarseness—avoiding the issue that CNNs only perform well on spatially uniform grids. This network architecture is biased to assume smooth functions, which is appropriate for many functions that define CFD solutions, but a stronger inductive bias is necessary for an optimal approach.
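A minimal sketch of this pointwise mapping: a small tanh network from (x, t, μ) to u. The layer sizes echo the "FCN 4 20" naming used in the results, but the weights here are random, untrained stand-ins; this illustrates the architecture only, not the trained model.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Random weights for a fully connected net with the given layer sizes."""
    rng = np.random.default_rng(seed)
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, inputs):
    h = inputs
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)   # smooth activations -> smooth-function bias
    W, b = params[-1]
    return h @ W + b             # linear output: the state variable u

net = init_mlp([3, 20, 20, 20, 20, 1])   # (x, t, mu) in, u out
batch = np.array([[0.5, 0.1, 2.0],
                  [0.9, 0.3, 5.0]])      # two arbitrary query points
u_pred = forward(net, batch)             # shape (2, 1)
```

Because each query is just a row of scalars, the same forward pass works for any grid, time stepping scheme, or coarseness.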
4.2 Cluster Network
The second proposed architecture is shown in Figure 9. The network is unique in that different clusters automatically identify regions in the perceived data that behave according to different functions (the function networks). A separate part of the architecture determines when and by how much to turn the other parts on or off (the context networks). Different loss functions are used for each part of the network to train them as either function or context networks.
The separation in intent between different networks is similar to a mixture of experts network (Shazeer et al., 2017) and similar in construction to the recent AI Physicist network (Wu & Tegmark, 2018). Because each cluster applies to a different function, and because each type of network has a separate role, the network is more interpretable than a fully connected network. This network design makes an additional inductive bias over the smooth functions assumed by the fully connected network—that fluid dynamics solutions can be broken down into a few regions, defined individually by very simple, smooth functions.
This network requires a specialized training procedure, with the function networks trained by a classification loss function and the context networks trained subsequently by a regression loss function. Using the labels in Figure 9, the training procedure for a network with two clusters is: (1) train the function networks using a classification loss function, defined as the minimum of the absolute errors of the function networks; (2) train the context networks, while holding the function network weights constant, using a regression loss function, defined as the absolute error of the combined output; and (3) optionally, train both networks simultaneously using the loss function from step 2. While an L1 norm loss is described here, the network also generalizes well using an L2 norm.
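The two losses in the procedure above can be sketched as follows for two clusters with scalar outputs f1, f2 (function networks) and gates c1, c2 (context networks). The variable names and the toy data are ours; the paper's exact formulation may differ in detail.

```python
import numpy as np

def function_loss(f1, f2, u):
    # Step (1): each point is "claimed" by whichever function fits it best,
    # so only the best-fitting function network is penalized per point.
    return np.mean(np.minimum(np.abs(f1 - u), np.abs(f2 - u)))

def context_loss(c1, c2, f1, f2, u):
    # Step (2): regression loss on the gated combination, with the
    # function network outputs held fixed.
    return np.mean(np.abs(c1 * f1 + c2 * f2 - u))

# Toy target: a step-like signal covered by two simple constant functions.
u  = np.array([0.0, 1.0, 1.0, 0.0])
f1 = np.array([0.0, 0.0, 0.0, 0.0])   # one function models the low region
f2 = np.array([1.0, 1.0, 1.0, 1.0])   # the other models the high region
c1 = np.array([1.0, 0.0, 0.0, 1.0])   # context gates pick the right function
c2 = 1.0 - c1
```

In this toy case both losses are zero: each point is matched exactly by one of the two simple functions, which is the local, simple-function decomposition the cluster network's inductive bias assumes.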
5 Numerical Results
5.1 Burgers’ Equation Results
5.1.1 Fully Connected Network Results
With the right hyper-parameter tuning and the implementation of exponential learning rate decay, the single fully connected network learns to fit the data and predict solution behavior at unseen parameters as shown in Figure 10. Note that the method not only interpolates, but extrapolates to some extent beyond parameters it has seen—this is a surprising result as we only expected neural networks to interpolate well, and the method is only required to interpolate well to be part of a design optimization procedure. The fact that the network extrapolates at all is a non-trivial bonus for the procedure.
5.1.2 Cluster Network Results
The cluster network model was also trained on only three examples, and tested on two examples. Overlays of CFD solutions with neural network predictions in Figure 11 show that the neural network predictions closely approximate the full order model solutions, interpolating and extrapolating better than the fully connected network.
5.1.3 Results Compared to Baselines
Results for all baseline and proposed methods are tabulated in Table 1 for the Burgers’ equation test case. ROMs, fully connected networks and cluster networks all predict solutions at unseen conditions well, and predict the QoIs calculated from the full solutions well. The fully connected network and cluster network also obtain predictions about two orders of magnitude faster than the full order model.
Some QoIs were constructed for more meaningful comparisons because in practice there are usually QoIs that can be computed from a simulation that are important to a design engineer in addition to the full solution. Therefore, four pseudo-QoIs representative of the kinds of quantities engineers might investigate were calculated from the test set: (1) a single random solution point in space and time, (2) the average value of the solution at a random fixed point in time, (3) the average value of the solution at a random fixed point in space, and (4) the average value of the solution over space and time. Similar QoIs were constructed for both Burgers’ equation and the shock bubble test case. The final tabulated QoI is the average of the four pseudo-QoIs.
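The four pseudo-QoIs can be sketched directly from a solution array u[t, x]. The stand-in solution and fixed random indices below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.random((100, 256))        # stand-in solution: 100 times x 256 points
ti, xi = rng.integers(100), rng.integers(256)   # random fixed time and point

qoi_point      = u[ti, xi]        # (1) single random point in space and time
qoi_fixed_time = u[ti, :].mean()  # (2) average at a random fixed time
qoi_fixed_x    = u[:, xi].mean()  # (3) average at a random fixed point in space
qoi_global     = u.mean()         # (4) average over space and time
qoi_final      = np.mean([qoi_point, qoi_fixed_time, qoi_fixed_x, qoi_global])
```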
The percent error is calculated by (1) scaling the values for all solutions in the dataset to between 0 and 1, (2) taking the absolute value of the difference between the predicted solutions and solutions computed by the full order model, and (3) multiplying by 100. This percent error metric provides a measure of the difference between the prediction and the ground truth, represented as a percent of the maximum value after the variables have been scaled between 0 and 1. The percent errors are the same as the scaled L1 errors typically investigated in machine learning applications, except multiplied by 100. The QoI percent error is calculated with the same procedure, except in step (2) we take the absolute value of the difference between the predicted QoI and the QoI computed by the full order model.
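The metric above can be sketched in a few lines, assuming (as the "scaled L1" description implies) a mean over points and dataset-wide min/max values for the scaling; the sample arrays are made up.

```python
import numpy as np

def percent_error(pred, truth, lo, hi):
    """lo, hi: min and max over all solutions in the dataset."""
    scale = hi - lo
    # Scale both to [0, 1], take the mean absolute difference, multiply by 100.
    return 100.0 * np.mean(np.abs((pred - lo) / scale - (truth - lo) / scale))

truth = np.array([0.0, 5.0, 10.0])
pred  = np.array([0.0, 5.5, 10.0])
err = percent_error(pred, truth, lo=0.0, hi=10.0)
```

The QoI percent error follows the same recipe with scalar QoI values in place of the full solution arrays.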
| Method | CPU (s) | % Error | QoI % Error |
| --- | --- | --- | --- |
| FCN 4 20 | 0.245 | 2.43 | 3.04 |
| ClusterNet 2 3 5 | 0.227 | 0.71 | 0.42 |
| QoI Poly Fit 3 | 0.001 | n/a | 7.28 |
| QoI Poly Fit 2 | 0.001 | n/a | 7.15 |
| QoI Poly Fit 1 | 0.001 | n/a | 8.42 |
| QoI Poly Fit 0 | 0.001 | n/a | 15.03 |
Table 1. Comparisons with baselines. Online CPU time for the test set, percent error on the test set, and percent error on selected quantities of interest from the test set are shown for different methods compared to the full order model. We use abbreviated method names, for example: Full order model (FOM), ROM with 100 basis vectors (ROM 100), Fully Connected Network with 4 layers and 20 nodes per layer (FCN 4 20), Cluster Network with 2 clusters, 3 layers, and 5 nodes per layer (ClusterNet 2 3 5), Surface fitting using a 1st order polynomial on a QoI (QoI Poly Fit 1). Surface-fitting baselines predict only QoIs, so no full-solution percent error is reported for them (n/a).
5.2 Shock Bubble Results
5.2.1 Visualization of Results
In order to visualize how well the network approximates the shock bubble data, a random time point near the end of the simulation, at 0.180 seconds, is taken, and the density is visualized on a contour plot in x and y for each of the initialized shockwave Mach numbers in the training and test sets, for both the CFD solutions (the truth) and the neural network predictions (the guesses). The contour plots in Figure 13 show how well these relatively small networks—both the fully connected network and the cluster network—predict the solutions at Mach numbers they never saw.
5.2.2 Results Compared to Baselines
The purpose of the shock bubble test case is to show that our methods scale well to higher-dimensional problems. Therefore, we compare our shock bubble results to the full order model and the QoI baselines in Table 2. Again, the fully connected network and the cluster network outperform the model-free surface fitting methods. In addition, these results show that, as predicted, the neural network-based approach has an even greater speedup over the full order model on higher-dimensional problems. Due to the intrusiveness of the ROM approach and the sophistication of the simulation software, we do not have a ROM baseline for this test case, but fortunately ROM results are not required to show that the method scales well.
| Method | CPU (s) | QoI % Error |
| --- | --- | --- |
| FCN 4 40 | 0.687 | 0.37 |
| ClusterNet 4 3 15 | 0.524 | 0.88 |
| QoI Poly Fit 3 | 0.001 | 5.73 |
| QoI Poly Fit 2 | 0.001 | 5.29 |
| QoI Poly Fit 1 | 0.001 | 1.78 |
| QoI Poly Fit 0 | 0.001 | 1.79 |
Both neural network architectures tested are more accurate than surface fitting, and are an order of magnitude faster with comparable accuracy to ROMs. Between the two networks, the fully connected network is a good baseline to use for comparison with other neural network architectures for a few reasons: (1) the architecture is simple, (2) the training procedure is simple and (3) the inductive bias assumes smooth functions. However, in practice the network requires extensive hyper-parameter search and tuning. The cluster network is more accurate than the fully connected network for the Burgers’ equation test case, and it provides additional benefits: (1) it uses fewer weights, (2) the forward pass is faster than the fully connected network and (3) its inductive bias is more applicable to highly nonlinear CFD problems—a smooth local bias. Ultimately, both network architectures perform well, and the percent error is under 1% on the two test cases for the most successful model—the cluster network.
7 Future Work
Future neural network architectures with inductive biases even better suited to fluid dynamics are likely to achieve more accurate results, and compete with model-based methods like ROMs across a wider range of problems. Additionally, demonstrations on high-dimensional industry test cases are necessary for our method to be adopted within industry for design and optimization. Finally, solution prediction is only one part of the optimization procedure. In order for these tools to ultimately be useful for design and optimization, they will need to be paired with a process for mapping out how quantities of interest vary over the design parameter space. Future work would require a point selection method like the greedy point selection method used in ROMs—a method for choosing the next points for running the expensive simulations, which balances exploration and exploitation. With the addition of a point selection method, we would have a complete procedure for engineers to use for aerodynamic design optimization.
Acknowledgments
This research is supported by the Department of Energy Computational Science Graduate Fellowship (DOE CSGF) program through US DOE Grant DE-FG02-97ER25308. At Lawrence Berkeley National Laboratory, Ann Almgren provided the AMReX full order model code used to generate the shock bubble test case and Matt Zahr provided the pyMORTestbed full order model and reduced order model code used to generate and evaluate the Burgers’ equation test case. We are grateful to Ann Almgren, Daniel Selsam, and Nathaniel Thomas for text revisions and insights.
- Amsallem & Farhat (2008) Amsallem, D. and Farhat, C. Interpolation method for adapting reduced-order models and application to aeroelasticity. AIAA journal, 46(7):1803–1813, 2008.
- Carlberg et al. (2011) Carlberg, K., Bou-Mosleh, C., and Farhat, C. Efficient non-linear model reduction via a least-squares Petrov–Galerkin projection and compressive tensor approximations. International Journal for Numerical Methods in Engineering, 86(2):155–181, 2011.
- Chen et al. (2018) Chen, T. Q., Rubanova, Y., Bettencourt, J., and Duvenaud, D. Neural ordinary differential equations. arXiv preprint arXiv:1806.07366, 2018.
- Chilenski et al. (2015) Chilenski, M., Greenwald, M., Marzouk, Y., Howard, N., White, A., Rice, J., and Walk, J. Improved profile fitting and quantification of uncertainty in experimental measurements of impurity transport coefficients using gaussian process regression. Nuclear Fusion, 55(2):023012, 2015.
- Guo et al. (2016) Guo, X., Li, W., and Iorio, F. Convolutional neural networks for steady flow approximation. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 481–490. ACM, 2016.
- Hesthaven & Ubbiali (2018) Hesthaven, J. S. and Ubbiali, S. Non-intrusive reduced order modeling of nonlinear problems using neural networks. Journal of Computational Physics, 363:55–78, 2018.
- Javadi et al. (2003) Javadi, A., Tan, T., and Zhang, M. Neural network for constitutive modelling in finite element analysis. Computer Assisted Mechanics and Engineering Sciences, 10(4):523–530, 2003.
- Lagaris et al. (1998) Lagaris, I. E., Likas, A., and Fotiadis, D. I. Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks, 9(5):987–1000, 1998.
- Ling et al. (2016) Ling, J., Kurzawski, A., and Templeton, J. Reynolds averaged turbulence modelling using deep neural networks with embedded invariance. Journal of Fluid Mechanics, 807:155–166, 2016.
- Mall & Chakraverty (2013) Mall, S. and Chakraverty, S. Comparison of artificial neural network architecture in solving ordinary differential equations. Advances in Artificial Neural Systems, 2013:12, 2013.
- Parisi et al. (2003) Parisi, D. R., Mariani, M. C., and Laborde, M. A. Solving differential equations with unsupervised neural networks. Chemical Engineering and Processing: Process Intensification, 42(8-9):715–721, 2003.
- Raissi & Karniadakis (2018) Raissi, M. and Karniadakis, G. E. Hidden physics models: Machine learning of nonlinear partial differential equations. Journal of Computational Physics, 357:125–141, 2018.
- Raissi et al. (2019) Raissi, M., Perdikaris, P., and Karniadakis, G. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.
- San & Maulik (2018) San, O. and Maulik, R. Neural network closures for nonlinear model order reduction. Advances in Computational Mathematics, pp. 1–34, 2018.
- Shazeer et al. (2017) Shazeer, N., Mirhoseini, A., Maziarz, K., Davis, A., Le, Q., Hinton, G., and Dean, J. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. arXiv preprint arXiv:1701.06538, 2017.
- Stoecklein et al. (2017) Stoecklein, D., Lore, K. G., Davies, M., Sarkar, S., and Ganapathysubramanian, B. Deep learning for flow sculpting: Insights into efficient learning using scientific simulation data. Scientific reports, 7:46368, 2017.
- Tompson et al. (2016) Tompson, J., Schlachter, K., Sprechmann, P., and Perlin, K. Accelerating eulerian fluid simulation with convolutional networks. arXiv preprint arXiv:1607.03597, 2016.
- Trizila et al. (2011) Trizila, P., Kang, C.-K., Aono, H., Shyy, W., and Visbal, M. Low-reynolds-number aerodynamics of a flapping rigid flat plate. AIAA journal, 49(4):806–823, 2011.
- Tsoulos et al. (2009) Tsoulos, I. G., Gavrilis, D., and Glavas, E. Solving differential equations with constructed neural networks. Neurocomputing, 72(10-12):2385–2391, 2009.
- Wu & Tegmark (2018) Wu, T. and Tegmark, M. Toward an ai physicist for unsupervised learning. arXiv preprint arXiv:1810.10525, 2018.
- Zahr et al. (2018) Zahr, M. J., Carlberg, K. T., and Kouri, D. P. An efficient, globally convergent method for optimization under uncertainty using adaptive model reduction and sparse grids. arXiv preprint arXiv:1811.00177, 2018.