In the last 15 years, deep learning in the form of deep neural networks (NNs) has been used very effectively in diverse applications, such as computer vision and natural language processing. Despite the remarkable success in these and related areas, deep learning has not yet been widely used in the field of scientific computing. More recently, however, solving partial differential equations (PDEs) via deep learning has emerged as a potentially new sub-field under the name of Scientific Machine Learning (SciML).
To solve a PDE via deep learning, a key step is to constrain the neural network to minimize the PDE residual, and several approaches have been proposed to accomplish this. Compared to traditional mesh-based methods, such as the finite difference method (FDM) and the finite element method (FEM), deep learning could be a mesh-free approach by taking advantage of automatic differentiation, and could break the curse of dimensionality [28, 12]. Among these approaches, some can only be applied to particular types of problems, such as image-like input domains [16, 21, 39] or parabolic PDEs [4, 13]. Some researchers adopt the variational form of PDEs and minimize the corresponding energy functional [10, 14]. However, not all PDEs can be derived from a known functional, and thus Galerkin-type projections have also been considered. Alternatively, one could use the PDE in strong form directly [9, 33, 18, 19, 5, 32, 30]; in this form, automatic differentiation can be used directly to avoid truncation errors and the numerical quadrature errors of variational forms. This strong-form approach was introduced in [30], coining the term physics-informed neural networks (PINNs). An attractive feature of PINNs is that they can be used to solve inverse problems with minimal changes to the code for forward problems [30, 31]. In addition, PINNs have been further extended to solve integro-differential equations (IDEs), fractional differential equations (FDEs) [25], and stochastic differential equations (SDEs) [38, 36, 24, 37].
In this paper, we present various PINN algorithms implemented in a Python library DeepXDE (source code is published under the Apache License, Version 2.0 on GitHub: https://github.com/lululxvi/deepxde), which is designed to serve both as an educational tool to be used in the classroom and as a research tool for solving problems in computational science and engineering (CSE). DeepXDE can be used to solve multi-physics problems, and supports complex-geometry domains based on the technique of constructive solid geometry (CSG), hence avoiding tedious and time-consuming computational geometry tasks. By using DeepXDE, time-dependent PDEs can be solved as easily as steady states by only defining the initial conditions. In addition to the main workflow of DeepXDE, users can readily monitor and modify the solution process via callback functions, e.g., monitoring the Fourier spectrum of the neural network solution, which can reveal the learning mode of the NN (Fig. 2). Last but not least, DeepXDE is designed to make the user code stay compact and manageable, resembling closely the mathematical formulation.
The paper is organized as follows. In Section 2, after briefly introducing deep neural networks, we present the algorithm, approximation theory, and error analysis of PINNs, and make a comparison between PINNs and FEM. We then discuss how to use PINNs to solve integro-differential equations and inverse problems. In addition, we propose the residual-based adaptive refinement (RAR) method to improve the training efficiency of PINNs. In Section 3, we introduce the usage of our library, DeepXDE, and its customizability. In Section 4, we demonstrate the capability of PINNs and the user-friendliness of DeepXDE on five different examples. Finally, we conclude the paper in Section 5.
2 Algorithm and theory of physics-informed neural networks
In this section, we first provide a brief overview of deep neural networks, and present the algorithm and theory of PINNs for solving PDEs. We then make a comparison between PINNs and FEM, and discuss how to use PINNs to solve integro-differential equations and inverse problems. Next we propose RAR, an efficient way to select the residual points adaptively during the training process.
2.1 Deep neural networks
Mathematically, a deep neural network is a particular choice of a compositional function. The simplest neural network is the feed-forward neural network (FNN), also called multilayer perceptron (MLP), which applies linear and nonlinear transformations to the inputs recursively. Although many different types of neural networks have been developed in the past decades, such as the convolutional neural network and the recurrent neural network, in this paper we consider the FNN, which is sufficient for most PDE problems, and the residual neural network (ResNet), which is easier to train for deep networks. However, it is straightforward to employ other types of neural networks.
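Let $\mathcal{N}^L(\mathbf{x}): \mathbb{R}^{d_{\text{in}}} \to \mathbb{R}^{d_{\text{out}}}$ be an $L$-layer neural network, or an $(L-1)$-hidden layer neural network, with $N_\ell$ neurons in the $\ell$-th layer ($N_0 = d_{\text{in}}$, $N_L = d_{\text{out}}$). Let us denote the weight matrix and bias vector in the $\ell$-th layer by $\mathbf{W}^\ell \in \mathbb{R}^{N_\ell \times N_{\ell-1}}$ and $\mathbf{b}^\ell \in \mathbb{R}^{N_\ell}$, respectively. Given a nonlinear activation function $\sigma$, which is applied element-wise, the FNN is recursively defined as follows:
$$
\begin{aligned}
\text{input layer:} \quad & \mathcal{N}^0(\mathbf{x}) = \mathbf{x} \in \mathbb{R}^{d_{\text{in}}}, \\
\text{hidden layers:} \quad & \mathcal{N}^\ell(\mathbf{x}) = \sigma\left(\mathbf{W}^\ell \mathcal{N}^{\ell-1}(\mathbf{x}) + \mathbf{b}^\ell\right) \in \mathbb{R}^{N_\ell}, \quad 1 \le \ell \le L-1, \\
\text{output layer:} \quad & \mathcal{N}^L(\mathbf{x}) = \mathbf{W}^L \mathcal{N}^{L-1}(\mathbf{x}) + \mathbf{b}^L \in \mathbb{R}^{d_{\text{out}}};
\end{aligned}
$$
see also a visualization of a neural network in Fig. 1. Commonly used activation functions include the logistic sigmoid $1/(1+e^{-x})$ and the hyperbolic tangent ($\tanh$).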
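The recursive definition of the FNN can be made concrete with a minimal sketch in plain Python. The helper names `init_params` and `fnn_forward`, the Gaussian initialization, and the layer sizes are illustrative assumptions, not part of DeepXDE.

```python
import math
import random


def init_params(layer_sizes, seed=0):
    """Random weights (N_l x N_{l-1}) and zero biases for layer sizes [N_0, ..., N_L]."""
    rng = random.Random(seed)
    weights, biases = [], []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)  # assumed initialization scale
        weights.append([[rng.gauss(0.0, scale) for _ in range(n_in)]
                        for _ in range(n_out)])
        biases.append([0.0] * n_out)
    return weights, biases


def fnn_forward(x, weights, biases, activation=math.tanh):
    """Apply activation(W a + b) in every hidden layer and a purely affine output layer."""
    a = list(x)                      # input layer: the network input itself
    last = len(weights) - 1
    for ell, (W, b) in enumerate(zip(weights, biases)):
        z = [sum(w * v for w, v in zip(row, a)) + b_i for row, b_i in zip(W, b)]
        a = z if ell == last else [activation(v) for v in z]
    return a


# Two inputs, two hidden layers of width 8, one output.
weights, biases = init_params([2, 8, 8, 1])
y = fnn_forward([0.3, -0.7], weights, biases)
```

Only the hidden layers apply the activation; the last layer is affine, so the output is not restricted to the range of the activation function.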
2.2 Physics-informed neural networks for solving PDEs
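We consider the following PDE parameterized by $\boldsymbol{\lambda}$ for the solution $u(\mathbf{x})$ with $\mathbf{x} = (x_1, \dots, x_d)$ defined on a domain $\Omega \subset \mathbb{R}^d$:
$$
f\left(\mathbf{x}; \frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_d}; \frac{\partial^2 u}{\partial x_1 \partial x_1}, \dots, \frac{\partial^2 u}{\partial x_1 \partial x_d}; \dots; \boldsymbol{\lambda}\right) = 0, \quad \mathbf{x} \in \Omega, \tag{1}
$$
with suitable boundary conditions
$$
\mathcal{B}(u, \mathbf{x}) = 0 \quad \text{on } \partial\Omega,
$$
where $\mathcal{B}(u, \mathbf{x})$ could be Dirichlet, Neumann, Robin, or periodic boundary conditions. For time-dependent problems, we consider time $t$ as a special component of $\mathbf{x}$, and $\Omega$ contains the temporal domain. The initial condition can be simply treated as a special type of Dirichlet boundary condition on the spatio-temporal domain.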
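The algorithm of PINN [19, 30] is shown in Procedure 1, and visually in the schematic of Fig. 1 for solving a diffusion equation with mixed boundary conditions imposed on different parts of the boundary. We explain each step as follows. In a PINN, we first construct a neural network $\hat{u}(\mathbf{x}; \boldsymbol{\theta})$ as a surrogate of the solution $u(\mathbf{x})$, which takes the input $\mathbf{x}$ and outputs a vector with the same dimension as $u$. Here, $\boldsymbol{\theta} = \{\mathbf{W}^\ell, \mathbf{b}^\ell\}_{1 \le \ell \le L}$ is the set of all weight matrices and bias vectors in the neural network $\hat{u}$. One advantage of choosing neural networks as the surrogate of $u$ is that we can take the derivatives of $\hat{u}$ with respect to its input $\mathbf{x}$ by automatic differentiation (AD), which is integrated in machine learning packages such as TensorFlow [1].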
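In the next step, we need to restrict the neural network $\hat{u}$ to satisfy the physics imposed by the PDE and boundary conditions. It is hard to restrict $\hat{u}$ in the whole domain, so instead we restrict $\hat{u}$ on some scattered points, i.e., the training data $\mathcal{T} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_{|\mathcal{T}|}\}$ of size $|\mathcal{T}|$. In addition, $\mathcal{T}$ is comprised of two sets $\mathcal{T}_f \subset \Omega$ and $\mathcal{T}_b \subset \partial\Omega$, which are the points in the domain and on the boundary, respectively. We refer to $\mathcal{T}_f$ and $\mathcal{T}_b$ as the sets of “residual points”.
To measure the discrepancy between the neural network $\hat{u}$ and the constraints, we consider the loss function defined as the weighted summation of the $L^2$ norm of residuals for the equation and boundary conditions:
$$
\mathcal{L}(\boldsymbol{\theta}; \mathcal{T}) = w_f \mathcal{L}_f(\boldsymbol{\theta}; \mathcal{T}_f) + w_b \mathcal{L}_b(\boldsymbol{\theta}; \mathcal{T}_b), \tag{2}
$$
where
$$
\mathcal{L}_f(\boldsymbol{\theta}; \mathcal{T}_f) = \frac{1}{|\mathcal{T}_f|} \sum_{\mathbf{x} \in \mathcal{T}_f} \left\| f\left(\mathbf{x}; \frac{\partial \hat{u}}{\partial x_1}, \dots, \frac{\partial \hat{u}}{\partial x_d}; \frac{\partial^2 \hat{u}}{\partial x_1 \partial x_1}, \dots, \frac{\partial^2 \hat{u}}{\partial x_1 \partial x_d}; \dots; \boldsymbol{\lambda}\right) \right\|_2^2,
$$
$$
\mathcal{L}_b(\boldsymbol{\theta}; \mathcal{T}_b) = \frac{1}{|\mathcal{T}_b|} \sum_{\mathbf{x} \in \mathcal{T}_b} \left\| \mathcal{B}(\hat{u}, \mathbf{x}) \right\|_2^2,
$$
and $w_f$ and $w_b$ are the weights. The loss involves derivatives, such as the partial derivative $\partial \hat{u} / \partial x_1$ or the normal derivative at the boundary $\partial \hat{u} / \partial \mathbf{n} = \nabla \hat{u} \cdot \mathbf{n}$, which are handled via AD.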
In the last step, the procedure of searching for a good $\boldsymbol{\theta}$ by minimizing the loss $\mathcal{L}(\boldsymbol{\theta}; \mathcal{T})$ is called “training”. Considering the fact that the loss is highly nonlinear and non-convex with respect to $\boldsymbol{\theta}$, we usually minimize the loss function by gradient-based optimizers, such as gradient descent, Adam [17], and L-BFGS [8].
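To make the structure of the loss concrete, the following sketch evaluates it for a one-dimensional model problem. Everything specific here is an assumption made for illustration: the problem $-u'' = \pi^2 \sin(\pi x)$ on $[0, 1]$ with zero Dirichlet conditions, the one-hidden-layer tanh surrogate, and the function names. The second derivative is written out by hand as a stand-in for AD, which a real PINN implementation would use instead.

```python
import math
import random


def u_hat(x, theta):
    """One-hidden-layer tanh surrogate: sum_j a_j * tanh(w_j * x + b_j) + c."""
    w, b, a, c = theta
    return sum(aj * math.tanh(wj * x + bj) for wj, bj, aj in zip(w, b, a)) + c


def u_hat_xx(x, theta):
    """Second derivative of u_hat in x, derived by hand (stand-in for AD)."""
    w, b, a, _ = theta
    total = 0.0
    for wj, bj, aj in zip(w, b, a):
        t = math.tanh(wj * x + bj)
        # d^2/dz^2 tanh(z) = -2 tanh(z) (1 - tanh(z)^2); the chain rule adds w_j^2.
        total += aj * wj * wj * (-2.0 * t * (1.0 - t * t))
    return total


def source(x):
    """Right-hand side of the assumed model problem -u'' = pi^2 sin(pi x)."""
    return math.pi ** 2 * math.sin(math.pi * x)


def pinn_loss(theta, interior_pts, boundary_pts, w_f=1.0, w_b=1.0):
    """Weighted sum of the mean squared PDE residual and mean squared boundary residual."""
    loss_f = sum((-u_hat_xx(x, theta) - source(x)) ** 2
                 for x in interior_pts) / len(interior_pts)
    loss_b = sum(u_hat(x, theta) ** 2 for x in boundary_pts) / len(boundary_pts)
    return w_f * loss_f + w_b * loss_b


rng = random.Random(0)
interior = [rng.random() for _ in range(64)]          # residual points in the domain
theta = ([rng.gauss(0.0, 1.0) for _ in range(10)],    # hidden weights
         [rng.gauss(0.0, 1.0) for _ in range(10)],    # hidden biases
         [rng.gauss(0.0, 1.0) for _ in range(10)],    # output weights
         0.0)                                         # output bias
initial_loss = pinn_loss(theta, interior, [0.0, 1.0])
```

Training would then pass this scalar loss to a gradient-based optimizer; the sketch stops at the loss evaluation because the optimizer choice does not change how the loss is assembled.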
In the algorithm of PINN introduced above, we enforce soft constraints of boundary/initial conditions through the loss $\mathcal{L}_b$. This approach can be used for complex domains and any type of boundary conditions. On the other hand, it is possible to enforce hard constraints for simple cases. For example, when the boundary condition is $u(0) = u(1) = 0$ with $\Omega = [0, 1]$, we can simply choose the surrogate model as $\hat{u}(x) = x(x-1)\mathcal{N}(x)$ to satisfy the boundary condition automatically, where $\mathcal{N}(x)$ is a neural network.
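A hard constraint can be illustrated in a few lines. The sketch assumes homogeneous Dirichlet conditions $u(0) = u(1) = 0$ on $[0, 1]$ and multiplies an arbitrary network output by $x(x-1)$; `raw_network` is a hypothetical stand-in for a trained network.

```python
import math


def raw_network(x):
    """Stand-in for an unconstrained network output; any smooth function works here."""
    return math.tanh(1.7 * x - 0.4) + 0.9


def hard_constrained_surrogate(x, network=raw_network):
    """x * (x - 1) * network(x) vanishes at x = 0 and x = 1 for any network."""
    return x * (x - 1.0) * network(x)


left = hard_constrained_surrogate(0.0)
right = hard_constrained_surrogate(1.0)
middle = hard_constrained_surrogate(0.5)
```

Because the boundary values are fixed by construction, the boundary term can be dropped from the loss for such a surrogate, leaving only the PDE residual to minimize.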
We note that it is very flexible to choose the residual points $\mathcal{T}$, and here we provide three possible strategies:
(1) We specify the residual points at the beginning of training, which could be grid points on a lattice or random points, and never change them during the training process.
(2) In each optimization iteration, we randomly select different residual points.
(3) We improve the location of the residual points adaptively during training, e.g., the method proposed in Section 2.7.
When the number of residual points is very large, it is computationally expensive to calculate the loss and gradient in every iteration. Instead of using all residual points, we can split the residual points into small batches, and in each iteration we only use one batch to calculate the loss and update model parameters, which is called mini-batch gradient descent. The aforementioned strategy (2), i.e., re-sampling in each step, is a special case of mini-batch gradient descent by choosing $\mathcal{T}_f = \Omega$ and $\mathcal{T}_b = \partial\Omega$ with $|\mathcal{T}| = \infty$.
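The difference between batching a fixed pool of residual points and re-sampling at every iteration can be sketched as follows. The one-dimensional domain $[0, 1]$, the batch size, and the function names are illustrative assumptions.

```python
import random


def fixed_minibatches(points, batch_size, rng):
    """Shuffle a fixed pool of residual points and yield disjoint batches (one epoch)."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]


def resampled_batch(batch_size, rng, low=0.0, high=1.0):
    """Draw a fresh batch from the whole interval [low, high] for a single iteration."""
    return [rng.uniform(low, high) for _ in range(batch_size)]


rng = random.Random(0)
pool = [i / 99 for i in range(100)]                 # 100 grid points on [0, 1]
batches = list(fixed_minibatches(pool, 32, rng))    # batching a fixed pool
fresh = resampled_batch(32, rng)                    # re-sampling in each step
```

With a fixed pool every point is revisited once per epoch, whereas re-sampling never reuses a batch, which corresponds to drawing batches from an unbounded pool.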
Recent studies show that for function approximation, neural networks learn target functions from low to high frequencies [29, 35], but here we show that the learning mode of PINNs is different due to the existence of high-order derivatives. For example, when we approximate the function $f(x) = \sum_{k=1}^{5} \sin(2kx)/(2k)$ in $[-\pi, \pi]$ by a NN, the function is learned from low to high frequency (Fig. 2A). However, when we employ a PINN to solve the Poisson equation $-f_{xx} = \sum_{k=1}^{5} 2k \sin(2kx)$ with zero boundary conditions in the same domain, all frequencies are learned almost simultaneously (Fig. 2B). Interestingly, by comparing Fig. 2A and Fig. 2B we can see that, at least in this case, solving the PDE using a PINN is faster than approximating a function using a NN. We can monitor this training process using the callback functions in our library DeepXDE, as discussed later.
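One way to observe which frequencies have been learned is to sample the current approximation on a uniform grid and inspect the amplitude of each Fourier mode. The sketch below does this with a plain discrete Fourier transform; the target, a sum of five sine modes on $[-\pi, \pi]$, and the grid size are assumptions chosen for illustration, and this is not how DeepXDE's own callbacks are implemented.

```python
import cmath
import math


def dft_amplitudes(samples):
    """Amplitude of each nonnegative frequency below Nyquist for uniformly spaced samples."""
    n = len(samples)
    return [2.0 / n * abs(sum(s * cmath.exp(-2j * math.pi * k * j / n)
                              for j, s in enumerate(samples)))
            for k in range(n // 2)]


def target(x):
    """Assumed multi-frequency target: sum over k = 1..5 of sin(2 k x) / (2 k)."""
    return sum(math.sin(2 * k * x) / (2 * k) for k in range(1, 6))


n_samples = 64
grid = [-math.pi + 2.0 * math.pi * j / n_samples for j in range(n_samples)]
samples = [target(x) for x in grid]
amps = dft_amplitudes(samples)   # amps[m] is the amplitude of the mode sin(m x)
```

Applying the same function to the network output after each epoch, instead of to the exact target, gives one amplitude curve per frequency, which is the kind of quantity compared between function approximation and PDE solving.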
2.3 Approximation theory and error analysis for PINNs
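One fundamental question related to PINNs is whether there exists a neural network satisfying both the PDE equation and the boundary conditions, i.e., whether there exists a neural network that can simultaneously and uniformly approximate a function and its partial derivatives. To address this question, we first introduce some notation. Let $\mathbb{Z}_+^d$ be the set of $d$-dimensional nonnegative integers. For $\mathbf{m} = (m_1, \dots, m_d) \in \mathbb{Z}_+^d$, we set $|\mathbf{m}| := m_1 + \dots + m_d$, and
$$
D^{\mathbf{m}} := \frac{\partial^{|\mathbf{m}|}}{\partial x_1^{m_1} \cdots \partial x_d^{m_d}}.
$$
We say $f \in C^{\mathbf{m}}(\mathbb{R}^d)$ if $D^{\mathbf{k}} f \in C(\mathbb{R}^d)$ for all $\mathbf{k} \le \mathbf{m}$, $\mathbf{k} \in \mathbb{Z}_+^d$. Then, we recall the following theorem of derivative approximation using single hidden layer neural networks due to Pinkus [27].
Theorem 1. Let $\mathbf{m}^i \in \mathbb{Z}_+^d$, $i = 1, \dots, s$, and set $m = \max_{i=1,\dots,s} |\mathbf{m}^i|$. Assume $\sigma \in C^m(\mathbb{R})$ and $\sigma$ is not a polynomial. Then the space of single hidden layer neural nets
$$
\mathcal{M}(\sigma) := \operatorname{span}\{\sigma(\mathbf{w} \cdot \mathbf{x} + b) : \mathbf{w} \in \mathbb{R}^d, b \in \mathbb{R}\}
$$
is dense in
$$
C^{\mathbf{m}^1, \dots, \mathbf{m}^s}(\mathbb{R}^d) := \bigcap_{i=1}^{s} C^{\mathbf{m}^i}(\mathbb{R}^d),
$$
i.e., for any $f \in C^{\mathbf{m}^1, \dots, \mathbf{m}^s}(\mathbb{R}^d)$, any compact $K \subset \mathbb{R}^d$, and any $\varepsilon > 0$, there exists a $g \in \mathcal{M}(\sigma)$ satisfying
$$
\max_{\mathbf{x} \in K} \left| D^{\mathbf{k}} f(\mathbf{x}) - D^{\mathbf{k}} g(\mathbf{x}) \right| < \varepsilon,
$$
for all $\mathbf{k} \in \mathbb{Z}_+^d$ for which $\mathbf{k} \le \mathbf{m}^i$ for some $i$.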
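Theorem 1 shows that feed-forward neural nets with enough neurons can simultaneously and uniformly approximate any function and its partial derivatives. However, neural networks in practice have limited size. Let $\mathcal{F}$ denote the family of all the functions that can be represented by our chosen neural network architecture. The solution $u$ is unlikely to belong to the family $\mathcal{F}$, and we define $u_{\mathcal{F}} = \arg\min_{f \in \mathcal{F}} \|f - u\|$ as the best function in $\mathcal{F}$ close to $u$ (Fig. 3). Because we only train the neural network on the training set $\mathcal{T}$, we define $u_{\mathcal{T}} = \arg\min_{f \in \mathcal{F}} \mathcal{L}(f; \mathcal{T})$ as the neural network whose loss is at global minimum. For simplicity, we assume that $u$, $u_{\mathcal{F}}$ and $u_{\mathcal{T}}$ are well defined and unique. Finding $u_{\mathcal{T}}$ by minimizing the loss is often computationally intractable [6], and our optimizer returns an approximate solution $\tilde{u}_{\mathcal{T}}$.
We can then decompose the total error $\mathcal{E}$ as [7]
$$
\mathcal{E} := \|\tilde{u}_{\mathcal{T}} - u\| \le \underbrace{\|\tilde{u}_{\mathcal{T}} - u_{\mathcal{T}}\|}_{\mathcal{E}_{\text{opt}}} + \underbrace{\|u_{\mathcal{T}} - u_{\mathcal{F}}\|}_{\mathcal{E}_{\text{gen}}} + \underbrace{\|u_{\mathcal{F}} - u\|}_{\mathcal{E}_{\text{app}}}.
$$
The approximation error $\mathcal{E}_{\text{app}}$ measures how closely $u_{\mathcal{F}}$ can approximate $u$. The generalization error $\mathcal{E}_{\text{gen}}$ is determined by the number/locations of residual points in $\mathcal{T}$ and the capacity of the family $\mathcal{F}$. Neural networks of larger size have smaller approximation errors but could lead to higher generalization errors, which is called the bias-variance tradeoff. Overfitting occurs when the generalization error dominates. In addition, the optimization error $\mathcal{E}_{\text{opt}}$ stems from the loss function complexity and the optimization setup, such as learning rate and number of iterations.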
2.4 Comparison between PINNs and FEM
To further explain the ideas of PINNs and to help those with the knowledge of FEM understand PINNs more easily, we make a comparison between PINNs and FEM point by point (Table 1):
In FEM we approximate the solution by a piecewise polynomial with point values to be determined, while in PINNs we construct a neural network as the surrogate model parameterized by weights and biases.
FEM typically requires mesh generation, while PINN is totally mesh-free, and we can use either a grid or random points.
FEM converts a PDE to an algebraic system, using the stiffness matrix and mass matrix, while PINN embeds the PDE and boundary conditions into the loss function.
In the last step, the algebraic system in FEM is solved exactly by a linear solver, but the network in PINN is learned by a gradient-based optimizer.
At a more fundamental level, PINNs provide a nonlinear approximation to the function and its derivatives, whereas FEM represents a linear approximation.
Table 1: Comparison between PINN and FEM.
| ||PINN||FEM|
|Basis function||Neural network (nonlinear)||Piecewise polynomial (linear)|
|Parameters||Weights and biases||Point values|
|Training points||Scattered points (mesh-free)||Mesh points|
|PDE embedding||Loss function||Algebraic system|
|Parameter solver||Gradient-based optimizer||Linear solver|
|Errors||$\mathcal{E}_{\text{app}}$, $\mathcal{E}_{\text{gen}}$, and $\mathcal{E}_{\text{opt}}$ (Section 2.3)||Approximation/quadrature errors|
2.5 PINNs for solving integro-differential equations
When solving integro-differential equations (IDEs), we still employ the automatic differentiation technique to analytically derive the integer-order derivatives, while we approximate integral operators numerically using classical methods, such as Gaussian quadrature (Fig. 4). Therefore, we introduce a fourth error component, the discretization error $\mathcal{E}_{\text{dis}}$, due to the approximation of the integral by Gaussian quadrature.
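For example, when solving
$$
\frac{dy}{dx} + y(x) = \int_0^x e^{t-x} y(t)\, dt,
$$
we first use Gaussian quadrature of degree $n$ to approximate the integral
$$
\int_0^x e^{t-x} y(t)\, dt \approx \sum_{i=1}^{n} w_i e^{t_i(x)-x} y(t_i(x)),
$$
and then we use a PINN to solve the following PDE instead of the original equation
$$
\frac{dy}{dx} + y(x) \approx \sum_{i=1}^{n} w_i e^{t_i(x)-x} y(t_i(x)).
$$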
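The quadrature step can be sketched with a fixed three-point Gauss-Legendre rule. The Volterra equation $dy/dx + y = \int_0^x e^{t-x} y(t)\,dt$ with $y(0) = 1$ and the closed-form solution $y(x) = e^{-x}\cosh x$ are assumed here for illustration (one can check by substitution that this function satisfies the equation); a practical computation would typically use a higher degree and a network output in place of `y_exact`.

```python
import math

# Three-point Gauss-Legendre nodes and weights on the reference interval [-1, 1].
GL_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GL_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def volterra_integral(y, x):
    """Approximate the integral of exp(t - x) * y(t) over [0, x] by mapping the rule to [0, x]."""
    half = 0.5 * x
    total = 0.0
    for xi, w in zip(GL_NODES, GL_WEIGHTS):
        t = half * (xi + 1.0)                     # node location depends on x
        total += half * w * math.exp(t - x) * y(t)
    return total


def ide_residual(y, dy, x):
    """Residual of dy/dx + y = integral, with the integral replaced by quadrature."""
    return dy(x) + y(x) - volterra_integral(y, x)


def y_exact(t):
    """Assumed closed-form solution exp(-t) * cosh(t)."""
    return math.exp(-t) * math.cosh(t)


def dy_exact(t):
    """Derivative of the assumed solution: -exp(-2 t)."""
    return -math.exp(-2.0 * t)


residual_at_one = ide_residual(y_exact, dy_exact, 1.0)
```

The residual of the exact solution is not exactly zero: what remains is the quadrature error, which is the discretization error added to the error budget of a PINN for IDEs.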
2.6 PINNs for solving inverse problems
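In inverse problems, there are some unknown parameters $\boldsymbol{\lambda}$ in Eq. 1, but we have some extra information on some points $\mathcal{T}_i \subset \Omega$ besides the differential equation and boundary conditions:
$$
\mathcal{I}(u, \mathbf{x}) = 0 \quad \text{for } \mathbf{x} \in \mathcal{T}_i.
$$
PINNs solve inverse problems as easily as forward problems. The only difference between solving forward and inverse problems is that we add an extra loss term to Eq. 2:
$$
\mathcal{L}(\boldsymbol{\theta}, \boldsymbol{\lambda}; \mathcal{T}) = w_f \mathcal{L}_f(\boldsymbol{\theta}, \boldsymbol{\lambda}; \mathcal{T}_f) + w_b \mathcal{L}_b(\boldsymbol{\theta}, \boldsymbol{\lambda}; \mathcal{T}_b) + w_i \mathcal{L}_i(\boldsymbol{\theta}, \boldsymbol{\lambda}; \mathcal{T}_i),
$$
where
$$
\mathcal{L}_i(\boldsymbol{\theta}, \boldsymbol{\lambda}; \mathcal{T}_i) = \frac{1}{|\mathcal{T}_i|} \sum_{\mathbf{x} \in \mathcal{T}_i} \left\| \mathcal{I}(\hat{u}, \mathbf{x}) \right\|_2^2.
$$
We then optimize $\boldsymbol{\theta}$ and $\boldsymbol{\lambda}$ together, and our solution is $\boldsymbol{\theta}^*, \boldsymbol{\lambda}^* = \arg\min_{\boldsymbol{\theta}, \boldsymbol{\lambda}} \mathcal{L}(\boldsymbol{\theta}, \boldsymbol{\lambda}; \mathcal{T})$.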
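The joint optimization can be illustrated on a deliberately tiny toy problem. All specifics are assumptions for illustration: the ODE $du/dt + \lambda u = 0$ with unknown rate $\lambda$, noise-free observations generated with $\lambda = 2$, and a one-parameter surrogate $e^{-at}$ standing in for a neural network (it satisfies $u(0) = 1$ by construction, so no boundary term is needed). Gradients are taken by central differences rather than AD.

```python
import math

T_RESIDUAL = [i / 19 for i in range(20)]   # points where the equation residual is enforced
T_OBSERVED = [i / 9 for i in range(10)]    # points where observations are available
TRUE_RATE = 2.0                            # used only to generate the synthetic data
OBSERVATIONS = [math.exp(-TRUE_RATE * t) for t in T_OBSERVED]


def loss(a, lam, w_f=1.0, w_i=1.0):
    """Equation loss plus data loss for the toy surrogate exp(-a t)."""
    # Residual of du/dt + lam * u on the surrogate: (lam - a) * exp(-a t).
    loss_f = sum(((lam - a) * math.exp(-a * t)) ** 2 for t in T_RESIDUAL) / len(T_RESIDUAL)
    loss_i = sum((math.exp(-a * t) - u) ** 2
                 for t, u in zip(T_OBSERVED, OBSERVATIONS)) / len(T_OBSERVED)
    return w_f * loss_f + w_i * loss_i


def fit(a=1.0, lam=0.5, lr=0.2, steps=5000, h=1e-6):
    """Plain gradient descent on the surrogate parameter and the unknown rate together."""
    for _ in range(steps):
        grad_a = (loss(a + h, lam) - loss(a - h, lam)) / (2.0 * h)
        grad_lam = (loss(a, lam + h) - loss(a, lam - h)) / (2.0 * h)
        a, lam = a - lr * grad_a, lam - lr * grad_lam
    return a, lam


a_fit, lam_fit = fit()
```

The data term pulls the surrogate toward the observations, while the equation term ties the unknown rate to the surrogate, so the two are identified together; this coupling is the whole difference from the forward problem.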
2.7 Residual-based adaptive refinement (RAR)
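As we discussed in Section 2.2, the residual points $\mathcal{T}$ are usually randomly distributed in the domain. This works well for most cases, but it may not be efficient for certain PDEs that exhibit solutions with steep gradients. Taking the Burgers equation as an example, intuitively we should put more points near the sharp front to capture the discontinuity well. However, it is challenging, in general, to design a good distribution of residual points for problems whose solution is unknown. To overcome this challenge, we propose a residual-based adaptive refinement (RAR) method to improve the distribution of residual points during the training process (Procedure 2), conceptually similar to FEM refinement methods [2]. The idea of RAR is that we add more residual points in the locations where the PDE residual is large, and we repeat adding points until the mean residual
$$
\mathcal{E}_r = \frac{1}{V} \int_\Omega \left\| f\left(\mathbf{x}; \frac{\partial \hat{u}}{\partial x_1}, \dots, \frac{\partial \hat{u}}{\partial x_d}; \frac{\partial^2 \hat{u}}{\partial x_1 \partial x_1}, \dots, \frac{\partial^2 \hat{u}}{\partial x_1 \partial x_d}; \dots; \boldsymbol{\lambda}\right) \right\| d\mathbf{x}
$$
is smaller than a threshold $\mathcal{E}_0$, where $V$ is the volume of $\Omega$.
Procedure 2: RAR for improving the distribution of residual points.
Step 1: Select the initial residual points $\mathcal{T}$, and train the neural network for a limited number of iterations.
Step 2: Estimate the mean PDE residual $\mathcal{E}_r$ by Monte Carlo integration, i.e., by the average of the residual values at a set of randomly sampled dense locations $\mathcal{S}$.
Step 3: Stop if $\mathcal{E}_r < \mathcal{E}_0$. Otherwise, add $m$ new points with the largest residuals in $\mathcal{S}$ to $\mathcal{T}$, retrain the network, and go to Step 2.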
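The refinement loop can be sketched independently of any particular network. In the sketch below, `train`, `residual_of`, and `sampler` are hypothetical callables supplied by the user (train a model on a point set, return the pointwise residual function of a model, and draw a dense set of candidate locations); the names and the `max_rounds` safeguard are assumptions, not DeepXDE's interface.

```python
def mean_residual(residual, samples):
    """Monte Carlo estimate of the mean absolute residual over the sampled locations."""
    return sum(abs(residual(x)) for x in samples) / len(samples)


def rar_step(residual, train_points, samples, m):
    """Return the training set extended by the m samples with the largest absolute residual."""
    worst = sorted(samples, key=lambda x: abs(residual(x)), reverse=True)[:m]
    return train_points + worst


def rar(train, residual_of, initial_points, sampler, m, threshold, max_rounds=20):
    """Alternate training and refinement until the mean residual drops below the threshold."""
    points = list(initial_points)
    model = train(points)                        # initial training on the starting points
    for _ in range(max_rounds):
        samples = sampler()                      # dense candidate locations
        residual = residual_of(model)
        if mean_residual(residual, samples) < threshold:
            break                                # stopping criterion met
        points = rar_step(residual, points, samples, m)
        model = train(points)                    # retrain on the enlarged set
    return model, points
```

Sorting the candidates by residual concentrates new points where the current model violates the equation most, for example near a steep front.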
3 DeepXDE usage and customization
In this section, we introduce the usage of DeepXDE and how to customize DeepXDE to meet new demands.
3.1 Usage
DeepXDE keeps the user code compact, closely resembling the mathematical formulation. Solving differential equations in DeepXDE amounts to specifying the problem using the built-in modules, including computational domain (geometry and time), PDE equations, boundary/initial conditions, constraints, training data, neural network architecture, and training hyperparameters. The workflow is shown in Procedure 3 and Fig. 5.
In DeepXDE, the built-in primitive geometries include interval, triangle, rectangle, polygon, disk, cuboid and sphere. Other geometries can be constructed from these primitive geometries using three boolean operations: union (|), difference (-) and intersection (&). This technique is called constructive solid geometry (CSG), see Fig. 6 for examples. CSG supports both two-dimensional and three-dimensional geometries.
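The idea behind the three boolean operations can be sketched with operator overloading on a point-membership test. The classes below are hypothetical and written only for illustration; they are not DeepXDE's geometry classes, and a full geometry would also need to sample points and identify its boundary.

```python
class Geometry:
    """Minimal geometry interface: subclasses implement inside(point)."""

    def inside(self, p):
        raise NotImplementedError

    def __or__(self, other):      # union
        return Combined(self, other, lambda a, b: a or b)

    def __sub__(self, other):     # difference
        return Combined(self, other, lambda a, b: a and not b)

    def __and__(self, other):     # intersection
        return Combined(self, other, lambda a, b: a and b)


class Combined(Geometry):
    """Geometry whose membership combines the membership of two other geometries."""

    def __init__(self, left, right, rule):
        self.left, self.right, self.rule = left, right, rule

    def inside(self, p):
        return self.rule(self.left.inside(p), self.right.inside(p))


class Rectangle(Geometry):
    def __init__(self, xmin, xmax):
        self.xmin, self.xmax = xmin, xmax

    def inside(self, p):
        return all(lo <= c <= hi for lo, c, hi in zip(self.xmin, p, self.xmax))


class Disk(Geometry):
    def __init__(self, center, radius):
        self.center, self.radius = center, radius

    def inside(self, p):
        return sum((c - q) ** 2 for c, q in zip(p, self.center)) <= self.radius ** 2


# An L-shaped domain: a square with its upper-right quadrant removed.
l_shape = Rectangle((-1.0, -1.0), (1.0, 1.0)) - Rectangle((0.0, 0.0), (1.0, 1.0))
```

Because every combination is itself a geometry, the operations nest freely, e.g. `(l_shape | Disk((0.0, 0.0), 0.5)) & other`.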
DeepXDE supports four standard boundary conditions, including Dirichlet (DirichletBC), Neumann (NeumannBC), Robin (RobinBC), and periodic (PeriodicBC). The initial condition can be defined using IC. There are two types of neural networks available in DeepXDE: feed-forward neural network (maps.FNN) and residual neural network (maps.ResNet). It is also convenient to choose different training hyperparameters, such as loss types, metrics, optimizers, learning rate schedules, initializations and regularizations.
In addition to solving differential equations, DeepXDE can also be used to approximate functions from a dataset with constraints, and to approximate functions from multi-fidelity data using the method proposed in [23].
3.2 Customizability
All the components of DeepXDE are loosely coupled, and thus DeepXDE is well-structured and highly configurable. In this subsection, we discuss how to customize DeepXDE to meet the new demands.
3.2.1 Geometry
As introduced above, DeepXDE already supports seven basic geometries and the CSG technique. However, the user may still need a new geometry that cannot be constructed from them. In this situation, a new geometry can be defined as shown in Procedure 4.
3.2.2 Neural networks
DeepXDE currently supports two neural networks: feed-forward neural network (maps.FNN) and residual neural network (maps.ResNet). A new network can be added as shown in Procedure 5.
3.2.3 Callbacks
It is usually a good strategy to monitor the training process of the neural network, and then make modifications in real time, e.g., change the learning rate. In DeepXDE, this can be implemented by adding a callback function, and here we only list a few commonly used ones already implemented in DeepXDE:
ModelCheckpoint, which saves the model after certain epochs or when a better model is found.
OperatorPredictor, which calculates the values of the operator applied to the outputs.
FirstDerivative, which calculates the first derivative of the outputs with respect to the inputs. This is a special case of OperatorPredictor with the operator being the first derivative.
MovieDumper, which dumps the movie of the function during the training progress, and/or the movie of the spectrum of its Fourier transform.
It is very convenient to add other callback functions, which will be called at different stages of the training process, see Procedure 6.
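The callback mechanism itself can be sketched with a small base class whose hooks are invoked by the training loop. The class names, hook names, and the `state` dictionary below are hypothetical and only illustrate the pattern; DeepXDE's actual callback interface may differ.

```python
class Callback:
    """Hypothetical base class: every hook is a no-op unless overridden."""

    def on_train_begin(self, state):
        pass

    def on_epoch_end(self, state):
        pass

    def on_train_end(self, state):
        pass


class BestLossTracker(Callback):
    """Remember the lowest loss seen so far and the epoch at which it occurred."""

    def __init__(self):
        self.best_loss = float("inf")
        self.best_epoch = None

    def on_epoch_end(self, state):
        if state["loss"] < self.best_loss:
            self.best_loss, self.best_epoch = state["loss"], state["epoch"]


class LearningRateDecay(Callback):
    """Multiply the learning rate by `factor` every `period` epochs."""

    def __init__(self, factor, period):
        self.factor, self.period = factor, period

    def on_epoch_end(self, state):
        if state["epoch"] % self.period == 0:
            state["lr"] *= self.factor


def train(loss_and_grad, x0, lr, epochs, callbacks=()):
    """Gradient descent on a scalar parameter, calling the hooks at each stage."""
    state = {"x": x0, "lr": lr, "epoch": 0, "loss": None}
    for cb in callbacks:
        cb.on_train_begin(state)
    for epoch in range(1, epochs + 1):
        loss, grad = loss_and_grad(state["x"])
        state["x"] -= state["lr"] * grad
        state["epoch"], state["loss"] = epoch, loss
        for cb in callbacks:
            cb.on_epoch_end(state)
    for cb in callbacks:
        cb.on_train_end(state)
    return state


tracker = BestLossTracker()
final = train(lambda x: ((x - 3.0) ** 2, 2.0 * (x - 3.0)), x0=0.0, lr=0.1, epochs=50,
              callbacks=[tracker, LearningRateDecay(factor=0.5, period=10)])
```

Because the callbacks receive the mutable training state, they can both observe the run (the tracker) and modify it while it is in progress (the learning-rate decay).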
4 Demonstration examples
In this section, we use PINNs and DeepXDE to solve different problems. In all examples, we use $\tanh$ as the activation function, and the other hyperparameters are listed in Table 2. The weights $w_f$, $w_b$, and $w_i$ in the loss function are set as 1. The codes of all examples are published on GitHub.
|Example||NN Depth||NN Width||Optimizer||Learning rate||# Iterations|
4.1 Poisson equation over an L-shaped domain
Consider the following two-dimensional Poisson equation over an L-shaped domain :
We choose 1200 and 120 random points drawn from a uniform distribution as $\mathcal{T}_f$ and $\mathcal{T}_b$, respectively. The PINN solution is given in Fig. 7B. For comparison, we also present the numerical solution obtained by using the spectral element method (SEM) [15] (Fig. 7A). The result of the absolute error is shown in Fig. 7C.
4.2 RAR for Burgers equation
We consider the Burgers equation:
Let . Initially, we randomly select 2500 points (spatio-temporal domain) as the residual points, and then 40 more residual points are added adaptively via RAR developed in Section 2.7 with and . We compare the PINN solution with RAR and the PINN solution based on 2540 randomly selected training data (Fig. 8), and demonstrate that PINN with RAR can capture the discontinuity much better. For comparison, the finite difference solutions using a central difference scheme for space discretization and the forward Euler scheme for time discretization are also shown in Fig. 8A.
4.3 Inverse problem for the Lorenz system
Consider the parameter identification problem of the following Lorenz system
with the initial condition , where , and are the three parameters to be identified from the observations at certain times. The observations are produced by solving the above system to using Runge-Kutta (4,5) with the underlying true parameters . We choose 400 uniformly distributed random points and 25 equispaced points as the residual points and , respectively. The evolution trajectories of , and are presented in Fig. 9A, with the final identified values of .
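Synthetic observations of this kind can be generated by integrating the system numerically and recording the state at equispaced times. The sketch below assumes the standard form of the Lorenz system and the classical parameter values $(10, 28, 8/3)$, an arbitrary initial state, and a final time of 3; all of these are illustrative assumptions and not necessarily the values used in this example. A fixed-step classical fourth-order Runge-Kutta method is swapped in for the adaptive Runge-Kutta (4,5) method.

```python
def lorenz_rhs(state, sigma, rho, beta):
    """Standard Lorenz system: returns (dx/dt, dy/dt, dz/dt)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(rhs, state, dt, *params):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = rhs(state, *params)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def observe(initial_state, params, t_end, n_obs, steps_per_obs=100):
    """Return n_obs equispaced (time, state) pairs on [0, t_end], including t = 0."""
    dt = t_end / ((n_obs - 1) * steps_per_obs)
    state = tuple(initial_state)
    observations = [(0.0, state)]
    for i in range(1, n_obs):
        for _ in range(steps_per_obs):
            state = rk4_step(lorenz_rhs, state, dt, *params)
        observations.append((i * t_end / (n_obs - 1), state))
    return observations


# Assumed parameters and initial state; 25 equispaced observation times.
obs = observe((1.0, 1.0, 1.0), (10.0, 28.0, 8.0 / 3.0), t_end=3.0, n_obs=25)
```

In the inverse problem, the recorded pairs play the role of the extra data term in the loss, while the three parameters are treated as trainable unknowns.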
4.4 Inverse problem for diffusion-reaction systems
A diffusion-reaction system in porous media for the solute concentrations , and () is described by
where is the effective diffusion coefficient, and is the effective reaction rate. Because and depend on the pore structure and are difficult to measure directly, we estimate and based on 40000 observations of the concentrations and in the spatio-temporal domain. The identified () and (0.0971) are displayed in Fig. 9B, which agree well with their true values.
4.5 Volterra IDE
Here, we consider the first-order integro-differential equation of the Volterra type in the domain :
with the exact solution We solve this IDE using the method in Section 2.5, and approximate the integral using Gaussian-Legendre quadrature of degree 20. The relative error is , and the solution is shown in Fig. 10.
5 Concluding Remarks
In this paper, we present the algorithm, approximation theory, and error analysis of the physics-informed neural networks (PINNs) for solving different types of partial differential equations (PDEs). Compared to the traditional numerical methods, PINNs employ automatic differentiation to handle differential operators, and thus they are mesh-free. Unlike numerical differentiation, automatic differentiation does not differentiate the data and hence it can tolerate noisy data for training. We also discuss how to extend PINNs to solve other types of differential equations, such as integro-differential equations, and also how to solve inverse problems. In addition, we propose a residual-based adaptive refinement (RAR) method to improve the distribution of residual points during the training process, and thus increase the training efficiency.
To benefit both the education and the computational science communities, we have developed the Python library DeepXDE, an implementation of PINNs. By introducing the usage of DeepXDE, we show that DeepXDE enables user codes to be compact and follow closely the mathematical formulation. We also demonstrate how to customize DeepXDE to meet new demands. Our numerical examples for forward and inverse problems verify the effectiveness of PINNs and the capability of DeepXDE. Scientific machine learning is emerging as a new and potentially powerful alternative to classical scientific computing. We hope that libraries such as DeepXDE will accelerate this development and make it accessible both to the classroom and to other researchers who may need to adopt PINN-like methods in their research, which can be very effective especially for inverse problems.
Despite the aforementioned advantages, PINNs still have some limitations. For forward problems, PINNs are currently slower than finite elements, but this can be alleviated via offline training [39, 34]. For long time integration, one can also use time-parallel methods to simultaneously compute on multiple GPUs for shorter time domains. Another limitation is the search for effective neural network architectures, which is currently done empirically by users; however, emerging meta-learning techniques can be used to automate this search, see [40, 11]. Moreover, while here we enforce the strong form of PDEs, which is easy to implement with automatic differentiation, alternative weak/variational forms may also be effective, although they require the use of quadrature grids. Many other extensions for multi-physics and multi-scale problems are possible across different scientific disciplines by creatively designing the loss function and introducing suitable solution spaces. For instance, in the five examples we present here, we only assume data on scattered points; however, in geophysics or biomedicine we may have mixed data in the form of images and point measurements. In this case, we can design a composite neural network consisting of one convolutional neural network and one PINN sharing the same set of parameters, and minimize the total loss, which could be a weighted summation of multiple losses from each neural network.
This work is supported by the DOE PhILMs project (No. DE-SC0019453), the AFOSR grant FA9550-17-1-0013, and the DARPA-AIRA grant HR00111990025.
[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., TensorFlow: A system for large-scale machine learning, in 12th USENIX Symposium on Operating Systems Design and Implementation, 2016, pp. 265–283.
[2] M. Ainsworth and J. T. Oden, A posteriori error estimation in finite element analysis, vol. 37, John Wiley & Sons, 2011.
[3] N. Baker, F. Alexander, T. Bremer, A. Hagberg, Y. Kevrekidis, H. Najm, M. Parashar, A. Patra, J. Sethian, S. Wild, et al., Workshop report on basic research needs for scientific machine learning: Core technologies for artificial intelligence, tech. report, US DOE Office of Science, Washington, DC (United States), 2019.
[4] C. Beck, W. E, and A. Jentzen, Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations, Journal of Nonlinear Science, (2017), pp. 1–57.
[5] J. Berg and K. Nyström, A unified deep artificial neural network approach to partial differential equations in complex geometries, Neurocomputing, 317 (2018), pp. 28–41.
[6] A. Blum and R. L. Rivest, Training a 3-node neural network is NP-complete, in Advances in Neural Information Processing Systems, 1989, pp. 494–501.
[7] L. Bottou and O. Bousquet, The tradeoffs of large scale learning, in Advances in Neural Information Processing Systems, 2008, pp. 161–168.
[8] R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu, A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific Computing, 16 (1995), pp. 1190–1208.
[9] M. Dissanayake and N. Phan-Thien, Neural-network-based approximations for solving partial differential equations, Communications in Numerical Methods in Engineering, 10 (1994), pp. 195–201.
[10] W. E and B. Yu, The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems, Communications in Mathematics and Statistics, 6 (2018), pp. 1–12.
[11] C. Finn, P. Abbeel, and S. Levine, Model-agnostic meta-learning for fast adaptation of deep networks, in Proceedings of the 34th International Conference on Machine Learning, 2017, pp. 1126–1135.
[12] P. Grohs, F. Hornung, A. Jentzen, and P. Von Wurstemberger, A proof that artificial neural networks overcome the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations, arXiv preprint arXiv:1809.02362, (2018).
[13] J. Han, A. Jentzen, and W. E, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences, 115 (2018), pp. 8505–8510.
[14] J. He, L. Li, J. Xu, and C. Zheng, ReLU deep neural networks and linear finite elements, arXiv preprint arXiv:1807.03973, (2018).
[15] G. E. Karniadakis and S. J. Sherwin, Spectral/hp element methods for computational fluid dynamics, Oxford University Press, second ed., 2013.
[16] Y. Khoo, J. Lu, and L. Ying, Solving parametric PDE problems with artificial neural networks, arXiv preprint arXiv:1707.03351, (2017).
[17] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, in International Conference on Learning Representations, 2015.
[18] I. E. Lagaris, A. Likas, and D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Transactions on Neural Networks, 9 (1998), pp. 987–1000.
[19] I. E. Lagaris, A. C. Likas, and D. G. Papageorgiou, Neural-network methods for boundary value problems with irregular boundaries, IEEE Transactions on Neural Networks, 11 (2000), pp. 1041–1049.
[20] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, 521 (2015), p. 436.
[21] Z. Long, Y. Lu, X. Ma, and B. Dong, PDE-net: Learning PDEs from data, in International Conference on Machine Learning, 2018, pp. 3214–3222.
[22] A. J. Meade Jr and A. A. Fernandez, The numerical solution of linear ordinary differential equations by feedforward neural networks, Mathematical and Computer Modelling, 19 (1994), pp. 1–25.
[23] X. Meng and G. E. Karniadakis, A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems, arXiv preprint arXiv:1903.00104, (2019).
[24] M. A. Nabian and H. Meidani, A deep neural network surrogate for high-dimensional random partial differential equations, arXiv preprint arXiv:1806.02957, (2018).
[25] G. Pang, L. Lu, and G. E. Karniadakis, fPINNs: Fractional physics-informed neural networks, SIAM Journal on Scientific Computing, (2019), to appear.
[26] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, Automatic differentiation in PyTorch, (2017).
[27] A. Pinkus, Approximation theory of the MLP model in neural networks, Acta Numerica, 8 (1999), pp. 143–195.
[28] T. Poggio, H. Mhaskar, L. Rosasco, B. Miranda, and Q. Liao, Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review, International Journal of Automation and Computing, 14 (2017), pp. 503–519.
[29] N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F. A. Hamprecht, Y. Bengio, and A. Courville, On the spectral bias of neural networks, arXiv preprint arXiv:1806.08734, (2018).
[30] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics, 378 (2019), pp. 686–707.
[31] M. Raissi, A. Yazdani, and G. E. Karniadakis, Hidden fluid mechanics: A Navier-Stokes informed deep learning framework for assimilating flow visualization data, arXiv preprint arXiv:1808.04327, (2018).
[32] J. Sirignano and K. Spiliopoulos, DGM: A deep learning algorithm for solving partial differential equations, Journal of Computational Physics, 375 (2018), pp. 1339–1364.
[33] B. P. van Milligen, V. Tribaldos, and J. Jiménez, Neural network differential equation and plasma equilibrium solver, Physical Review Letters, 75 (1995), p. 3594.
[34] N. Winovich, K. Ramani, and G. Lin, ConvPDE-UQ: Convolutional neural networks with quantified uncertainty for heterogeneous elliptic partial differential equations on varied domains, Journal of Computational Physics, 394 (2019), pp. 263–279.
[35] Z.-Q. J. Xu, Y. Zhang, T. Luo, Y. Xiao, and Z. Ma, Frequency principle: Fourier analysis sheds light on deep neural networks, arXiv preprint arXiv:1901.06523, (2019).
[36] L. Yang, D. Zhang, and G. E. Karniadakis, Physics-informed generative adversarial networks for stochastic differential equations, arXiv preprint arXiv:1811.02033, (2018).
[37] D. Zhang, L. Guo, and G. E. Karniadakis, Learning in modal space: Solving time-dependent stochastic PDEs using physics-informed neural networks, arXiv preprint arXiv:1905.01205, (2019).
[38] D. Zhang, L. Lu, L. Guo, and G. E. Karniadakis, Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems, arXiv preprint arXiv:1809.08327, (2018).
[39] Y. Zhu, N. Zabaras, P.-S. Koutsourelakis, and P. Perdikaris, Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data, arXiv preprint arXiv:1901.06314, (2019).
[40] B. Zoph and Q. V. Le, Neural architecture search with reinforcement learning, arXiv preprint arXiv:1611.01578, (2016).