1 Introduction
Solving partial differential equations (PDEs) is the most ubiquitous tool for simulating complicated phenomena in applied sciences and engineering. Classical numerical methods include the finite difference method LeVeque (2007), the finite element method (FEM) Elman et al. (2014), the discontinuous Galerkin method Cockburn et al. (2000), and the spectral method Shen et al. (2011); these are typically designed for low-dimensional PDEs and are well understood in terms of stability and accuracy. However, there are high-dimensional PDEs, such as the Schrödinger equation in the quantum many-body problem Dirac (1981), the Hamilton-Jacobi-Bellman equation in stochastic optimal control Bardi and Capuzzo-Dolcetta (2008), and the nonlinear Black-Scholes equation for pricing financial derivatives Hull (2009). Solving these equations is far beyond the capability of classical numerical methods due to the curse of dimensionality, i.e., the number of unknowns grows exponentially as the dimension increases.
Until very recently, deep-learning-based methods have been developed to solve these high-dimensional PDEs; see E et al. (2017); Giuseppe and Matthias (2017); E and Yu (2018); Han et al. (2018); Raissi (2018); Sirignano and Spiliopoulos (2018); Hutzenthaler et al. (2019); Raissi et al. (2019); Beck et al. (2019); Cervera (2019); Fan et al. (2019); Khoo et al. (2019); Beck et al. (2020); Wang et al. (2020); Zang et al. (2020); Discacciati et al. (2020) for examples. Typically, a deep-learning method for solving PDEs has three main ingredients (stages): (1) modeling: the loss (objective) function to be optimized; (2) architecture: the deep neural network (DNN) for function approximation; (3) optimization: finding the set of parameters in the DNN that minimizes the loss function. By design, the number of parameters in a DNN grows at most polynomially with the dimension. Meanwhile, the possibly high-dimensional integrals in the loss function are approximated by the Monte Carlo method. Therefore, by design, deep learning overcomes the curse of dimensionality. In practice, deep learning performs well for the Schrödinger equation Giuseppe and Matthias (2017); Han et al. (2019), the Hamilton-Jacobi-Bellman equation Han et al. (2018); E et al. (2017), and the nonlinear Black-Scholes equation Beck et al. (2019); Cervera (2019).
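The Monte Carlo approximation mentioned above is what keeps the cost from growing exponentially: the sampling error of a Monte Carlo estimate of an integral decays like the inverse square root of the sample size, independently of the dimension. A minimal sketch (the integrand and domain are illustrative choices, not taken from the paper):

```python
import numpy as np

def mc_integral(f, d, n_samples, rng):
    """Monte Carlo estimate of the integral of f over the unit cube [0,1]^d."""
    x = rng.random((n_samples, d))   # uniform samples in [0,1]^d
    return f(x).mean()               # the volume of [0,1]^d is 1

rng = np.random.default_rng(0)
# The integrand sum_i x_i^2 has exact integral d/3 over [0,1]^d in any dimension.
f = lambda x: (x**2).sum(axis=1)

for d in (2, 16, 100):
    est = mc_integral(f, d, 100_000, rng)
    # The error stays small as d grows; a tensor-product quadrature grid
    # would instead need a number of points exponential in d.
    print(d, est, d / 3)
```

The same mechanism applies when the integrand is the squared PDE residual of a network: the loss is estimated on a random batch of collocation points whose size is independent of the dimension.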
Typically, deep learning solves a PDE in the following way. For the given PDE, the loss function is modeled as the equation residual in the least-squares sense Sirignano and Spiliopoulos (2018), or as the variational form if one exists E and Yu (2018). ResNet He et al. (2015) is often used as the network architecture, since it has been shown to mitigate the notorious problem of vanishing/exploding gradients. Afterwards, a stochastic gradient descent method is used to find the set of parameters in the ResNet that minimizes the loss function; the ResNet with this optimal set of parameters then gives an approximation of the PDE solution.
In this work, we propose a deep mixed residual method (MIM) for solving high-order PDEs. In the modeling stage, by rewriting a given PDE as a first-order system, we obtain a larger problem in the sense that both the PDE solution and its high-order derivatives are unknown functions to be approximated. This has analogs in classical numerical methods, such as the local discontinuous Galerkin method Cockburn et al. (2000) and the mixed finite element method Boffi et al. (2013). Compared to the deep Galerkin method (DGM) Sirignano and Spiliopoulos (2018), there are two more degrees of freedom in MIM:

In the loss-function stage, one can choose which high-order derivatives to include in the set of unknown functions. Take the biharmonic equation as an example: the set of unknown functions can include the PDE solution and its derivatives up to third order, or only the PDE solution and its second-order derivatives; both choices have analogs in the discontinuous Galerkin method Yan and Shu (2002); Cockburn et al. (2009). We then write the loss function as the sum of equation residuals in the least-squares sense, very much in the same spirit as the least-squares finite element method Bochev and Gunzburger (2015).

In the architecture stage, one can choose the number of networks used to approximate the set of unknown functions. In one case, one DNN approximates the PDE solution and additional DNNs approximate its high-order derivatives; in the other case, the PDE solution and its derivatives share nearly the same DNN.
These two degrees of freedom allow MIM to produce better approximations than DGM in all of our examples, including the Poisson equation, the Monge-Ampère equation, the biharmonic equation, and the Korteweg-de Vries (KdV) equation. In particular, MIM provides better approximations not only for the high-order derivatives but also for the PDE solution itself. It is worth mentioning that the use of a mixed residual in deep learning was first introduced for surrogate modeling and uncertainty quantification of a second-order elliptic equation Zhu et al. (2019), and was later adopted in a deep domain decomposition method Li et al. (2019).
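To make the idea concrete, consider the Poisson equation as the simplest instance. The rewriting and the resulting mixed residual can be sketched as follows (generic notation; boundary terms are omitted for brevity):

```latex
% Poisson equation -\Delta u = f rewritten as a first-order system
p = \nabla u, \qquad -\nabla \cdot p = f \quad \text{in } \Omega,
% with the corresponding mixed residual loss in the least-squares sense
L(u, p) = \|\nabla u - p\|_{2,\Omega}^2 + \|\nabla \cdot p + f\|_{2,\Omega}^2 .
```

Both $u$ and $p$ are represented by networks, so only first-order derivatives of the networks ever enter the loss.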
2 Deep mixed residual method
In this section, we introduce MIM and discuss its differences from DGM in terms of the loss function and the neural network architecture.
2.1 Loss function
Consider a potentially time-dependent nonlinear PDE over a bounded domain Ω
(1) 
where ∂Ω denotes the boundary of Ω. In DGM, the loss function is defined as the PDE residual in the least-squares sense
(2) 
where the coefficients of the last two terms are penalty parameters given a priori. The three terms in (2) measure how well the approximate solution satisfies the PDE, the initial condition, and the boundary condition, respectively.
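As an illustration, such a problem and its DGM loss can be written as follows (the domain Ω, operator 𝓛, data u₀ and g, and penalty weights λ₁, λ₂ are generic placeholders, not the paper's own symbols):

```latex
\partial_t u = \mathcal{L}u \ \text{in } \Omega \times (0,T], \qquad
u(\cdot,0) = u_0 \ \text{in } \Omega, \qquad
u = g \ \text{on } \partial\Omega \times (0,T],

L(u) = \|\partial_t u - \mathcal{L}u\|_{2,\Omega\times(0,T]}^2
     + \lambda_1 \|u(\cdot,0) - u_0\|_{2,\Omega}^2
     + \lambda_2 \|u - g\|_{2,\partial\Omega\times(0,T]}^2 .
```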
In the absence of temporal derivatives, (1) reduces to a stationary problem,
and the corresponding loss function in DGM becomes
(3) 
Table 1 lists four PDEs with their corresponding loss functions in DGM, and Table 2 lists different boundary conditions, the initial condition, and their contributions to the loss functions in DGM and MIM. More boundary conditions can be treated in this way; interested readers may refer to Chen et al. (2020) for details.
Equation  Explicit form  Loss function 

Poisson  
Monge-Ampère  
Biharmonic  
KdV 
Condition  Explicit form  Contribution to the loss function 

Dirichlet  
Neumann  or  
Initial 
In MIM, we first rewrite high-order derivatives as low-order ones using auxiliary variables. For notational convenience, the auxiliary variables represent
(4)  
For the KdV equation, a different auxiliary variable replaces the second formula in (4). With these auxiliary variables, we define loss functions for the four types of PDEs in Table 3. Since one can choose a subset of the high-order derivatives as unknown functions, there is more than one possible loss function in MIM. For the biharmonic equation, two sets of auxiliary variables are commonly used in the local discontinuous Galerkin method and the weak Galerkin finite element method: one with all high-order derivatives Yan and Shu (2002) and the other with only part of them Cockburn et al. (2009); Mu et al. (2015); we distinguish the two corresponding variants of MIM accordingly. In Section 2.2, we discuss how to equip different loss functions with different DNNs; in short, either a single DNN is used to approximate the PDE solution and its derivatives, or multiple DNNs are used, giving two further variants of MIM. In Section 3, the different loss functions listed in Table 1, Table 2 and Table 3 are tested and discussed. By default, all penalty parameters are set to the same default value.
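For the biharmonic equation Δ²u = f, the two choices of auxiliary variables can be sketched as follows (a plausible notation; the paper's own symbols were not preserved):

```latex
% Choice 1: all derivatives up to third order become unknowns
p = \nabla u, \quad v = \nabla \cdot p, \quad w = \nabla v, \qquad
\nabla \cdot w = f,
% Choice 2: only the second-order quantity v = \Delta u is introduced
v = \Delta u, \qquad \Delta v = f,
% i.e. two second-order equations, each of which can in turn be reduced
% to first order by introducing p = \nabla u and q = \nabla v.
```

The first choice mirrors the local discontinuous Galerkin formulation with all derivatives Yan and Shu (2002); the second mirrors the formulation with only part of them Cockburn et al. (2009).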
Equation  Explicit form  Loss function 
Poisson  
Monge-Ampère  
Biharmonic  
KdV  
2.2 Neural network architecture
ResNet He et al. (2015) is used to approximate the PDE solution and its high-order derivatives. It consists of blocks of the following form
(5) 
Here the depth of the network, the width of the network, and the (scalar) activation function are the main hyperparameters. Explicit formulas of the activation functions used in this work are given in Table 4. The last term on the right-hand side of (5) is called the shortcut (or residual) connection. Each block has two linear transforms, two activation functions, and one shortcut; see Figure 1 for a demonstration. Such a structure mitigates the notorious problem of vanishing/exploding gradients He et al. (2016).

Activation function  Formula 

Square  
ReLU  
ReQU  
ReCU 
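The rectified-power activations in Table 4 follow the usual definitions, ReQU and ReCU being the squared and cubed ReLU (standard definitions; the table's exact formulas were not preserved here):

```python
import numpy as np

def relu(x):
    """max(x, 0): continuous but not differentiable at 0."""
    return np.maximum(x, 0.0)

def requ(x):
    """ReLU squared: C^1, so first derivatives of the network are continuous."""
    return relu(x) ** 2

def recu(x):
    """ReLU cubed: C^2, giving even smoother network outputs."""
    return relu(x) ** 3

x = np.array([-1.0, 0.5, 2.0])
print(relu(x))  # [0.   0.5  2. ]
print(requ(x))  # [0.   0.25 4.  ]
print(recu(x))  # [0.    0.125 8.   ]
```

The increasing smoothness matters because the loss differentiates the network: a loss containing k-th derivatives needs an activation that is at least k times (weakly) differentiable.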
Since the input lies in a space of lower dimension than the network width, we pad it with a zero vector to form the network input; a linear transform could be used instead without much difference. Meanwhile, the output of the final block cannot be used directly for the PDE solution and the derivatives employed in the loss function, so a linear transform is applied to map it to the appropriate dimension. The whole set of parameters to be optimized consists of the parameters in the ResNet blocks and those in this output linear transform. Note that the output dimension in MIM depends on both the PDE problem and the chosen mixed residual loss. We illustrate network structures for the biharmonic equation in Figure 2: DGM has a single output, while the two MIM formulations have more outputs, since they also represent high-order derivatives. In Figure 3, we illustrate the network structures of the single-network and multi-network versions of MIM for the Poisson equation; in the latter, two DNNs are used, one to approximate the solution and the other to approximate its derivatives. It is clear from Figure 2 that the network structures in DGM and single-network MIM differ only in the output layer, so they have comparable numbers of parameters to be optimized. To be precise, we list the numbers of parameters in Table 5, from which one can see that the parameter counts of DGM and single-network MIM are close, while the count for multi-network MIM is nearly doubled for the Poisson equation, the Monge-Ampère equation, and the biharmonic equation (with only part of the derivatives), tripled for the KdV equation, and quadrupled for the biharmonic equation (with all derivatives). In Section 3, we observe from numerical results a better performance of MIM for all four equations, not only for the derivatives of the PDE solution but also for the solution itself.
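A numpy sketch of the forward pass just described: zero-padding of the input, residual blocks of the form (5), and a final linear transform whose output dimension depends on how many unknown functions MIM approximates (widths, depth, and initialization here are illustrative, not the paper's settings):

```python
import numpy as np

def resnet_forward(x, blocks, W_out, b_out, act):
    """x: (d,) input; blocks: list of (W1, b1, W2, b2); returns (n_out,) vector."""
    m = blocks[0][0].shape[1]          # network width
    s = np.zeros(m)
    s[: x.size] = x                    # pad the input with zeros up to width m
    for W1, b1, W2, b2 in blocks:
        # one block: two linear transforms, two activations, one shortcut
        s = act(W2 @ act(W1 @ s + b1) + b2) + s
    return W_out @ s + b_out           # linear output layer

rng = np.random.default_rng(0)
d, m, depth, n_out = 4, 16, 3, 1 + 4   # e.g. u and its gradient in MIM
blocks = [tuple(rng.normal(size=sh) * 0.1
                for sh in ((m, m), (m,), (m, m), (m,)))
          for _ in range(depth)]
W_out, b_out = rng.normal(size=(n_out, m)) * 0.1, np.zeros(n_out)

y = resnet_forward(rng.random(d), blocks, W_out, b_out,
                   act=lambda z: np.maximum(z, 0.0) ** 2)  # ReQU
print(y.shape)  # (5,)
```

Changing only `n_out` switches between a DGM-style network (one output) and a MIM-style network (solution plus auxiliary variables), which is why the two have comparable parameter counts.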
Method  Equation  Size of the parameter set 
DGM  Four equations  
MIM  Poisson  
Monge-Ampère  
Biharmonic (MIM)  
Biharmonic (MIM)  
KdV  
MIM  Poisson  
Monge-Ampère  
Biharmonic (MIM)  
Biharmonic (MIM)  
KdV 
2.3 Stochastic Gradient Descent
For completeness, we also briefly introduce the stochastic gradient descent method. For the loss function defined in (3), at each iteration we generate two sets of points uniformly distributed in the domain and on its boundary.
(6) 
Here the learning rate is chosen a priori, the normalization constants are the measures of the domain and its boundary, respectively, and the DNN approximation of the PDE solution is parameterized by the network parameters. The sampling points are regenerated at each iteration. In implementation, we use the Adam optimizer Kingma and Ba (2014) and automatic differentiation Paszke et al. (2017) for derivatives in PyTorch.
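The update (6) is ordinary stochastic gradient descent on the Monte Carlo loss. Stripped of the PDE specifics, one step looks like the following (a generic sketch with a toy quadratic loss, not the paper's implementation):

```python
import numpy as np

def sgd_step(theta, grad_loss, batch, eta):
    """One SGD step: theta <- theta - eta * (stochastic gradient)."""
    return theta - eta * grad_loss(theta, batch)

# Toy loss L(theta) = E_x |theta - x|^2 whose minimizer is E[x].
rng = np.random.default_rng(0)
data_mean = np.array([1.0, -2.0])
grad = lambda th, x: 2.0 * (th - x.mean(axis=0))  # Monte Carlo gradient

theta = np.zeros(2)
for _ in range(2000):
    batch = data_mean + rng.normal(size=(64, 2))   # fresh samples every
    theta = sgd_step(theta, grad, batch, eta=0.05)  # iteration, as in (6)
print(theta)  # close to [1, -2]
```

In the PDE setting the batch is the set of freshly sampled interior and boundary points, and the gradient of the residual loss is obtained by automatic differentiation rather than by a closed-form expression.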
3 Numerical Results
In this section, we show numerical results of MIM for the four types of equations. For comparison, we use the relative errors defined in Table 6. In all figures, relative errors are shown on a logarithmic scale.
Quantity  DGM  MIM 

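The relative errors in Table 6 are the usual normalized L² errors, estimated on Monte Carlo samples (a standard definition; the table's formulas were not preserved here):

```latex
e(u_\theta) = \frac{\|u_\theta - u\|_{2,\Omega}}{\|u\|_{2,\Omega}}
\approx \left( \frac{\sum_{j} |u_\theta(x_j) - u(x_j)|^2}
                    {\sum_{j} |u(x_j)|^2} \right)^{1/2},
```

with analogous expressions for the derivatives of the solution.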
3.1 Poisson Equation
Consider the following Neumann problem
(7) 
with a prescribed exact solution. The neural network structure in DGM is the same as that for the biharmonic equation shown in Figure 2. Following Table 1 and Table 2, the loss function for (7) is
(8) 
Since both the solution and its gradient are explicitly represented, one more advantage of MIM is the enforcement of boundary conditions. For (7), we multiply the auxiliary gradient variable by a suitable factor so that the Neumann boundary condition is satisfied automatically; see Figure 3. DGM only has the solution as its unknown function, and thus it is unclear how the exact Neumann boundary condition can be imposed.
Therefore, for the DNNs in Figure 3, the loss function in MIM simplifies to
(9) 
We emphasize that the Dirichlet boundary condition can be exactly imposed in DGM Berg and Nyström (2018), in which case no penalty term is needed. For the Neumann, mixed, and Robin boundary conditions, however, it is difficult to build a DNN representation of the solution alone that satisfies the exact boundary condition. Building a DNN approximation that satisfies the boundary condition exactly has several advantages Chen et al. (2020): 1) it eases the training process by avoiding unnecessary divergence; 2) it improves the approximation accuracy; 3) it saves execution time. In MIM, however, we have direct access to both the solution and its gradient; therefore, all these boundary conditions can in principle be imposed exactly. This will be presented in a subsequent work Lyu et al. (2020).
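As an example of such a construction (a sketch under the assumption that the domain is the unit ball, so that the outward normal is n = x on the boundary): given a network output p̃, setting

```latex
p_\theta(x) = \tilde p_\theta(x) - \bigl(\tilde p_\theta(x)\cdot x - g(x)\bigr)\, x
```

gives p·n = g exactly on the boundary, since |x| = 1 there, so the Neumann penalty term drops out of the loss. No comparable construction is available when only the solution itself is represented, as in DGM.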
For (7), average errors of the solution and its gradient over the final iterations are recorded in Table 7. The network depth and activation function are fixed, and the network width varies with the dimension of the problem. Time is recorded as the average CPU time per iteration. It is not surprising that the single-network MIM costs less time than DGM, since the DNN approximation in MIM satisfies the Neumann boundary condition automatically and both methods have similar network structures. It is perhaps surprising that the multi-network MIM also costs less time than DGM, since its number of parameters is about twice that of DGM. In terms of execution time, both MIM variants are faster than DGM.
d  Method  Relative error ()  Time (s)  

2  DGM  0.3676  0.3714  0.04374 
MIM  0.2941  0.1639  0.02925  
MIM  0.0565  0.0236  0.03514  
4  DGM  1.0022  1.3272  0.07455 
MIM  0.3751  0.3290  0.03603  
MIM  0.2294  0.0690  0.04141  
8  DGM  2.0022  2.6551  0.13081 
MIM  0.9049  0.6423  0.06642  
MIM  0.7261  0.1499  0.08716  
16  DGM  3.9796  5.0803  0.25621 
MIM  1.7631  1.0041  0.11082  
MIM  0.0787  0.0236  0.15125 
Figure 4 and Figure 5 plot the training processes of DGM and MIM in terms of the relative errors of the solution and its gradient. Generally speaking, in terms of approximation error, both MIM variants outperform DGM, as expected; MIM therefore provides a better strategy than DGM, with better approximations for both the solution and its gradient. For the solution, the improvement of MIM over DGM is severalfold, and that of the better MIM variant over the other is about one order of magnitude; for the gradient, the improvement is severalfold. Moreover, a dimensional dependence is observed for both quantities: the higher the dimension, the better the approximation.
Table 8 records the approximation errors of MIM and DGM for different activation functions and network depths. MIM provides better approximations for both the solution and its gradient. It is not surprising that ReLU is unsuitable for DGM, due to the high-order derivatives in its loss, but it is suitable for MIM, since only first-order derivatives appear there.
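The failure of ReLU in DGM can be seen directly: a ReLU network is piecewise linear, so away from its kinks every second derivative is exactly zero, and a residual containing the Laplacian provides no gradient signal. A finite-difference check (a toy one-dimensional "network" chosen for illustration):

```python
import numpy as np

relu = lambda x: np.maximum(x, 0.0)
# A tiny 1D ReLU combination with kinks at x = 0.0 and x = 0.5.
f = lambda x: 2.0 * relu(1.0 * x) - 3.0 * relu(2.0 * x - 1.0) + 0.7 * x

def second_diff(f, x, h=1e-3):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

for x in (0.2, 0.7, -0.4):   # generic points away from the kinks
    print(second_diff(f, x))  # ~0: the Laplacian residual sees nothing
```

In MIM the Laplacian never acts on the network directly; only first derivatives do, which are nontrivial piecewise constants for ReLU, consistent with the observations in Table 8.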
Relative error ()  

DGM  MIM  MIM  
ReLU  1  0.9197  0.9259  0.0890  0.0444  0.0264  0.0080 
2  0.9210  0.9230  0.0245  0.0104  0.0265  0.0068  
3  0.9208  0.9216  0.0258  0.0113  0.0258  0.0084  
ReQU  1  0.0684  0.1003  0.0182  0.0127  0.0107  0.0042 
2  0.0057  0.0118  0.0113  0.0047  0.0049  0.0017  
3  0.0124  0.0140  0.0040  0.0029  0.0042  0.0031  
ReCU  1  0.4642  0.4644  0.0288  0.0159  0.0100  0.0033 
2  0.0281  0.0170  0.0071  0.0055  0.0048  0.0013  
3  0.0028  0.0031  0.0049  0.0036  0.0049  0.0013 
3.2 Monge-Ampère equation
Consider the nonlinear Monge-Ampère equation
(10) 
with a prescribed exact solution. Following Tables 1, 3 and 2, we have the loss function in DGM
and the loss function in MIM,
respectively. For (10), the Dirichlet boundary condition can be enforced exactly in both DGM and MIM. For comparison purposes, we instead use a penalty term in both methods; imposing exact boundary conditions, however, is always encouraged in practice.
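A sketch of the two losses for (10) (generic notation, with Dirichlet data g and boundary weight λ; not the paper's own symbols):

```latex
% DGM: second derivatives of the network enter the loss directly
L_{\mathrm{DGM}}(u) = \|\det(\nabla^2 u) - f\|_{2,\Omega}^2
                      + \lambda \|u - g\|_{2,\partial\Omega}^2,
% MIM: the auxiliary variable p = \nabla u removes one order of differentiation
L_{\mathrm{MIM}}(u, p) = \|\nabla u - p\|_{2,\Omega}^2
                         + \|\det(\nabla p) - f\|_{2,\Omega}^2
                         + \lambda \|u - g\|_{2,\partial\Omega}^2 .
```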
In this example, we fix the network depth and the activation function. Relative errors over the last iterations with respect to the network width in different dimensions are recorded in Table 9. Figure 6 plots the errors as a function of the network width for different dimensions. The advantage of MIM is obvious from these results.
d  Relative error ()  

DGM  MIM  MIM  
2  10  0.1236  0.7430  0.1023  0.3433  0.1251  0.5218 
20  1.1100  3.1940  0.0922  0.3804  0.0784  0.0221  
30  0.0913  0.5656  0.0522  0.1740  0.1075  0.0219  
4  20  0.0981  0.7764  0.1095  0.6359  0.1230  0.3977 
30  0.0921  0.7731  0.0903  0.4399  0.1063  0.2802  
40  0.0943  0.6174  0.0636  0.3127  0.1287  0.2480  
8  30  0.3584  3.3902  0.1435  1.6318  0.1155  0.5170 
40  0.1179  1.4663  0.1344  1.0721  0.1330  0.4873  
50  0.0997  1.2483  0.0977  0.8289  0.0917  0.4174 
3.3 Biharmonic equation
Consider the biharmonic equation
(11) 
with a prescribed exact solution. The loss function in DGM is
The loss function in MIM is
(12) 
and the loss function in MIM is
(13) 
Again, the exact boundary condition can be enforced in MIM but not in DGM; for comparison purposes, we use penalty terms in both methods.
The network width is chosen according to the dimension. Table 10 records the averaged errors over the last 1000 iterations.
d  Method  Relative error ( )  Time (s)  

2  DGM  0.1656  0.6454  1.2333  8.8001  0.1034 
MIM  0.1501  0.1929  0.1564  0.3067  0.1219  
MIM  0.0769  0.1155  0.1504  0.4984  0.1636  
MIM  0.0526  0.2066  0.2937  1.6821  0.1393  
MIM  0.0424  0.1417  0.3625  2.2231  0.2164  
4  DGM  0.1330  0.6454  1.2333  8.8008  0.3292 
MIM  0.4117  0.1929  0.1563  0.3066  0.2784  
MIM  0.0845  0.1155  0.1504  0.4984  0.4692  
MIM  0.1039  0.2066  0.2937  1.6821  0.2883  
MIM  0.1111  0.1417  0.3625  2.2301  0.5919  
8  DGM  0.2488  1.0514  1.4594  13.4003  0.3292 
MIM  0.3719  2.3855  0.6797  3.1015  0.2784  
MIM  0.1856  0.6909  0.7840  4.7209  0.4692  
MIM  0.1475  1.6657  1.2922  6.9594  0.8051  
MIM  0.2881  0.9223  0.9981  6.4658  6.5148 
Relative errors for the solution and its derivatives in terms of the iteration number are plotted in Figure 7.
Generally speaking, MIM provides better approximations than DGM for the solution and all of its derivatives. Between the two MIM formulations, the one using only part of the high-order derivatives achieves slightly better accuracy, although the formulation with all derivatives has more outputs. These results are of interest since they echo results for the local discontinuous Galerkin method, where the formulation with a subset of derivatives also shows better numerical performance Yan and Shu (2002); Cockburn et al. (2009). We point out that MIM has the advantage that the exact boundary condition can be enforced, although we use penalty terms for this example.
3.4 KdV equation
Consider a time-dependent linear KdV-type equation
(14) 
defined over a space-time domain, with a prescribed exact solution. We first rewrite it into a first-order system.
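One plausible first-order rewriting, assuming the KdV-type operator consists of third derivatives along each coordinate direction (the paper's exact formulas were not preserved here):

```latex
\partial_t u + \sum_{i=1}^{d} \partial_{x_i}^3 u = f, \qquad
p = \nabla u, \qquad q_i = \partial_{x_i} p_i \ (i = 1,\dots,d), \qquad
\partial_t u + \sum_{i=1}^{d} \partial_{x_i} q_i = f,
% so that only first-order derivatives of the unknowns (u, p, q) appear
% in the residuals.
```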
The loss function in DGM is
Here the standard basis of the d-dimensional space is used. The loss function in MIM is
Relative errors of the solution and its derivatives are recorded in Table 11. Again, as in the previous examples, MIM provides better results than DGM, especially with the ReQU activation function. No obvious improvement of the multi-network variant of MIM over the single-network variant is observed.
Method  Relative error ()  

1  ReQU  DGM  34.9171  20.6788  34.3661 
MIM  0.5705  5.3709  0.5369  
MIM  1.2920  0.8129  1.9244  
ReCU  DGM  0.7603  0.4785  0.5977  
MIM  0.0991  0.7313  0.0128  
MIM  0.5035  0.5804  0.1229  
2  ReQU  DGM  84.8708  85.8114  85.8954 
MIM  2.9393  1.9996  2.9443  
MIM  2.1820  2.5591  2.1383  
ReCU  DGM  2.5483  2.1856  2.4431  
MIM  1.5410  2.3865  1.5645  
MIM  5.5900  5.7440  5.8957  
3  ReQU  DGM  168.1755  168.1697  169.3528 
MIM  4.0421  4.0987  3.8496  
MIM  7.7027  8.8787  9.1058  
ReCU  DGM  1.9132  1.4846  1.7970  
MIM  1.5410  2.3865  1.5645  
MIM  5.5900  5.7440  5.8957 
4 Conclusion and Discussion
Motivated by classical numerical methods such as the local discontinuous Galerkin method, the mixed finite element method, and the least-squares finite element method, we have developed a deep mixed residual method for solving high-order PDEs. The deep mixed residual method inherits several advantages of classical numerical methods:

Flexibility for the choice of loss function;

Larger solution space with flexible choice of deep neural networks;

Enforcement of exact boundary conditions;

Better approximations of high-order derivatives at almost the same cost.
Meanwhile, the deep mixed residual method also provides a better approximation of the PDE solution itself. These features make it suitable for solving high-order PDEs in high dimensions.
Boundary conditions are another important issue when solving PDEs by DNNs. Enforcing exact boundary conditions not only makes the training process easier but also improves the approximation accuracy; see Berg and Nyström (2018); Chen et al. (2020) for examples. The deep mixed residual method has the potential to impose exact boundary conditions such as the Neumann, mixed, and Robin boundary conditions, none of which can be enforced exactly in the deep Galerkin method. This will be investigated in a subsequent work Lyu et al. (2020).
So far, only basic experience from classical numerical methods has been transferred into deep learning in the deep mixed residual method, and we have already seen clear advantages. To further improve the method, we need to transfer experience from classical numerical analysis at a deeper level. For example, in the least-squares finite element method, the choice of solution space relies heavily on the choice of residual in order to maximize performance Bochev and Gunzburger (2015); many other connections exist in the discontinuous Galerkin method Cockburn et al. (2000) and the mixed finite element method Boffi et al. (2013). Moreover, since only first-order derivatives appear in the deep mixed residual method, ReLU works well for all time-independent equations we have tested, but does not work well for the KdV equation; this deserves a theoretical understanding in the language of linear finite elements He et al. (2018). Another possible direction is to use a weak formulation of the mixed residual instead of the least-squares loss, as done in deep learning by Zang et al. (2020) and in the discontinuous Galerkin method by Cockburn et al. (2000). Realizing these connections will allow for a systematic way to understand and improve deep learning for solving PDEs.
5 Acknowledgments
This work was supported by National Key R&D Program of China (No. 2018YFB0204404) and National Natural Science Foundation of China via grant 11971021. We thank Qifeng Liao and Xiang Zhou for helpful discussions.
References
 Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Springer Science & Business Media. Cited by: §1.
 Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations. Journal of Nonlinear Science 29 (4), pp. 1563–1619. Cited by: §1.
 Overcoming the curse of dimensionality in the numerical approximation of high-dimensional semilinear elliptic partial differential equations. arXiv preprint arXiv:2003.00596. Cited by: §1.
 A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317, pp. 28–41. Cited by: §3.1, §4.
 Least Squares Finite Element Methods. Springer, Berlin, Heidelberg. Cited by: §1, §4.
 Mixed Finite Element Methods and Applications. Springer, Berlin, Heidelberg. Cited by: §1, §4.
 Solution of the Black-Scholes equation using artificial neural networks. Journal of Physics: Conference Series 1221, pp. 012044. Cited by: §1.
 A comprehensive study of boundary conditions when solving PDEs by DNNs. arXiv preprint arXiv:2005.04554. Cited by: §2.1, §3.1, §4.
 A hybridizable and superconvergent discontinuous Galerkin method for biharmonic problems. Journal of Scientific Computing 40 (1), pp. 141–187. Cited by: §1, §2.1, §3.3.
 Discontinuous Galerkin Methods: Theory, Computation and Applications. Springer-Verlag Berlin Heidelberg. Cited by: §1, §4.
 The Principles of Quantum Mechanics. Oxford University Press. Cited by: §1.
 Controlling oscillations in high-order discontinuous Galerkin schemes using artificial viscosity tuned by neural networks. Journal of Computational Physics 409, pp. 109304. Cited by: §1.
 Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Communications in Mathematics and Statistics 5 (4), pp. 349–380. Cited by: §1.
 The Deep Ritz Method: a deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics 6 (1), pp. 1–12. Cited by: §1.
 Finite Elements and Fast Iterative Solvers: with Applications in Incompressible Fluid Dynamics. Oxford University Press. Cited by: §1.
 A multiscale neural network based on hierarchical matrices. Multiscale Modeling & Simulation 17 (4), pp. 1189–1213. Cited by: §1.
 Solving the quantum many-body problem with artificial neural networks. Science 355 (6325), pp. 602–606. Cited by: §1.
 Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences of the United States of America 115 (34), pp. 8505–8510. Cited by: §1.
 Solving many-electron Schrödinger equation using deep neural networks. Journal of Computational Physics 399, pp. 108929. Cited by: §1.
 ReLU deep neural networks and linear finite elements. arXiv preprint arXiv:1807.03973. Cited by: §4.
 Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385. Cited by: §1, §2.2.
 Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. Cited by: §2.2.
 Options, Futures and Other Derivatives. Upper Saddle River, NJ: Prentice Hall. Cited by: §1.
 A proof that rectified deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear heat equations. arXiv preprint arXiv:1901.10854. Cited by: §1.
 Solving for high-dimensional committor functions using artificial neural networks. Research in the Mathematical Sciences 6, pp. 1. Cited by: §1.
 Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §2.3.
 Finite Difference Methods for Ordinary and Partial Differential Equations: Steady-State and Time-Dependent Problems. Society for Industrial and Applied Mathematics. Cited by: §1.
 D3M: a deep domain decomposition method for partial differential equations. IEEE Access 8, pp. 5283–5294. Cited by: §1.
 Enforcing exact boundary and initial conditions in the deep mixed residual method. In preparation. Cited by: §3.1, §4.
 A weak Galerkin finite element method with polynomial reduction. Journal of Computational and Applied Mathematics 285, pp. 45–58. Cited by: §2.1.
 Automatic differentiation in PyTorch. Cited by: §2.3.
 Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, pp. 686–707. Cited by: §1.
 Deep hidden physics models: deep learning of nonlinear partial differential equations. Journal of Machine Learning Research 19 (1), pp. 932–955. Cited by: §1.
 Spectral Methods: Algorithms, Analysis and Applications. Vol. 41, Springer Science & Business Media. Cited by: §1.
 DGM: a deep learning algorithm for solving partial differential equations. Journal of Computational Physics 375, pp. 1339–1364. Cited by: §1.
 Deep multiscale model learning. Journal of Computational Physics 406, pp. 109071. Cited by: §1.
 Local discontinuous Galerkin methods for partial differential equations with higher order derivatives. Journal of Scientific Computing 17 (1), pp. 27–47. Cited by: §1, §2.1, §3.3.
 Weak adversarial networks for high-dimensional partial differential equations. Journal of Computational Physics 411, pp. 109409. Cited by: §1, §4.
 Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. Journal of Computational Physics 394, pp. 56–81. Cited by: §1.