1 Introduction
It is widely known that neural networks are universal approximators of functions (Cybenko1989). Lagaris1998
proposed to use this property of neural networks to solve partial differential equations. In doing so, they formed a trial function using a neural network and ran an optimization procedure so that the differential equation holds as closely as possible for the trial function. In order to form the loss function, the authors manually computed high-order (corresponding to the order of the equation) derivatives of the network output w.r.t. its inputs and weights. Later,
Lagaris2000 extended their approach to domains with more complex boundaries. The main idea was to form a trial function using two neural networks: the first one solves the equation inside the domain, while the second one binds the boundary conditions. After these papers, research in the domain took a long pause. In recent years, however, following the development of convenient frameworks for automatic differentiation (e.g. TensorFlow and PyTorch), more and more papers (Berg2018, Nabian2018, Aradi2018) attempt to solve differential equations using neural networks. To our understanding, all of the modern approaches can be traced back to the Deep Galerkin method introduced by Sirignano2018. Essentially, Sirignano2018 rediscovered the approach of Lagaris1998 and used the power of modern automatic differentiation to test the approach at a larger scale, using deep neural networks for solving tasks in high dimensions.
In this paper, we wrap up the progress in the domain of solving PDEs with neural networks in the PyDEns Python module. The module is available to the deep learning community on GitHub for usage and further improvement. Our goal is to simplify experimentation in this emerging area of research. In doing so, we rely upon several pillars. First of all, a user should have an opportunity to set up complex problems in several lines of clear code. To that end, PyDEns allows one to set up a PDE problem from a wide class, including but not limited to the heat equation and the wave equation. Secondly, it is important not to impose constraints on the choice of neural-network architecture. With PyDEns, a user can either (i) choose a network from a zoo of implemented architectures, including ResNet, DenseNet, MobileNet and others, or (ii) build a complex neural network from scratch in a line of code, using convolutions and fully connected layers. Lastly, a user has complete control over the point-sampling scheme. One can form batches of training data using a large family of probability distributions on the domain of the PDE, e.g. truncated Gaussian, uniform and exponential distributions, or even mixtures of those three.
The rest of the paper is organized as follows: in the next section we briefly review the Deep Galerkin model as it was introduced in the paper of Sirignano2018 and present our modifications to the algorithm. In Section 3 we explain in detail how to set up a PyDEns model for solving a particular PDE problem. In Section 4 we give a few recommendations regarding (i) the choice of architecture, (ii) sampling schemes and (iii) batch size. We conclude by discussing further development of our approach and explaining how our work relates to the oil & gas industry.
2 Original Deep Galerkin and our modifications
2.1 Original Deep Galerkin
Given an evolution PDE of the form
(1) $\frac{\partial u}{\partial t}(t, x) = \mathcal{L}u(t, x), \quad (t, x) \in [0, T] \times \Omega,$
(2) $u(0, x) = u_0(x), \quad x \in \Omega,$
(3) $u(t, x) = g(t, x), \quad (t, x) \in [0, T] \times \partial\Omega,$
Sirignano2018 seek to approximate the solution $u$ with a neural network $f(t, x; \theta)$. In doing so, they minimize the following objective:
(4) $J(\theta) = \left\| \tfrac{\partial f}{\partial t} - \mathcal{L}f \right\|^2_{[0,T] \times \Omega,\, \nu_1} + \left\| f - g \right\|^2_{[0,T] \times \partial\Omega,\, \nu_2} + \left\| f(0, \cdot) - u_0 \right\|^2_{\Omega,\, \nu_3}$
In other words, the neural network is fitted to satisfy all components (1)-(3) of the PDE: both the equation itself and the boundary/initial conditions. Importantly, each term of objective (4) is given by an integral over its respective domain w.r.t. some measure $\nu_i$. Naturally, these integrals are rarely tractable. Hence, Sirignano2018
propose to replace the integrals with their sample counterparts and use stochastic gradient descent. All in all, the optimization process looks as follows:
Algorithm 1. Fit $f$ to approximately solve the PDE: (i) sample batches of points from $[0, T] \times \Omega$, $[0, T] \times \partial\Omega$ and $\Omega$; (ii) estimate objective (4) on the sampled batches; (iii) make a gradient step w.r.t. $\theta$; repeat until convergence.
Note that this algorithm allows only for approximate binding of initial and boundary conditions, by means of incorporating additional penalty terms into objective (4).
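To make the sample-based objective concrete, the following self-contained NumPy sketch (not the PyDEns API) estimates objective (4) by Monte Carlo for a heat-equation instance whose exact solution is known; since the trial function solves the equation exactly, the sampled loss is close to zero:

```python
import numpy as np

# Monte-Carlo estimate of objective (4) for the heat equation
# u_t = u_xx on [0, 1] x [0, 1], with u(t, 0) = u(t, 1) = 0 and
# u(0, x) = sin(pi * x). The trial function f below is the exact
# solution, so the sampled loss should be close to zero.
rng = np.random.default_rng(0)

f = lambda t, x: np.exp(-np.pi**2 * t) * np.sin(np.pi * x)
f_t = lambda t, x: -np.pi**2 * np.exp(-np.pi**2 * t) * np.sin(np.pi * x)
f_xx = lambda t, x: -np.pi**2 * np.exp(-np.pi**2 * t) * np.sin(np.pi * x)
u0 = lambda x: np.sin(np.pi * x)

def sampled_loss(n=1024):
    # interior term: residual of the equation on random interior points
    t, x = rng.uniform(size=n), rng.uniform(size=n)
    interior = np.mean((f_t(t, x) - f_xx(t, x)) ** 2)
    # boundary term: points with x in {0, 1}
    tb = rng.uniform(size=n)
    xb = rng.integers(0, 2, size=n).astype(float)
    boundary = np.mean(f(tb, xb) ** 2)
    # initial term: points with t = 0
    xi = rng.uniform(size=n)
    initial = np.mean((f(0.0, xi) - u0(xi)) ** 2)
    return interior + boundary + initial

loss = sampled_loss()
```

Each of the three terms is exactly the sample counterpart of the corresponding integral in (4), taken w.r.t. the uniform measure.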
2.2 Problem description
With PyDEns one can solve almost any imaginable PDE. Still, the model focuses on evolution equations with time derivatives up to the second order:
(5) $F\left(u, \tfrac{\partial u}{\partial t}, \tfrac{\partial^2 u}{\partial t^2}, \tfrac{\partial u}{\partial x_i}, \tfrac{\partial^2 u}{\partial x_i \partial x_j}, t, x\right) = 0,$
where $F$ is an arbitrary function composed from well-known operations, including $\sin$, $\cos$, $\exp$ and $\log$, as well as the differential operators $\partial / \partial t$, $\partial / \partial x_i$. The domain of the problem is given by the $d$-dimensional rectangle $\Omega = [0, 1]^d$ together with the time segment: $(t, x) \in [0, T] \times \Omega$.
To form a well-posed equation of evolution, one also needs boundary conditions
(6) $u(t, x) = g(t, x), \quad (t, x) \in [0, T] \times \partial\Omega,$
and the initial state along with the evolution rate of the system (the latter is needed only if $\partial^2 u / \partial t^2$ is present in the equation):
(7) $u(0, x) = u_0(x),$
(8) $\frac{\partial u}{\partial t}(0, x) = u_1(x).$
2.3 Introducing an ansatz: binding boundary and initial conditions
When working on PDE problems posed on rectangular domains, the out-of-the-box approach of the PyDEns module uses an ansatz for exact binding of initial/boundary conditions (the original algorithm from Sirignano2018 can also be easily implemented):
(9) $\hat{f}(t, x) = \varphi(t, x) + \psi(t, x)\, f(t, x; \theta).$
In other words, the solution to the PDE is approximated by a transformation of the neural-network output rather than by the output itself. Form (9) ensures exact binding of the initial/boundary conditions (6)-(8) whenever the following holds: $\varphi$ itself satisfies conditions (6)-(8), while $\psi$ vanishes on $[0, T] \times \partial\Omega$ and satisfies $\psi(0, \cdot) = 0$ and, when (8) is imposed, $\frac{\partial \psi}{\partial t}(0, \cdot) = 0$.
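A quick numerical sanity check of the additive form of such an ansatz (the particular $\varphi$, $\psi$ and the network stand-in below are illustrative choices, not the PyDEns internals): for a 1D problem with zero boundary values and initial state $\sin(\pi x)$, the transformed output satisfies the conditions exactly, whatever the network returns.

```python
import numpy as np

# phi satisfies the conditions itself; psi vanishes wherever they are imposed
phi = lambda t, x: np.sin(np.pi * x)          # also plays the role of u0
psi = lambda t, x: t * x * (1.0 - x)          # zero at t = 0 and at x in {0, 1}
f = lambda t, x: np.tanh(3.0 * t + 2.0 * x)   # arbitrary stand-in for the network
f_hat = lambda t, x: phi(t, x) + psi(t, x) * f(t, x)

t = np.linspace(0.0, 1.0, 11)
x = np.linspace(0.0, 1.0, 11)
```

Here `f_hat(0, x)` equals the initial state and `f_hat(t, 0)`, `f_hat(t, 1)` vanish identically, so only the equation-residual term of the objective has to be optimized.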
In all, the modified algorithm of the PyDEns module looks as follows (Algorithm 2; compare this to Algorithm 1). Algorithm 2. Fit $\hat{f}$ to approximately solve the PDE: (i) sample a batch of points from the interior $[0, T] \times \Omega$; (ii) estimate the equation-residual term of objective (4) for the ansatz $\hat{f}$; (iii) make a gradient step w.r.t. $\theta$; repeat until convergence. Since (9) binds the conditions exactly, the boundary and initial terms of (4) are no longer needed.
3 Problem setup and configuration
3.1 BatchFlow and config dictionaries
In order to overcome many obstacles related to the training of neural networks, PyDEns relies on the BatchFlow framework. The main purpose of BatchFlow is to allow for the creation of reproducible pipelines for deep learning. Most importantly, BatchFlow allows one to set up a neural-network model for training in a simple way, using a configuration dictionary:
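The original listing is not reproduced here; schematically, such a configuration dictionary might look as follows (the key names follow the PyDEns/BatchFlow examples and should be treated as illustrative rather than an exact API reference):

```python
# A schematic BatchFlow-style model configuration. The 'pde' and 'body'
# sub-configs are filled in the following sections; 'loss' selects the
# training objective.
config = {
    'pde': {},       # problem description (see Section 3.2)
    'body': {},      # neural-network architecture (see Section 3.3)
    'loss': 'mse',   # mean squared error of the PDE residual
}
```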
The go-to approach of BatchFlow for creating a neural-network architecture from scratch is to use the universal block conv_block. The block uses a string layout for defining the network as a sequence of layers:
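A body sub-config of this kind might look as follows (the layout string is the one discussed in the text; the other key names and layer widths are illustrative):

```python
# 'f' stands for a fully connected layer, 'a' for an activation,
# 'R' opens a ResNet-like skip connection and '+' closes it with a sum.
body = {
    'layout': 'faR fa fa+ f',
    'units': [15, 25, 15, 1],   # layer widths (illustrative)
    'activation': 'tanh',       # hyperbolic-tangent activations
}
```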
In the code section above, the layout "faR fa fa+ f" stands for a fully connected network with 3 hidden layers, hyperbolic-tangent activations and one ResNet-like skip connection: the symbols R and + stand for the start and the end of the connection, respectively. In the same manner, PyDEns can be configured to solve a particular PDE problem via the pde key of the configuration dict. The next section explains in detail how to correctly set up the pde key.
3.2 Configuring the pde key
We start filling the pde key by specifying the dimensionality of the equation:
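Schematically (the key name is taken from the PyDEns examples and is an assumption here):

```python
# spatial dimensionality of the problem
pde = {'n_dims': 2}
```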
The next step is to define the equation itself using the key form. This key sets up the differential operator of equation (5) as a Python callable. To define the callable, we use a language of mathematical tokens. The list of tokens includes names like sin, cos, exp and the differentiation token D. The first step is to add the set of tokens to the current namespace via the add_tokens function:
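In PyDEns this is a single call, roughly `from pydens import add_tokens; add_tokens()`. Since the package itself is not assumed to be installed here, the following toy stand-in illustrates the idea of injecting named tokens into the caller's namespace:

```python
import inspect
import numpy as np

def add_tokens_demo():
    """Inject a few mathematical 'tokens' into the caller's namespace.

    A toy stand-in for PyDEns' add_tokens: the real tokens build a
    computational graph, while these are plain NumPy functions.
    """
    tokens = {'sin': np.sin, 'cos': np.cos, 'exp': np.exp}
    inspect.stack()[1].frame.f_globals.update(tokens)

add_tokens_demo()
print(sin(0.0))   # the token 'sin' is now available by name
```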
We can now go on with defining the equation. For purposes of demonstration, let us set up the equation
$\Delta u = 5 \sin(\pi (x + y)),$
known as the Poisson equation:
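With the tokens in scope, the form callable for this equation might be written as follows (D and sin are assumed to be the PyDEns tokens brought in by add_tokens; the lambda is only defined here, not called, so the fragment stands on its own):

```python
import numpy as np

# D(f, var) denotes differentiation of f w.r.t. var; chaining D
# produces higher-order derivatives, so D(D(u, x), x) is u_xx.
form = lambda u, x, y: D(D(u, x), x) + D(D(u, y), y) - 5 * sin(np.pi * (x + y))

pde = {'n_dims': 2, 'form': form}
```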
As one can see, the usage of tokens is rather straightforward. Note that the token D can be chained, allowing for the creation of higher-order derivatives. The default domain of the equation is the unit $d$-dimensional rectangle $[0, 1]^d$. To change it, one can pass the 'domain' key in the PDE-setup dictionary:
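For instance (the per-dimension-ranges value format is an assumption made for illustration; the 'domain' key name comes from the text):

```python
# an axis-aligned rectangle given by per-dimension ranges
pde = {'n_dims': 2, 'domain': [[-1, 1], [0, 2]]}
```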
To finish the configuration of the PDE problem, we must supply the boundary condition:
It is not difficult to define a more complex boundary condition:
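Neither original listing survives here; schematically, a constant condition and a coordinate-dependent one might be passed as follows (the key name and value conventions are assumptions based on the PyDEns examples):

```python
import numpy as np

# constant value on the whole boundary
pde = {'n_dims': 2, 'boundary_condition': 1}

# a more complex, coordinate-dependent boundary condition
pde = {'n_dims': 2,
       'boundary_condition': lambda x, y: np.sin(np.pi * (x + y))}
```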
3.3 Configuring the rest of the model
We go on with configuring our model and define a simple feedforward architecture with three hidden layers to solve the PDE at hand:
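Such a body sub-config might look as follows (three hidden layers as stated in the text; widths and key names are illustrative):

```python
# three fully connected hidden layers with tanh activations,
# followed by a linear output layer
body = {'layout': 'fa fa fa f',
        'units': [15, 25, 15, 1],
        'activation': 'tanh'}
```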
We can now assemble the model-configuration dictionary using the previously defined pde and body sub-configs and adding the MSE loss function:
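Schematically (sub-configs abbreviated; key names follow the earlier sketches and are illustrative):

```python
config = {'pde': {'n_dims': 2},              # as defined above (abbreviated)
          'body': {'layout': 'fa fa fa f'},  # as defined above (abbreviated)
          'loss': 'mse'}                     # mean squared error of the residual
```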
Finally, we need to specify the strategy of generating points from the domain. In this example we simply use the uniform distribution over the unit square. To learn more on how to create complex distributions, check out our sampler tutorial: https://github.com/analysiscenter/batchflow/blob/master/examples/tutorials/07_sampler.ipynb.
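In PyDEns this is done with a BatchFlow sampler (roughly `NumpySampler('uniform', dim=2)`); a plain NumPy equivalent of the uniform scheme looks as follows:

```python
import numpy as np

def uniform_sampler(batch_size, dim=2, rng=np.random.default_rng(42)):
    """Sample a batch of training points uniformly from the unit cube."""
    return rng.uniform(size=(batch_size, dim))

points = uniform_sampler(100)   # a batch of 100 points in the unit square
```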
All that is left now is to train the configured model in order to minimize the loss function. For that, we only need to wrap the model config in a model instance and run the fitting procedure:
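In PyDEns this last step is roughly `Solver(config).fit(batch_size=..., sampler=..., n_iters=...)`. Since that call depends on the installed package, here is a self-contained toy analogue of the fitting procedure: a tiny network trained by (numerical) gradient descent on a sampled objective, here for the ODE $u'(x) = 2x$ with $u(0) = 0$ (whose solution is $u = x^2$):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 16)                 # fixed batch of training points

def net(w, x):
    """Tiny 1-10-1 tanh network; all parameters packed into one vector."""
    w1, b1, w2, b2 = w[:10], w[10:20], w[20:30], w[30]
    return np.tanh(np.outer(x, w1) + b1) @ w2 + b2

def loss(w):
    """Sampled objective: ODE residual of u'(x) = 2x plus u(0)^2."""
    h = 1e-4
    du = (net(w, x + h) - net(w, x - h)) / (2 * h)   # finite-difference u'
    residual = np.mean((du - 2 * x) ** 2)
    initial = net(w, np.array([0.0]))[0] ** 2
    return residual + initial

def num_grad(w, eps=1e-5):
    """Numerical gradient of the loss, coordinate by coordinate."""
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (loss(w + e) - loss(w - e)) / (2 * eps)
    return g

w = rng.normal(scale=0.5, size=31)
initial_loss = loss(w)
lr = 0.05
for _ in range(200):
    step = lr * num_grad(w)
    # backtrack: accept the step only if it decreases the loss
    while loss(w - step) >= loss(w) and lr > 1e-8:
        lr *= 0.5
        step *= 0.5
    if loss(w - step) < loss(w):
        w = w - step
final_loss = loss(w)
```

The loop mirrors Algorithm 1 in miniature: estimate the objective on a batch of points, take a gradient step, repeat; PyDEns performs the same procedure with automatic differentiation instead of numerical derivatives.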
The change of the loss function (4) during training is illustrated in Figure 1. The fact that it approaches zero demonstrates that the neural network approximates the solution of the given PDE on the desired domain. From the graph of the approximation, shown in Figure 2, we can easily verify that the boundary conditions are satisfied.
In the next section we demonstrate how one can configure PyDEns to solve a more complex problem.
3.4 Heat equation
Let us turn our attention to the heat equation of the form
$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}.$
To communicate this problem to the model, we use the following syntax:
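The original listing is lost; a schematic form of the configuration (D is the PyDEns differentiation token; the initial condition and the key names are illustrative assumptions):

```python
import numpy as np

# D(u, t) is the time derivative; chaining D(D(u, x), x) yields the
# second spatial derivative, so the residual below encodes u_t = u_xx.
pde = {'n_dims': 1,
       'form': lambda u, t, x: D(u, t) - D(D(u, x), x),
       'initial_condition': lambda x: np.sin(np.pi * x)}
```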
Since the problem at hand is not as simple as the previous one, we need a more sophisticated neural network as the approximator:
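For instance, a deeper body with a residual connection, in the layout convention introduced earlier (widths and the exact layout string are illustrative):

```python
# a deeper body with one residual connection: 'R' opens the skip,
# '+' closes it with a summation
body = {'layout': 'fa faR fa fa+ fa f',
        'units': [25, 50, 50, 50, 25, 1],
        'activation': 'tanh'}
```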
The letter 'R' in the layout convention stands for the beginning of the residual connection, while the plus sign stands for its ending with a summation. The rest of the solving pipeline is pretty much the same: uniform sampling over the unit square and a batch size of 200 points are used. The results are demonstrated in Figures 3 and 4.
4 Choice of hyperparameters
In this section we discuss the choice of important hyperparameters, namely the model architecture, the number of points in every training batch, the use of different sampling schemes and so on.
4.1 Model architecture
In the previous examples we almost exclusively used fully connected layers and residual connections. For harder equations, especially those with fast-changing solutions, it is recommended to use gated connections, such as the ones introduced in hochreiter1997long, cho2014learning. The number of layers and their respective sizes depend mostly on the dimensionality of the PDE.
4.2 Batch size
The number of points in each training batch is of utmost importance. If it is not large enough, then the estimate of the true gradient is too inaccurate, which leads to poor performance. On the other hand, if there are enough points to cover a large part of the domain, then each training step does not account for any region-specific information, which also leads to low quality of the resulting model. The optimal middle ground can be found by looking at the graph of the loss function during training: large oscillations usually speak for too small a batch size, while an unsatisfactory loss value after plateauing can be viewed as a sign of too many points in the batch, provided that the neural network is expressive enough.
4.3 Sampling schemes
Another way of communicating region-specific information to the model is by carefully choosing the sampling strategy. Analogously to classical methods of solving PDEs, we can sample points near the provided boundary or initial condition during the first few iterations of model fitting. In the same manner, we can use different samplers throughout the training: each of them would concentrate on some sufficiently small region of the initial domain, so that the model is trained to work better in this exact part of $\Omega$. This translates to code in a straightforward manner:
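The original listing is lost; a plain NumPy sketch of such a sampling schedule (PyDEns itself would use BatchFlow samplers; the function names and the warm-up threshold here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

def near_initial(batch_size, t_max=0.1):
    """Concentrate on small t: points near the initial condition."""
    t = rng.uniform(0.0, t_max, size=batch_size)
    x = rng.uniform(0.0, 1.0, size=batch_size)
    return np.stack([t, x], axis=1)

def domain_wide(batch_size):
    """Uniform sampler over the whole unit square."""
    return rng.uniform(size=(batch_size, 2))

def schedule(iteration, batch_size=100, warmup=500):
    """Early iterations focus near t = 0; later ones cover the domain."""
    if iteration < warmup:
        return near_initial(batch_size)
    return domain_wide(batch_size)

early, late = schedule(0), schedule(1000)
```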
In applications, we usually want to know the solution only in a narrow location; this is often the case for hydrodynamic modelling in the oil & gas industry. Subsequent usage of
(i) an initial sampler near the boundary or initial conditions,
(ii) a domain-wide sampler, or a combination of region-specific samplers, to achieve a small value of the loss on the domain as a whole, and
(iii) a sampler concentrated around the region of interest
allows one to fine-tune the model to perform best at the desired location.
5 Conclusion
We have presented PyDEns, an open-source Python module based on the work of Sirignano2018. The framework allows one to train neural networks to approximately solve second-order evolution PDEs. In order to set up a PDE problem, one only has to (i) communicate the problem itself in a Python dictionary, (ii) set up a neural-network architecture using easy-to-comprehend layouts, and (iii) prepare a point-sampling scheme using an algebra of samplers, combining base NumPy distributions into mixtures and product distributions. In all, the framework allows for more convenient experimentation in the emerging domain of solving PDEs with neural networks.
In further work we plan to focus on (i) incorporating uncertainty into equation inputs (for instance, its coefficients) in the spirit of Aradi2018 and (ii) making the coefficients of the equation trainable. This is of special importance in relation to the oil & gas industry, as it sets a path to a drastically new approach for (i) predicting the evolution of oil & gas fields under uncertain geological properties and (ii) solving the filtration problem (and other problems of inverse modelling).