1 Introduction
Traditional methods for solving systems of differential equations imply numerical step-by-step integration of the system. For some problems, this integration leads to time-consuming algorithms because of the limitations on the time step that must be used to achieve the necessary accuracy of the solution. From this perspective, neural networks, as universal function approximators, can be applied to construct the solution with better performance.
In [1], a method to solve initial and boundary value problems using feedforward neural networks is proposed. The solution of the differential equation is written as a sum of two parts: the first part satisfies the initial/boundary conditions; the second part corresponds to a neural network output. The same technique is applied to the Stokes problem in [2, 3] and implemented in code in [4].
In [5], a neural network is trained to satisfy the differential operator, the initial condition, and the boundary conditions of a partial differential equation (PDE). The authors in [6] translate a PDE into a stochastic control problem and use deep reinforcement learning to approximate the derivative of the solution with respect to the space coordinate.
Other approaches rely on the implementation of a traditional step-by-step integration method in a neural network basis [7, 8]. In [8], the author proposes such an architecture: after fitting, the neural network produces an optimal finite difference scheme for the specific system. Backpropagation through an ordinary differential equation (ODE) solver is proposed in [9], where the authors construct a type of neural network that is analogous to a discretized differential equation. This group of methods requires a traditional numerical method to simulate the dynamics.

Polynomial neural networks are also widely presented in the literature [10, 11, 12]. In [10], a polynomial architecture that approximates differential equations is proposed. In [11], Legendre polynomials are chosen as the basis functions of the hidden neurons. In these articles, the polynomial architectures are used as black-box models, and the authors do not explain their connection to the theory of ODEs.

In all the described approaches, the neural networks are trained to account for the initial conditions of the differential equations. This means that the neural network has to be retrained each time the initial conditions change. The above-described techniques are applicable to differential equations of general form but are able to provide only a particular solution of the system.
In this article, we consider polynomial differential equations. Such nonlinear systems arise in different fields such as automated control, robotics, mechanical and biological systems, chemical reactions, drug development, and molecular dynamics. Moreover, it is often possible to transform a nonlinear equation into polynomial form with some level of accuracy.
For polynomial differential equations, it is possible to build a polynomial neural network that is based on the matrix Lie transform and approximates the general solution of the system of equations. Having a Lie transform–based neural network for such a system, the dynamics for different initial conditions can be estimated without refitting the network. Additionally, we completely avoid numerical ODE solvers in both simulation and data-driven system learning by describing the dynamics with maps instead of step-by-step integration.
2 Proposed Neural Network
The proposed architecture is a neural network representation of a Lie propagator for dynamical systems integration, introduced in [13] and commonly used in charged particle dynamics simulations [13, 14]. We consider dynamical systems that can be described by nonlinear ordinary differential equations,
(1) $\frac{d\mathbf{X}}{dt} = \sum_{k=0}^{\infty} P^{1k}(t)\,\mathbf{X}^{[k]},$

where $t$ is an independent variable, $\mathbf{X} \in \mathbb{R}^n$ is a state vector, and $\mathbf{X}^{[k]}$ means the $k$-th Kronecker power of the vector $\mathbf{X}$. There is an assumption that the right-hand side of (1) can be expanded in a Taylor series with respect to the components of $\mathbf{X}$.

The solution of (1) in its convergence region can be presented as the series [15, 16],

(2) $\mathbf{X}(t) = \sum_{k=0}^{\infty} M^{1k}(t)\,\mathbf{X}_0^{[k]},$

where $\mathbf{X}_0 = \mathbf{X}(t_0)$. In [14], it is shown how to calculate the matrices $M^{1k}$ by introducing new matrices $M^{jk}$. The main idea is replacing (1) by the equation

(3) $\frac{d}{dt}\,M^{1k}(t) = \sum_{j=1}^{k} P^{1j}(t)\,M^{jk}(t).$

This equation should be solved with the initial condition $M^{11}(t_0) = E$, $M^{1k}(t_0) = 0$ for $k \neq 1$, where $E$ is the identity matrix. Theoretical estimations of the accuracy and convergence of the truncated series in solving ODEs can be found in [17].

The transformation $\mathcal{M}: \mathbf{X}_0 \to \mathbf{X}$ can be considered as a discrete approximation of the evolution operator of (1) for initial time $t_0$ and time interval $\Delta t$. This means that the evolution of the state vector $\mathbf{X}_0$ during time $\Delta t$ can be approximately calculated as $\mathbf{X} = \mathcal{M}(\mathbf{X}_0)$. Hence, instead of solving the system of ODEs numerically, one can apply a calculated map and avoid step-by-step integration.
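The map-based propagation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the article's implementation: the weight matrices are assumed to be already computed (here we use an illustrative linear rotation map as a stand-in), and full, non-reduced Kronecker powers are used for simplicity.

```python
import numpy as np

def apply_map(x, weights):
    """Apply a truncated polynomial (matrix Lie) map:
    M(x) = W0 + W1 x + W2 x^[2] + ..., where x^[k] is the k-th Kronecker power."""
    result = weights[0].copy()          # W0: constant term, shape (n,)
    power = x.copy()                    # x^[1]
    for W in weights[1:]:
        result += W @ power
        power = np.kron(power, x)       # build the next Kronecker power
    return result

def propagate(x0, weights, steps):
    """Advance the state by repeatedly applying the fixed-interval map,
    instead of step-by-step numerical integration."""
    trajectory = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        trajectory.append(apply_map(trajectory[-1], weights))
    return np.array(trajectory)

# illustrative weights: a pure rotation (linear map, W0 = 0, W2 = 0)
theta = 0.1
weights = [np.zeros(2),
           np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]]),
           np.zeros((2, 4))]
trajectory = propagate(np.array([1.0, 0.0]), weights, steps=10)
```

Each call to `apply_map` replaces one whole integration interval $\Delta t$, which is where the performance gain over step-by-step solvers comes from.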
2.1 Neural Network Representation of Matrix Lie Transform
The proposed neural network implements the map $\mathcal{M}$ in the form

(4) $\mathcal{M}(\mathbf{X}) = W_0 + W_1\,\mathbf{X} + W_2\,\mathbf{X}^{[2]} + \ldots + W_m\,\mathbf{X}^{[m]},$

where the $W_i$ are weight matrices, and $\mathbf{X}^{[k]}$ means the $k$-th Kronecker power of the vector $\mathbf{X}$. For a given system of ODEs (1), one can compute the weight matrices in accordance with (3) up to the necessary order of nonlinearity.
Fig. 1 presents a neural network for map (4) up to the third order of nonlinearity for a two-dimensional state. In each layer, the input vector $\mathbf{X}$ is consecutively transformed into $\mathbf{X}^{[2]}$ and $\mathbf{X}^{[3]}$, after which a weighted sum is applied. The output $\mathbf{Y}$ equals the sum of the results from every layer. In the example, we reduce the Kronecker powers to decrease the dimension of the weight matrices (e.g., $\mathbf{X}^{[2]} = (x_1^2,\; x_1 x_2,\; x_2^2)$ instead of $(x_1^2,\; x_1 x_2,\; x_2 x_1,\; x_2^2)$).
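The reduction of Kronecker powers can be sketched as follows; the function name and the particular monomial ordering (that of `itertools.combinations_with_replacement`) are illustrative choices, not prescribed by the article.

```python
import numpy as np
from itertools import combinations_with_replacement

def reduced_kronecker_power(x, k):
    """Reduced k-th Kronecker power of x: one entry per distinct monomial.
    For x = (x1, x2) and k = 2 this gives (x1^2, x1*x2, x2^2) instead of
    the full Kronecker power (x1^2, x1*x2, x2*x1, x2^2)."""
    return np.array([np.prod(c) for c in combinations_with_replacement(x, k)])
```

For an $n$-dimensional state, the reduced $k$-th power has $\binom{n+k-1}{k}$ components instead of $n^k$, which shrinks the weight matrices accordingly.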
2.2 Fitting the Neural Network
If the differential equation is provided, training of the neural network is not necessary: the weights of the network can be calculated directly from the equation following relation (3). On the other hand, for data-driven system learning, the weights in form (4) can be fitted without any assumptions about the form of the differential equations.
To fit the proposed neural network, the training data is presented as a multivariate time series (Table 1) that describes the evolution of the state vector of the dynamical system in discrete time. In the general case, each step should be described by its own map $\mathcal{M}_i: \mathbf{X}(t_i) \to \mathbf{X}(t_{i+1})$, but if the system (1) is time independent, then the weights depend only on the time interval $\Delta t = t_{i+1} - t_i$.

Table 1: Training data as input–output pairs.

INPUT:  $\mathbf{X}(t_0)$, $\mathbf{X}(t_1)$, …, $\mathbf{X}(t_{n-1})$
OUTPUT: $\mathbf{X}(t_1)$, $\mathbf{X}(t_2)$, …, $\mathbf{X}(t_n)$
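For a time-independent system sampled with a constant step, these input–output pairs can be assembled directly from the recorded trajectory; a minimal sketch:

```python
import numpy as np

def make_training_pairs(series):
    """Split a uniformly sampled trajectory of shape (T, n) into
    input/output pairs (X(t_i), X(t_{i+1})) for fitting a fixed-step map."""
    series = np.asarray(series)
    return series[:-1], series[1:]
```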
3 Ordinary Differential Equations
In this section, we consider the Van der Pol oscillator. The equation is widely used in the physical sciences and engineering and can describe, for example, a pneumatic hammer, a steam engine, the periodic occurrence of epidemics, economic crises and depressions, and the heartbeat. The equation has well-studied dynamics and is widely used for testing numerical methods (e.g., [18]).
3.1 Simulation of the Van der Pol Oscillator
The Van der Pol oscillator is defined by a system of ODEs that can be presented in the form

(5) $\frac{dx}{dt} = y, \qquad \frac{dy}{dt} = \mu\,(1 - x^2)\,y - x.$

The results of numerical integration of the system with the eighth-order implicit Adams method with a fixed maximum time step are presented in Fig. 2 with red lines. Four different particular solutions with different initial conditions were calculated.
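A reference trajectory like the red curves in Fig. 2 can be reproduced with any standard integrator. The sketch below uses a classical fourth-order Runge–Kutta scheme as a stand-in for the eighth-order implicit Adams solver used in the article; the damping parameter, step size, and initial condition are illustrative assumptions.

```python
import numpy as np

def van_der_pol(state, mu=1.0):
    """Right-hand side of (5): x' = y, y' = mu*(1 - x^2)*y - x."""
    x, y = state
    return np.array([y, mu * (1.0 - x ** 2) * y - x])

def rk4(f, x0, dt, steps):
    """Classical 4th-order Runge-Kutta integration of x' = f(x)."""
    x = np.asarray(x0, dtype=float)
    out = [x]
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        out.append(x)
    return np.array(out)

# one illustrative particular solution approaching the limit cycle
reference = rk4(van_der_pol, [1.0, 0.0], dt=0.01, steps=2000)
```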
Another method for simulating the dynamics is the mapping approach. The weights of the matrix Lie map can be calculated up to the necessary order of nonlinearity based on equation (3). For instance, for the third order and the same time interval, this yields the weight matrices of the map, and the corresponding polynomial neural network implements the transformation (4) with these weights.
The results of the simulation of the system with the neural network are presented in Fig. 2 with blue dots. Note that for matrix Lie maps, the accuracy of the truncation of the series (the order of nonlinearity of the transformation) and the accuracy of the weights calculation should be considered separately. The theory of the accuracy and convergence of the truncated series (2) in solving ODEs can be found in [13, 17].
From a practical perspective, the accuracy of the simulation provided by a polynomial neural network can be estimated with respect to a traditional numerical solver. For example, the mean relative errors between the predictions of the Lie map–based networks of the third, fifth, and seventh orders of nonlinearity and the numerical solution calculated with the eighth-order Adams method are equal to …, …, and …, respectively.
3.2 Learning of the Van der Pol Oscillator
In the previous section, we described how the weights of the proposed polynomial neural network can be calculated from the equation. On the other hand, when the equation is not known but a particular solution is provided, the weights can be fitted without any assumptions about the form of the differential equations.
A particular solution of the system with a given initial condition can be generated by numerically integrating system (5) with a fixed time step over a given time interval. Having this training data set, the proposed neural network can be fitted with the mean squared error (MSE) as a loss function based on the $L_2$ norm,

$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left\lVert \mathcal{M}\left(\mathbf{X}(t_{i-1})\right) - \mathbf{X}(t_i) \right\rVert_2^2 .$
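Because the map (4) is linear in its weights, minimizing the MSE is a linear least-squares problem, and gradient-based fitting (as done in Keras below) converges to the same solution. A minimal NumPy sketch of this idea; the function names and the reduced-monomial feature construction are illustrative, not the article's implementation:

```python
import numpy as np
from itertools import combinations_with_replacement

def features(x, order=3):
    """Concatenated reduced Kronecker powers [1, x, x^[2], ..., x^[order]]."""
    feats = [1.0]
    for k in range(1, order + 1):
        feats.extend(np.prod(c) for c in combinations_with_replacement(x, k))
    return np.array(feats)

def fit_map(X_in, X_out, order=3):
    """Least-squares fit of the stacked weight matrices in (4):
    since the map is linear in the weights, MSE minimization is linear."""
    Phi = np.array([features(x, order) for x in X_in])
    W, *_ = np.linalg.lstsq(Phi, X_out, rcond=None)
    return W  # prediction is features(x) @ W

def predict(x, W, order=3):
    return features(x, order) @ W
```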
We implemented the above-described technique in Keras/TensorFlow and fitted a third-order Lie transform–based neural network with the Adamax optimizer.
The generalization property of the network can be investigated by examining predictions not only on the training data set but also for new initial conditions. Fig. 3 (a) shows the training set as a particular solution of the system. Fig. 3 (b) demonstrates predictions calculated starting both from the same initial condition and from new points. For the prediction starting from the training initial condition, the mean relative error of the predictions is …; for the new initial conditions, the mean error is ….
4 Partial Differential Equations
Burgers’ equation is a fundamental partial differential equation that occurs in various areas, such as fluid mechanics, nonlinear acoustics, gas dynamics, and traffic flow. This equation is also often used as a benchmark for numerical methods. For example, one of the problems proposed in the Airbus Quantum Computing Challenge [19] is building a neural network that solves Burgers’ equation with at least the same level of accuracy and higher computational performance than the traditional numerical methods. In [20], a feedforward neural network is trained to satisfy Burgers’ equation and certain initial conditions, but the computational performance of the approach is not estimated. In this section, we demonstrate how to build a Lie transform–based neural network that solves Burgers’ equation.
4.1 The Finite Difference Method for Burgers’ Equation
Burgers’ equation has the form

(6) $\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} = \nu\,\frac{\partial^2 u}{\partial x^2}.$

Following [19] for benchmarking, we use an analytic solution as a reference and the traditional numerical scheme

(7) $u_i^{n+1} = u_i^n - \frac{\Delta t}{\Delta x}\, u_i^n \left( u_i^n - u_{i-1}^n \right) + \nu\,\frac{\Delta t}{\Delta x^2} \left( u_{i+1}^n - 2 u_i^n + u_{i-1}^n \right),$

where $n$ stands for the time step, and $i$ stands for the grid node.
Equation (7) presents a finite difference method (FDM) that consists of an explicit Euler scheme for the time discretization, an upwind first-order scheme for the nonlinear term, and a centered second-order scheme for the diffusion term. For benchmarking, the time step is fixed and the spatial spacing is uniform. Thus, for the numerical solution for times up to 5, the method requires a mesh with 1000 steps on the space coordinate and 2000 time steps.
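Scheme (7) is straightforward to implement. The sketch below assumes periodic boundaries and a non-negative solution (required for the chosen upwind direction), which are simplifications relative to the benchmark setup in [19]; the grid and coefficients are illustrative.

```python
import numpy as np

def burgers_fdm_step(u, dt, dx, nu):
    """One step of scheme (7): explicit Euler in time, first-order upwind
    for the convective term (valid for u >= 0), centered second-order
    differences for the diffusion term. Periodic boundaries assumed."""
    conv = u * (u - np.roll(u, 1)) / dx
    diff = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx ** 2
    return u - dt * conv + nu * dt * diff

# illustrative setup: a smooth positive profile on a periodic grid
x = np.linspace(0.0, 1.0, 100, endpoint=False)
u = 0.5 + 0.25 * np.sin(2.0 * np.pi * x)
for _ in range(100):
    u = burgers_fdm_step(u, dt=1e-4, dx=0.01, nu=0.01)
```

Under the CFL and diffusion stability constraints, the scheme is monotone, so the extrema of the profile do not grow.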
It is indicated in [19] that the FDM introduces a dispersion error into the solution (see Fig. 4, a). Such an error can be reduced by increasing the mesh resolution, but then the time step must also be decreased to respect the stability constraints of the numerical scheme.
4.2 Lie Transform–Based Neural Network
Though there are Lie group methods that apply Lie theory directly to PDEs [21, 22], we utilize a different approach: we convert equation (6) to a system of ODEs and build a matrix Lie map for this new system in accordance with Section 2.
Assuming that the right-hand side of equation (6) can be approximated by a function $R(\mathbf{u})$ and considering the approximated equation as a hyperbolic one, it is possible to derive the system of ODEs

(8) $\frac{d\mathbf{x}}{dt} = \mathbf{u}, \qquad \frac{d\mathbf{u}}{dt} = R(\mathbf{u}) \approx \nu\,\frac{\partial^2 u}{\partial x^2},$

where $\mathbf{x} = (x_1, \ldots, x_n)$, $\mathbf{u} = (u_1, \ldots, u_n)$, and $\mathbf{u}$ is the vector of discrete stamps of $u$ on the space grid. This transformation from a PDE to ODEs is well known and can be derived using the method of characteristics or the direct method [23]. If the discretization is the same as in (7), then equation (8) leads to the system of 2000 ODEs

$\frac{dx_i}{dt} = u_i, \qquad \frac{du_i}{dt} = \nu\,\frac{u_{i+1} - 2 u_i + u_{i-1}}{\Delta x^2}, \qquad i = 1, \ldots, 1000,$

which can be easily expanded in a Taylor series with respect to $\mathbf{x}$ and $\mathbf{u}$ up to the necessary order of nonlinearity.
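The right-hand side of this enlarged ODE system can be sketched directly. A fixed uniform spacing and a periodic closure are simplifying assumptions of this illustration (in the method of characteristics the nodes $x_i$ move, so the spacing generally varies).

```python
import numpy as np

def burgers_ode_rhs(state, nu, dx):
    """Right-hand side of the discretized system (8):
    dx_i/dt = u_i,  du_i/dt = nu * (u_{i+1} - 2 u_i + u_{i-1}) / dx^2.
    The state vector stacks (x_1..x_n, u_1..u_n); periodic closure assumed."""
    x, u = np.split(state, 2)
    d2u = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx ** 2
    return np.concatenate([u, nu * d2u])
```

Since this right-hand side is polynomial in the state, the matrix Lie map of Section 2 can be built for it directly.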
Using this system of ODEs, we built a Lie transform–based neural network for a time interval that is five times larger than the time step used in the benchmarking (see Fig. 5). The numerical solution provided by the neural network is presented in Fig. 4 (b), and the accuracy and performance are compared in Table 2.
Table 2: Accuracy and performance comparison.

Method                             | Time step | Mesh size | Elapsed time | MSE for u
FDM                                | …         | …         | 0.055 sec    | …
Lie transform–based neural network | …         | …         | 0.016 sec    | …
The built polynomial neural network provides better accuracy with less computational time. If the FDM scheme were adjusted to a higher accuracy, its computational time would increase even more. Accuracy is calculated as the MSE between the numerical solution and the analytic solution at the final time.
5 Code
The implementation of the Lie transform–based neural network in Keras/TensorFlow and the algorithm for map building for autonomous systems are provided at the GitHub repository: https://github.com/andiva/DeepLieNet.
The notebook
https://github.com/andiva/DeepLieNet/tree/master/demo/VanderPol.ipynb
corresponds to Section 3 and covers the simulation, the definition of metrics for accuracy estimation, the neural network configuration, and the fitting. The notebook
https://github.com/andiva/DeepLieNet/tree/master/demo/Burgers.ipynb
reproduces the results presented in Section 4.
6 Conclusion
In this article, we demonstrate the solution of differential equations with polynomial neural networks that are based on matrix Lie maps. Since the weights of the proposed neural network can be calculated directly from the equations, the network does not require fitting with respect to the initial conditions. Built once, the neural network can be considered a model of the system and used for simulation with different initial conditions.
In the case of large time steps for map calculation, the proposed approach can significantly outperform traditional numerical methods. For Burgers’ equation, the computational performance is increased several times at the same level of accuracy. For some problems in charged particle dynamics simulation, the performance is increased a thousandfold with appropriate accuracy in comparison with traditional step-by-step integration [24, 25].
The proposed neural network can also be used for data-driven identification of systems. It may provide a high level of generalization when learning dynamical systems from data. As shown for the Van der Pol oscillator, learning the dynamics of the system from only a particular solution is possible. The neural network presented in Section 4 can additionally be fitted to satisfy the initial conditions. In this sense, the training will provide an optimal numerical approach with respect to certain initial conditions.
The limitations of the data-driven approach for large-scale systems, optimal network configuration, and the treatment of noisy data should be examined in further research.
References
 [1] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. Tech. rep. (1997), https://arxiv.org/pdf/physics/9705023.pdf, last accessed 2019/03/03.
 [2] Baymani, M., Kerayechian, A., Effati, S.: Artificial Neural Networks Approach for Solving Stokes Problem. Applied Mathematics, 1, 288–292 (2010).
 [3] Chiaramonte, M., Kiener, M.: Solving differential equations using neural networks (2013), http://cs229.stanford.edu/proj2013/ChiaramonteKienerSolvingDifferentialEquationsUsingNeuralNetworks.pdf, last accessed 2019/03/03.
 [4] Sharma, A.: NeuralNetDiffEq.jl: A Neural Network solver for ODEs, https://julialang.org/blog/2017/10/gsocNeuralNetDiffEq, last accessed 2019/03/03.

 [5] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics (2018).
 [6] Weinan, E., Han, J., Jentzen, A.: Deep learningbased numerical methods for highdimensional parabolic partial differential equations and backward stochastic differential equations. Tech. rep. (2017), https://arxiv.org/pdf/1706.04702.pdf, last accessed 2019/03/03.
 [7] Anastassi, A.: Constructing Runge–Kutta Methods with the Use of Artificial Neural Networks. Tech. rep. (2013), https://arxiv.org/pdf/1106.1194.pdf, last accessed 2019/03/03.
 [8] Wang, Y., Lin, C.: Runge–Kutta neural network for identification of dynamical systems in high accuracy. IEEE Transactions on Neural Networks, vol. 9, no. 2, 294–307 (1998).
 [9] Chen, R., Rubanova, Y., Bettencourt, J., Duvenaud, D.: Neural ordinary differential equations, https://arxiv.org/pdf/1806.07366.pdf, last accessed 2019/03/03.

 [10] Zjavka, L.: Differential polynomial neural network. Journal of Artificial Intelligence, 4 (1), 89–99 (2011).
 [11] Yang, Y., Hou, M., Luo, J.: A novel improved extreme learning machine algorithm in solving ordinary differential equations by Legendre neural network methods. Advances in Difference Equations, 469 (2018).
 [12] Schetinin, V.: Polynomial neural networks learnt to classify EEG signals. Tech. rep. (2005), https://arxiv.org/ftp/cs/papers/0504/0504058.pdf, last accessed 2019/03/03.
 [13] Dragt, A.: Lie methods for nonlinear dynamics with applications to accelerator physics (2011), http://inspirehep.net/record/955313/files/TOC28Nov2011.pdf, last accessed 2019/03/03.
 [14] Andrianov, S.: A role of symbolic computations in beam physics. Computer Algebra in Scientific Computing, Lecture Notes in Computer Science, 6244, 19–30 (2010).
 [15] Andrianov, S.: Symbolic Computation of Approximate Symmetries for Ordinary Differential Equations. Mathematics and Computers in Simulation, vol. 57, no. 3–5, 147–154 (2001).
 [16] Andrianov, S.: A matrix representation of the Lie transformation. In: Proceedings of the Abstracts of the International Congress on Computer Systems and Applied Mathematics, 14 (1993).
 [17] Andrianov, S.: The convergence and accuracy of the matrix formalism approximation. In: Proceedings of ICAP2012, Rostock, Germany, 93–95 (2012).
 [18] Pan, S., Duraisamy, K.: Long-time predictive modeling of nonlinear dynamical systems using neural networks. Hindawi Complexity, 4801012 (2018).
 [19] Airbus Quantum Computing Challenge, https://www.airbus.com/innovation/techchallengesandcompetitions/airbusquantumcomputingchallenge.html, last accessed 2019/03/03.
 [20] Hayati, M., Karami, B.: Feedforward neural network for solving partial differential equations. Journal of Applied Sciences 7(19), 2812–2817 (2007).
 [21] Casas, F.: Solution of linear partial differential equations by Lie algebraic methods. Journal of Computational and Applied Mathematics, 76, 159–170 (1996).
 [22] Oliveri, F.: Lie symmetries of differential equations: direct and inverse problems. Note di Matematica, 23, n. 2, 195–216 (2004/2005).
 [23] Evans, L.C.: Partial Differential Equations. Providence, R.I.: American Mathematical Society (2010).
 [24] Senichev, Y., Lehrach, A., Maier, R., Zyuzin, D., Berz, M., Makino, K., Andrianov, S., Ivanov, A.: Storage ring EDM simulation: methods and results. In: Proceedings of ICAP2012, Rostock, Germany, 99–103 (2012).
 [25] Senichev, Y., Ivanov, A., Lehrach, A., Maier, R., Zyuzin, D., Andrianov, S.: Spin tune parametric resonance investigation. In: Proceedings of the Particle Accelerator Conference, 3020–3022 (2014).