Modeling and control of complex dynamical systems require techniques that account for nonlinearities and uncertainties. On the face of it, artificial neural networks provide a suitable approach to learning dynamical systems. Their applications cover different problems, such as solving ordinary differential equations (ODEs) [1, 2], signal processing, feedback control systems, modeling and identification, and others. In [1], the neural network is trained to satisfy the differential operator together with the initial and boundary conditions of the differential equation. A backpropagation technique through an ODE solver has also been proposed, and a comparison of recent research on solving differential equations with neural networks can be found in [2].
In this article, we describe a neural architecture that differs from the techniques described above. Firstly, the proposed network does not have to be trained for simulation purposes: if the differential equations are provided, the weights of the network can be computed directly. Secondly, we completely avoid numerical ODE solvers in both simulation and data-driven system learning by describing the dynamics with maps instead of step-by-step numerical integration.
The proposed architecture is a neural network representation of a Lie propagator for dynamical systems integration, introduced in [9] and commonly used in charged particle dynamics simulation. We consider dynamical systems that can be described by nonlinear ordinary differential equations,
$$\frac{d\mathbf{X}}{dt} = F(t, \mathbf{X}), \qquad (1)$$

where $t$ is an independent variable, $\mathbf{X} \in \mathbb{R}^n$ is a state vector, and $\mathbf{X}^{[k]}$ means the $k$-th Kronecker power of the vector $\mathbf{X}$ (e.g., $\mathbf{X}^{[2]} = \mathbf{X} \otimes \mathbf{X}$). There is an assumption that the function $F$ can be expanded in a Taylor series with respect to the components of $\mathbf{X}$,

$$F(t, \mathbf{X}) = \sum_{k=0}^{\infty} P^{1k}(t)\, \mathbf{X}^{[k]}, \qquad (2)$$

where $P^{1k}(t)$ are time-dependent matrix coefficients.
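For concreteness, the Kronecker power used above can be computed with an iterated `numpy.kron`; the helper name `kron_power` is ours, a minimal sketch:

```python
import numpy as np

def kron_power(x, k):
    """k-th Kronecker power x^{[k]} of a vector x (illustrative helper)."""
    result = np.ones(1)          # x^{[0]} is the scalar 1
    for _ in range(k):
        result = np.kron(result, x)
    return result

x = np.array([2.0, 3.0])
# x^{[2]} enumerates all second-order monomials: x1*x1, x1*x2, x2*x1, x2*x2
print(kron_power(x, 2))  # [4. 6. 6. 9.]
```

Note that the full Kronecker power contains duplicate monomials (here $x_1 x_2$ appears twice); the reduced form used later keeps only the unique ones.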
2 Proposed Neural Network
2.1 Matrix Form of Lie Transform
The solution of (1) with initial condition $\mathbf{X}(t_0) = \mathbf{X}_0$ can be locally represented by a matrix Lie transform

$$\mathcal{M}: \mathbf{X}_0 \mapsto \mathbf{X}(t) = \sum_{k=0}^{\infty} R^{1k}(t)\, \mathbf{X}_0^{[k]}, \qquad (3)$$

where the matrices $R^{1k}$ satisfy a system of ODEs obtained by substituting (3) into (1). This system should be solved with initial condition $R^{11}(t_0) = I$ and $R^{1k}(t_0) = 0$ for $k \neq 1$, where $I$ is the identity matrix [9, 11].
This transformation can be considered as a discrete approximation of the evolution operator of (1) for initial time $t_0$ and time interval $\Delta t$. This means that the evolution of the state vector during $\Delta t$ can be approximately calculated by applying the map to the initial state. Hence, instead of solving the system of ODEs numerically, one can apply a precomputed map and avoid step-by-step integration.
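The idea can be illustrated in the linear case, where the map reduces to a single matrix exponential: the matrix is computed once, and propagation is then pure matrix–vector multiplication instead of numerical integration. A sketch with an illustrative harmonic oscillator:

```python
import numpy as np
from scipy.linalg import expm

# Harmonic oscillator dX/dt = A X (linear case, where the Lie map
# consists of the single weight matrix R = expm(A*dt)).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
dt = 0.1
R = expm(A * dt)          # map for one time step, computed once

X = np.array([1.0, 0.0])  # initial condition
for _ in range(100):      # propagate to t = 10 by map application only
    X = R @ X

# compare with the analytic solution (cos t, -sin t) at t = 10
print(X, np.array([np.cos(10.0), -np.sin(10.0)]))
```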
2.2 Neural Network Representation of Matrix Lie Transform
The proposed neural network implements the map in the form

$$\mathcal{M}: \mathbf{X} \mapsto \mathbf{Y} = W_0 + W_1 \mathbf{X} + W_2 \mathbf{X}^{[2]} + \ldots + W_m \mathbf{X}^{[m]}, \qquad (4)$$

where $W_k$ are weight matrices and $m$ is the order of nonlinearity at which the map is truncated.
Fig. 1 presents a neural network for map (4) up to the third order of nonlinearity for a two-dimensional state. In each layer, the input vector is successively transformed into its Kronecker powers, to which the weighted sums are applied. The output Y equals the sum of the results from every layer. In the example, we reduce the Kronecker powers to decrease the dimension of the weight matrices (e.g., keeping only one of the equal monomials $x_1 x_2$ and $x_2 x_1$).
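The forward pass of such a network can be sketched as follows; the full Kronecker powers are used here, without the dimension reduction, and the function names and random weights are illustrative:

```python
import numpy as np

def kron_power(x, k):
    """k-th Kronecker power x^{[k]} of a vector x."""
    r = np.ones(1)
    for _ in range(k):
        r = np.kron(r, x)
    return r

def lie_layer_output(x, weights):
    """Y = W0 + W1 x + W2 x^{[2]} + ...  (weights[k] multiplies x^{[k]})."""
    return sum(W @ kron_power(x, k) for k, W in enumerate(weights))

rng = np.random.default_rng(0)
n = 2
# W0: n x 1, W1: n x n, W2: n x n^2, W3: n x n^3 for a third-order map
weights = [rng.normal(size=(n, n ** k)) for k in range(4)]
y = lie_layer_output(np.array([0.5, -0.2]), weights)
print(y.shape)  # (2,)
```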
2.3 Fitting Neural Network
To fit the proposed neural network, the training data is presented as a multivariate time series (Table 1) that describes the evolution of the state vector of the dynamical system in discrete time. In the general case, each step should be described by its own map, but if the system (1) is time-independent, then the weights depend only on the time interval $\Delta t$.
In the article, we consider only autonomous systems of ODEs with a constant discretization step $\Delta t$. This allows using a Lie transform–based neural network with weight matrices shared across time. A time-dependent right-hand side in (1) can be handled by introducing a deeper network architecture.
Under these assumptions, the input of the neural network is the state at the current time step, and the output is the state at the next one. In all provided examples, the loss function is the mean squared error, and the Adamax algorithm is used for fitting.
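The fitting procedure can be sketched on a toy linear system; for brevity, plain gradient descent on the mean squared error stands in for Adamax, and the map is truncated to first order:

```python
import numpy as np

# Generate a toy time series X_{i+1} = R_true X_i and fit the first-order
# weight matrix W by minimizing the MSE (sketch; plain gradient descent
# stands in for the Adamax optimizer used in the article).
R_true = np.array([[0.99, 0.10], [-0.10, 0.99]])
X = [np.array([1.0, 0.0])]
for _ in range(50):
    X.append(R_true @ X[-1])
X = np.array(X)
inputs, targets = X[:-1], X[1:]   # pairs (X_i, X_{i+1})

W = np.eye(2)                     # first-order weight matrix to fit
lr = 0.1
for _ in range(500):
    pred = inputs @ W.T           # network prediction for every step
    grad = 2.0 * (pred - targets).T @ inputs / len(inputs)  # d(MSE)/dW
    W -= lr * grad

print(np.max(np.abs(W - R_true)))
```

After fitting, the recovered weight matrix closely matches the map that generated the data, which is the sense in which the network "learns" the dynamics.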
3 Simulation of Dynamical Systems
In this section, we describe how a matrix Lie map can replace numerical methods for solving well-known systems of ODEs. We consciously limit ourselves to a visual comparison of the approaches; examples of accuracy estimation can be found in [8].
3.1 Simple Models
To demonstrate the application of matrix Lie maps to dynamics simulation, we consider three simple dynamical systems. The Lotka–Volterra system is taken in the polynomial form $\dot{x} = x - xy$, $\dot{y} = -y + xy$, which can be derived from the classical form by a change of variables. These equations are commonly used to describe population dynamics in biological, social, and economic systems. The Van der Pol oscillator is defined as $\dot{x} = y$, $\dot{y} = \mu(1 - x^2)y - x$ and can be used to describe a pneumatic hammer, a steam engine, the periodic occurrence of epidemics, economic crises, depressions, and heartbeat. The Henon–Heiles model is an example of a system where dynamical chaos arises; its equations can be written as $\dot{q}_1 = p_1$, $\dot{q}_2 = p_2$, $\dot{p}_1 = -q_1 - 2 q_1 q_2$, $\dot{p}_2 = -q_2 - q_1^2 + q_2^2$. Chaos theory has applications in meteorology, physics, environmental science, computer science, engineering, and philosophy.
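The right-hand sides of these three systems, in their standard textbook forms, can be written as follows (the parameter value `mu` is an illustrative choice, not necessarily the one used in the article):

```python
import numpy as np

def lotka_volterra(state):
    # normalized predator-prey equations: dx/dt = x - xy, dy/dt = -y + xy
    x, y = state
    return np.array([x - x * y, -y + x * y])

def van_der_pol(state, mu=1.0):
    # dx/dt = y, dy/dt = mu*(1 - x^2)*y - x
    x, y = state
    return np.array([y, mu * (1.0 - x ** 2) * y - x])

def henon_heiles(state):
    # Hamiltonian system with cubic nonlinearities
    q1, q2, p1, p2 = state
    return np.array([p1, p2, -q1 - 2.0 * q1 * q2, -q2 - q1 ** 2 + q2 ** 2])

print(lotka_volterra(np.array([1.0, 1.0])))  # fixed point: [0. 0.]
```

All three right-hand sides are polynomial in the state, which is exactly the class of systems covered by expansion (1).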
Figs. 2 and 3 show the results of the simulation of these systems. Fig. 2 represents the phase space dynamics of the Lotka–Volterra system and the Van der Pol oscillator. Fig. 3 shows a Poincaré map of the Henon–Heiles system, calculated by integrating from the given initial state vectors. In both figures, the red lines and dots correspond to numerical integration with the fourth-order Runge–Kutta method, while the blue dots correspond to simulation by the matrix Lie transform of the third order of nonlinearity.
3.2 Biochemical Reaction Simulation
In this example, we demonstrate the results of the simulation of a biochemical system that is described in [12] and represents the influence of the Raf kinase inhibitor protein (RKIP). In that article, the influence of RKIP is investigated via numerical analysis of nonlinear ordinary differential equations using the MATLAB ode45 function, which is based on step-by-step integration. Instead of a step-by-step numerical integration method, one can build a polynomial neural network and utilize it for system simulation.
The system consists of 11 nonlinear differential equations that describe the biochemical network. We built a second-order Lie map for this system and used it for simulation with the initial conditions from the original article. The results of the simulation are shown in Fig. 4 and agree well with those presented there.
4 Learning Dynamical Systems from Data
In this section, we describe the examples of the application of Lie transform–based neural networks for data-driven identification of dynamical systems.
4.1 Epidemic Dynamics
For this example, we first generate data from the equations of the SIR epidemic model [13]. The model consists of three compartments: S for the number of susceptible, I for the number of infectious, and R for the number of recovered individuals.
We consider the system with fixed parameters on the time interval [0; 10]. For the data generation, we use the traditional fourth-order Runge–Kutta method with a constant time step. We define the training set as a particular solution of the system; two testing sets were generated as solutions starting from new initial conditions, one for test 1 and one for test 2. After the data is generated, we do not use the differential equations further.
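The data-generation step can be sketched as follows; the rate parameters `beta` and `gamma` and the initial condition are illustrative stand-ins, not the (unspecified) values above:

```python
import numpy as np

def sir_rhs(state, beta=0.5, gamma=0.2):
    # SIR right-hand side; beta and gamma are illustrative values
    s, i, r = state
    return np.array([-beta * s * i, beta * s * i - gamma * i, gamma * i])

def rk4_step(f, x, h):
    # one step of the classical fourth-order Runge-Kutta method
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

h, steps = 0.1, 100          # time interval [0, 10]
x = np.array([0.9, 0.1, 0.0])
trajectory = [x]
for _ in range(steps):
    x = rk4_step(sir_rhs, x, h)
    trajectory.append(x)
trajectory = np.array(trajectory)
print(trajectory.shape)      # (101, 3)
```

The generated time series plays the role of the training (or testing) solution; note that the total population S + I + R is conserved along the trajectory, a physical property the fitted map should also respect.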
To compare the proposed approach with traditional architectures, an LSTM and a Lie transform–based neural network were fitted with the training solution only. Then the predictions for testing initial conditions that were not presented during fitting are examined. The neural network configurations can be found via the link provided at the end of the article.
As shown in Fig. 5, the LSTM neural network simply memorized the training data: it tends to predict the same (training) solution regardless of the initial conditions, while the Lie transform–based neural network correctly predicts the dynamics for previously unseen initial conditions. Moreover, it preserves the physical properties of the system and recognizes the fixed point that corresponds to the absence of the epidemic.
4.2 iPad and iPhone Sales
In the article [14], the authors investigate the dynamics of iPhone and iPad sales with differential equations. They suggest analytic formulas for systems of nonlinear ODEs and fit parameters based on time-series data; this is the traditional approach to system identification. In contrast, using the technique described above, one can identify the dynamics with a Lie transform–based neural network without prior knowledge of suitable equations.
In this example, we generated data for iPad and iPhone sales from the plots presented in the article [14]. Then we fitted a fifth-order Lie transform–based network with this data. Note that in this case we did not use any specific assumption about the possible form of the equations, as was made in the original article. The order of nonlinearity was chosen based on experimentation.
After fitting the neural network, one can recover a Lie map whose state variables are the sales of iPhone and iPad, respectively, at a given time.
5 Practical Applications

This section is devoted to the practical applications of the developed technique. Initially, we developed the proposed method for the high-performance simulation of charged particle dynamics. In this article, we briefly mention the key concepts of applying Lie maps to the modeling of particle accelerators and storage rings. The second application concerns learning production dynamics from only 10 data points and providing a model interpretation; the example is taken from the cosmetic industry.
5.1 Charged Particle Accelerators
A charged particle accelerator consists of a number of physical control elements (e.g., quadrupoles, bending magnets, and others). The design of accelerators and the investigation of nonlinear dynamics require an accurate computer model of such a complicated system.
The particle dynamics in each physical control element can be described by a system of ODEs with a complex nonlinear form. For instance, the equation of particle motion depends on electromagnetic fields and has a 9-dimensional state vector for spin-orbit dynamics. For long-term dynamics investigation, traditional step-by-step numerical methods are not suitable because of performance limitations. Instead of solving the differential equations directly, one can estimate a matrix Lie map for each control element in an accelerator. By composing such maps sequentially, one obtains a deep polynomial neural network that represents the whole accelerator ring (see Fig. 7).
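The composition of per-element maps can be sketched in the linear (first-order) case, where each element's map is its transfer matrix and composition is a matrix product. The drift and thin-lens quadrupole below are standard textbook models, not the article's 9-dimensional spin-orbit description:

```python
import numpy as np

def drift(length):
    # transfer matrix of a field-free drift space in (x, x') coordinates
    return np.array([[1.0, length], [0.0, 1.0]])

def thin_quadrupole(f):
    # thin-lens focusing quadrupole with focal length f
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

elements = [drift(1.0), thin_quadrupole(2.0), drift(1.0)]
ring = np.eye(2)
for M in elements:
    ring = M @ ring          # compose maps in the order they are traversed

X = np.array([1e-3, 0.0])    # initial particle coordinates
for _ in range(1000):        # track 1000 turns by map application only
    X = ring @ X
print(X)
```

Long-term tracking then reduces to repeated application of one precomputed map per turn, which is where the performance advantage over step-by-step integration comes from.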
5.2 Bath Bombs and Bath Fizzies
Consider the production of a chemical product that involves mixing 11 source components, such as baking soda, olive oil, SLES, water, and others. A set of 10 data points (bath bombs) with component rates that lead to a stable product characteristic after one minute of mixing is known. Having this information, one needs to produce a new product (bath fizzies) in which the SLES component is set to zero (see Table 2).
The first challenge in this problem is the limited dataset of only 10 points. The second one is the extrapolation issue: one has to build a model from data that contain SLES and provide predictions where SLES is not present. This makes it almost impossible to use traditional machine learning methods.
At the same time, a Lie transform–based neural network allows building a model that provides reasonable results. To achieve this, one has to make two assumptions. Firstly, we consider production as a continuous process that can potentially be described by a system of ODEs. We also assume that these equations are time-independent, allowing us to consider a constant Lie map for each time step. Secondly, the initial components of bath bombs and bath fizzies are considered as different initial conditions of the same dynamical system.
Under these assumptions, one can represent the available data points as a dynamical process over the one-minute mixing with a discrete time step. The time series representation of the first data point is shown in Table 3, where the state vector consists of the initial component rates (baking soda, SLES, water, and so on) together with the product stability, and the Lie transform–based layers represent the dynamics. NA means not available and corresponds to hidden process states. Thus, for each of the 10 samples, the input of the neural network is the initial component rates, and the output is the stability of the product.
Because of our assumptions, we can model the production process by a Lie transform–based neural network that consists of 199 consecutive Lie maps with weight matrices shared across time (Table 3). During the fitting of the Lie map, the neural network recovers the dynamics and estimates the data points that are marked as NA. After the neural network is fitted with bath bombs, one can use it to predict the dynamics of bath fizzies. The optimal water rate for the zero-SLES product can be found by varying the water input variable so that the output stability variable is preserved at the final mixing time.
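The procedure of applying one shared map 199 times and then varying the water input so that the final stability is preserved can be sketched on a toy two-component model; the map, its entries, and the target are illustrative, not the fitted production model:

```python
import numpy as np

# Shared per-step map applied 199 times; the first component stands in
# for "water" and the second for "stability" (toy 2x2 illustration).
R = np.array([[0.995, 0.002], [0.001, 0.996]])

def final_state(water, stability0=0.5, steps=199):
    x = np.array([water, stability0])
    for _ in range(steps):
        x = R @ x                # shared weights across all time steps
    return x

target = final_state(1.0)[1]     # stability reached with water rate 1.0

lo, hi = 0.0, 2.0                # bisection over the water input
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if final_state(mid)[1] < target:
        lo = mid
    else:
        hi = mid
print(0.5 * (lo + hi))           # recovers the water rate giving the target
```

Because the final stability is monotone in the water input for this toy map, a simple bisection suffices; the same search-over-inputs idea applies to the fitted network.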
The predictions provided by traditional machine learning methods are presented in Table 4. Note that the results provided by the traditional methods are expected by their design but are physically incorrect. Linear regression provides nonphysical negative rates. The decision tree predicts the same values as in the training data. Support vector regression provides an almost constant value close to the mean value of water in the training set. Only treating the data points as initial conditions of a dynamical process provides a physically explainable growth of the necessary water in the absence of SLES. This result was also confirmed by chemical engineers.
Table 4 (fragment): Lie Map NN — 0.64, 0.84, 0.36.
There is also a possibility to translate the found weight matrices of the neural network into equations. To implement this, one has to find formulas for the system of ODEs that yield the found weights after applying the algorithm described in Sect. 2.1. For instance, we parameterized the right-hand side of the ODEs up to the second order of nonlinearity and derived the following system,
which consists of polynomial right-hand sides with 30 parameters. This system of ODEs is approximately equivalent to the fitted neural network, so it can be used as an interpretation of the network. For instance, one can state that baking soda acts merely as a parameter, and that water decreases during mixing at a velocity proportional to the other components, while the stability rate has more complex dynamics and depends on more components.
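In the linear case, this translation from fitted weights back to equations has a closed form: the matrix logarithm of the first-order weight matrix, divided by the time step, recovers the right-hand-side matrix. A sketch with an illustrative system:

```python
import numpy as np
from scipy.linalg import expm, logm

# If the first-order weight matrix is W1 = expm(A*dt), the matrix
# logarithm recovers the right-hand-side matrix A of dX/dt = A X.
A_true = np.array([[0.0, 1.0], [-2.0, -0.3]])  # illustrative system
dt = 0.05
W1 = expm(A_true * dt)        # what fitting would ideally produce

A_recovered = logm(W1).real / dt
print(np.max(np.abs(A_recovered - A_true)))
```

For the nonlinear case above, the same idea requires matching all weight matrices against a parameterized polynomial right-hand side, which is what yields the 30-parameter system.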
The data points for bath bombs are presented in dimensionless form, and the resulting system of ODEs is not provided, due to the data protection policy.
6 Conclusion

In the article, we demonstrate a general concept for building a neural network representation of dynamical systems. Although we considered only ODEs with polynomial right-hand sides, such nonlinear systems are widely used in different fields: automated control, robotics, mechanical systems, biology, chemical reactions, drug development, and molecular dynamics.
The greatest advantage of the proposed Lie transform–based neural network is its equivalence to differential equations. The weight matrices of the network correspond to certain orders of nonlinearity in the real system and have a physical explanation.
Since the proposed neural architecture agrees well with traditional modeling methods, it can be useful for the investigation of dynamics in unknown regions of parameter space. The promising properties of the proposed technique are the ability to learn dynamics from small training data sets and to interpret the data-driven model by translating it into a system of ODEs.
The questions of noisy data, truncation of the matrix Lie transform, and accuracy and convergence for larger systems are not discussed in the article. We also do not consider the optimal selection of loss functions and optimization methods for training. These questions should be investigated in further research.
The authors gratefully thank Prof. Dr. Yurij Senichev for his support in the development of Lie transform–based mapping methods for accelerator physics. Also, we appreciate the efforts of Roman Konoplev-Esgenburg for the explanation of the production process of bath bombs and for providing data.
-  Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. Tech. rep. (1997), https://arxiv.org/pdf/physics/9705023.pdf, last accessed 2019/03/03.
-  Mall, S., Chakraverty, S.: Comparison of artificial neural network architecture in solving ordinary differential equations. Advances in Artif. Neural Syst. (2013).
-  Lapedes, A., Farber, R.: Nonlinear signal processing using neural networks: Prediction and system modelling. In: Langley, P. (ed.) IEEE International Conference on Neural Networks, San Diego, CA, USA (1987).
-  Lewis, F.L., Ge, S.S.: Neural networks in feedback control systems. Mechanical Engineers’ Handbook: Instrumentation, Systems, Controls, and MEMS, vol. 2. John Wiley and Sons, Hoboken, NJ, USA, third edition (2005).
-  Chen, S., Billings, S.A.: Neural networks for nonlinear dynamic system modelling and identification. International Journal of Control 56(2), 319–346 (1992).
-  Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics (2018).
-  Chen, R., Rubanova, Y., Bettencourt, J., Duvenaud, D.: Neural ordinary differential equations, https://arxiv.org/pdf/1806.07366.pdf, last accessed 2019/03/03.
-  Ivanov, A., Andrianov, S.: Matrix Lie maps and polynomial neural networks for solving differential equations. Submitted to this Proceedings (2019).
-  Dragt, A.: Lie methods for nonlinear dynamics with applications to accelerator physics (2011), inspirehep.net/record/955313/files/TOC28Nov2011.pdf, last accessed 2019/03/03.
-  Andrianov, S.: A role of symbolic computations in beam physics. Computer Algebra in Sc. Comp., Lecture Notes in Computer Science, 6244, 19–30 (2010).
-  Andrianov, S.: The convergence and accuracy of the matrix formalism approximation. In: Proceedings of ICAP2012, Rostock, Germany, 93–95 (2012).
-  Gilbert, D., Heiner, M.: From Petri nets to differential equations - an integrative approach for biochemical network analysis. Lecture Notes in Computer Science, 181–200 (2006).
-  Martcheva, M.: An introduction to mathematical epidemiology, https://www.springer.com/gb/book/9781489976116, last accessed 2019/03/03.
-  Guidolin, M., Guseo, R.: On product cannibalization: a new Lotka-Volterra model for asymmetric competition in the ICTs. Tech. rep. (2016), http://paduaresearch.cab.unipd.it/9738/1/GUIDOLINGUSEO_SETT2016.pdf, last accessed 2019/03/03.
-  Senichev, Y., Lehrach, A., Maier, R., Zyuzin, D., Berz, M., Makino, K., Andrianov, S., Ivanov, A.: Storage ring EDM simulation: methods and results. In: Proceedings of ICAP2012, Rostock, Germany, 99–103 (2012).
-  Ivanov, A., Andrianov, S., Senichev, Y.: Simulation of Spin-orbit Dynamics in Storage Rings. Journal of Physics: Conference Series, Volume 747, N. 1 (2016).