Lie Transform Based Polynomial Neural Networks for Dynamical Systems Simulation and Identification

02/05/2018, by Andrei Ivanov, et al.

In the article, we discuss the architecture of a polynomial neural network that corresponds to the matrix representation of the Lie transform. The matrix form of the Lie transform is an approximation of the general solution of a nonlinear system of ordinary differential equations. Thus, it can be used for simulation and modeling tasks. On the other hand, one can identify a dynamical system from time series data simply by optimizing the coefficient matrices of the Lie transform. Representing the approach by polynomial neural networks integrates the strengths of both neural networks and traditional model-based methods for the investigation of dynamical systems. We provide a theoretical explanation of learning dynamical systems from time series for the proposed method, as well as demonstrate it in several applications. Namely, we show results of modeling and identification both for well-known systems like the Lotka–Volterra equations and for more complicated examples from retail, biochemistry, and accelerator physics.


1 Introduction

Modeling and control of complex dynamical systems require techniques that account for nonlinearities and uncertainties. On the face of it, artificial neural networks provide a suitable approach for learning dynamical systems. The applications cover different problems, such as solving ordinary differential equations (ODEs) [1, 2], signal processing [3], feedback control systems [4], modeling and identification [5], and others. In the article [6], a neural network is trained to satisfy the differential operator, initial conditions, and boundary conditions of a partial differential equation. A backpropagation technique through an ODE solver is proposed in [7]. A comparison of recent research on solving differential equations with neural networks can be found in [8].

In this article, we describe a neural architecture that differs from the techniques described above. First, the proposed network does not have to be trained for simulation purposes: if the differential equations are provided, the weights of the network can be computed directly. Second, we completely avoid numerical ODE solvers in both simulation and data-driven system learning by describing the dynamics with maps instead of step-by-step numerical integration.

The proposed architecture is a neural network representation of a Lie propagator for dynamical systems integration that is introduced in [9] and is commonly used in charged particle dynamics simulation. We consider dynamical systems that can be described by nonlinear ordinary differential equations,

dX/dt = F(t, X) = Σ_k P_k(t) X^[k],    (1)

where t is an independent variable, X is a state vector, and X^[k] means the k-th Kronecker power of the vector X. There is an assumption that the function F can be expanded in a Taylor series with respect to the components of X.
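Since the Kronecker power of the state vector is used throughout the article, here is a minimal sketch of how it can be computed (the helper name `kron_power` is ours, not taken from the paper's code):

```python
import numpy as np

def kron_power(x, k):
    """k-th Kronecker power of a vector: x^[0] = 1, x^[k] = x (x) x^[k-1]."""
    x = np.asarray(x, dtype=float)
    out = np.array([1.0])  # the zeroth power is the scalar 1
    for _ in range(k):
        out = np.kron(x, out)
    return out

# for a 2-D state, the second power lists all pairwise products
print(kron_power([1.0, 2.0], 2))  # [1. 2. 2. 4.]
```

The k-th power of an n-dimensional state has n^k entries, which is why the article later introduces reduced Kronecker powers.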

2 Proposed Neural Network

2.1 Matrix Form of Lie Transform

The solution of (1) in its convergence region can be presented as the series [9, 11],

X(t) = M ∘ X_0 = Σ_k M_k(t) X_0^[k],    (2)

where X_0 = X(t_0) is the initial state. In [10], it is shown how to calculate the matrices M_k by introducing auxiliary matrices. The main idea is replacing (1) by the matrix equation

dM/dt = P(t) M.    (3)

This equation should be solved with the initial condition M(t_0) = I, where I is the identity matrix. Theoretical estimations of the accuracy and convergence of the truncated series in solving ODEs can be found in [9, 11].

The transformation M can be considered as a discrete approximation of the evolution operator of (1) for initial time t_0 and time interval Δt. This means that the evolution of the state vector during the time Δt can be approximately calculated as X(t_0 + Δt) = M ∘ X(t_0). Hence, instead of solving the system of ODEs numerically, one can apply a calculated map and avoid step-by-step integration.
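As an illustration of map-based propagation, a truncated map of the form (2) can be applied repeatedly to advance the state without an ODE solver. The weight matrices below are a hypothetical toy example, not computed from a particular system:

```python
import numpy as np

def kron_power(x, k):
    out = np.array([1.0])
    for _ in range(k):
        out = np.kron(np.asarray(x, float), out)
    return out

def apply_map(weights, x):
    """One step of a truncated matrix Lie map: X -> sum_k W_k X^[k]."""
    return sum(W @ kron_power(x, k) for k, W in enumerate(weights))

def propagate(weights, x0, n_steps):
    """Advance the state n_steps times by reapplying the same map."""
    x = np.asarray(x0, float)
    traj = [x]
    for _ in range(n_steps):
        x = apply_map(weights, x)
        traj.append(x)
    return np.array(traj)

# toy 1-D example: the linear map x -> 2x applied three times
weights = [np.zeros((1, 1)), np.array([[2.0]])]
print(propagate(weights, [1.0], 3)[-1])  # [8.]
```

Because the whole interval is covered by a single map application, long-term propagation reduces to repeated matrix-vector products.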

2.2 Neural Network Representation of Matrix Lie Transform

The proposed neural network implements the map M: X_0 → X in the form

X = W_0 + W_1 X_0 + W_2 X_0^[2] + … + W_m X_0^[m],    (4)

where the W_i are weight matrices, and X_0^[k] means the k-th Kronecker power of the vector X_0. For a given system of ODEs (1), one can compute the matrices W_i in accordance with (3) up to the necessary order of nonlinearity.

Fig. 1 presents a neural network for map (4) up to the third order of nonlinearity for a two-dimensional state. In each layer, the input vector is consecutively transformed into its Kronecker powers, to which the weighted sums are applied. The output Y equals the sum of the results from every layer. In the example, we use reduced Kronecker powers to decrease the dimension of the weight matrices (e.g., for X = (x_1, x_2), the reduced second power is (x_1^2, x_1 x_2, x_2^2)).

Figure 1: Neural network representation of third order matrix Lie map.
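The reduced Kronecker power mentioned above keeps a single copy of each monomial. A minimal sketch (the function name is ours):

```python
from itertools import combinations_with_replacement
import numpy as np

def reduced_kron_power(x, k):
    """Reduced k-th Kronecker power: one entry per distinct monomial,
    e.g. (x1, x2)^[2] -> (x1*x1, x1*x2, x2*x2) instead of four entries."""
    return np.array([np.prod(c) for c in combinations_with_replacement(x, k)])

print(reduced_kron_power([2.0, 3.0], 2))  # [4. 6. 9.]
```

For an n-dimensional state this reduces the k-th power from n^k entries to "n multichoose k" entries, shrinking the weight matrices accordingly.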

2.3 Fitting Neural Network

To fit the proposed neural network, the training data is presented as a multivariate time series (table 1) that describes the evolution of the state vector of the dynamical system in discrete time. In the general case, each step from t_i to t_{i+1} should be described by its own map, but if the system (1) is time-independent, then the weights depend only on the time interval Δt = t_{i+1} − t_i.

In the article, we consider only autonomous systems of ODEs with a constant discretization step Δt. This allows using a Lie transform–based neural network with weight matrices shared across time. A time-dependent right-hand side in (1) can be accommodated by introducing a deeper network architecture.

INPUT:  X(t_0), X(t_1), …, X(t_{n−1})
OUTPUT: X(t_1), X(t_2), …, X(t_n)
Table 1: Discrete states of a dynamical system for training the proposed network.

Under these assumptions, the input of the neural network is the state X(t_i), and the output is the next state X(t_{i+1}). In all provided examples, the loss function is the mean squared error, and the Adamax algorithm is used for fitting.
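The article fits the weights with Adamax. As a self-contained illustration of the same input/output pairing, the shared map can also be estimated in closed form by least squares over consecutive state pairs (all names below are ours, and the closed-form fit is a stand-in for the gradient-based training used in the paper):

```python
import numpy as np

def poly_features(x, order):
    """Concatenated Kronecker powers [x^[0], x^[1], ..., x^[order]]."""
    feats, p = [np.array([1.0])], np.array([1.0])
    for _ in range(order):
        p = np.kron(p, np.asarray(x, float))
        feats.append(p)
    return np.concatenate(feats)

def fit_map(series, order):
    """Least-squares fit of a map X(t_{i+1}) ~ poly_features(X(t_i)) @ W."""
    X = np.array([poly_features(x, order) for x in series[:-1]])
    Y = np.asarray(series[1:], float)
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

# toy series generated by the linear map x -> 0.9 x
series = (0.9 ** np.arange(6)).reshape(-1, 1)
W = fit_map(series, order=1)
print(round(W[1, 0], 6))  # 0.9
```

The same pairing of consecutive states is exactly what the table above encodes; only the optimizer differs.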


3 Simulation of Dynamical Systems

In this section, we describe how a matrix Lie map can replace numerical methods for solving well-known systems of ODEs. We consciously limit ourselves to a visual comparison of the approaches; examples of accuracy estimation can be found in [8].

3.1 Simple Models

To demonstrate the application of matrix Lie maps for dynamics simulation, we consider three simple dynamical systems. The Lotka–Volterra system is taken in the dimensionless form dx/dt = x − xy, dy/dt = −y + xy, which can be derived from the classical form by a change of variables. These equations are commonly used for the description of population dynamics in biological, social, and economic systems. The Van der Pol oscillator is defined as dx/dt = y, dy/dt = μ(1 − x^2)y − x and can be used for the description of a pneumatic hammer, a steam engine, the periodic occurrence of epidemics, economic crises, depressions, and heartbeat. The Henon–Heiles model is an example of a system in which dynamical chaos arises. It is described by the equations dq_1/dt = p_1, dq_2/dt = p_2, dp_1/dt = −q_1 − 2 q_1 q_2, dp_2/dt = −q_2 − q_1^2 + q_2^2. Chaos theory has applications in meteorology, physics, environmental science, computer science, engineering, and philosophy.
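For reference, the baseline used in the comparison can be sketched as a classical fourth-order Runge–Kutta step applied to the dimensionless Lotka–Volterra right-hand side (the parameter-free form is assumed here):

```python
import numpy as np

def rk4_step(f, x, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def lotka_volterra(x):
    """Dimensionless Lotka-Volterra: x' = x - x*y, y' = -y + x*y."""
    return np.array([x[0] - x[0] * x[1], -x[1] + x[0] * x[1]])

# (1, 1) is a fixed point of the system, so the integrator should stay there
print(rk4_step(lotka_volterra, np.array([1.0, 1.0]), 0.1))  # [1. 1.]
```

The matrix Lie map replaces this inner loop with a single polynomial transformation per time step.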

Figure 2: Simulation of dynamics in phase space for the Lotka–Volterra equation (left) and the Van der Pol oscillator (right). The red lines correspond to the traditional Runge–Kutta method. The blue dots are for simulation by the matrix Lie map.

In Figs. 2 and 3, the results of the simulation of these systems are shown. Fig. 2 represents the phase space dynamics (in (x, y) coordinates) of the Lotka–Volterra system and the Van der Pol oscillator. Fig. 3 shows a Poincaré map for the Henon–Heiles system, calculated by integrating a set of initial state vectors. In both figures, the red lines and dots correspond to numerical integration with the Runge–Kutta method of fourth order. The blue dots are for simulation by the matrix Lie transform of the third order of nonlinearity.

Figure 3: Simulation of dynamical chaos by a traditional numerical method (left) and a matrix Lie transform (right).

3.2 Biochemical Reaction Simulation

In this example, we demonstrate the results of the simulation of a biochemical system that is described in [12] and represents the influence of the Raf kinase inhibitor protein (RKIP). In that article, the influence of RKIP is investigated via numerical analysis of nonlinear ordinary differential equations using the MATLAB ode45 function, which is based on step-by-step integration. Instead of a step-by-step numerical integration method, one can build a polynomial neural network and utilize it for system simulation.

The system consists of 11 nonlinear differential equations that describe the biochemical network. We built a second-order Lie map for this system and used it for simulation with the initial conditions from [12]. The results of the simulation are shown in Fig. 4 and coincide well with the ones presented in [12].

Figure 4: Simulation of biochemistry reaction with Lie transform–based neural network.

4 Learning Dynamical Systems from Data

In this section, we describe the examples of the application of Lie transform–based neural networks for data-driven identification of dynamical systems.

4.1 Epidemic Dynamics

For this example, we first generate data from the equations of the SIR epidemic model [13]. The model consists of three compartments: S for the number of susceptible, I for the number of infectious, and R for the number of recovered individuals:

dS/dt = −β S I,   dI/dt = β S I − γ I,   dR/dt = γ I.    (5)

We consider the system on the time interval [0; 10] for fixed parameters β and γ. For the data generation, we use the traditional Runge–Kutta method of fourth order with a constant time step. We define the training set as a particular solution of the system for one initial condition. Two testing sets were generated from the system as the solutions starting from two new initial conditions (test 1 and test 2). After the data is generated, we do not use the differential equations further.
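The data-generation step can be sketched as follows; β, γ, the initial condition, and the step size below are illustrative placeholders, since the exact values were not preserved in the text:

```python
import numpy as np

def sir_rhs(x, beta=0.4, gamma=0.1):
    """SIR right-hand side: S' = -b*S*I, I' = b*S*I - g*I, R' = g*I."""
    s, i, r = x
    return np.array([-beta * s * i, beta * s * i - gamma * i, gamma * i])

def rk4_step(f, x, dt):
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def generate_series(x0, t_end=10.0, dt=0.01):
    """Trajectory on [0, t_end]; consecutive rows form (input, output) pairs."""
    xs = [np.asarray(x0, float)]
    for _ in range(int(t_end / dt)):
        xs.append(rk4_step(sir_rhs, xs[-1], dt))
    return np.array(xs)

series = generate_series([0.99, 0.01, 0.0])
# S + I + R is a linear invariant of (5) and is preserved by the integrator
print(round(series[-1].sum(), 6))  # 1.0
```

Consecutive rows of `series` are exactly the input/output pairs of table 1, which is all the network sees during fitting.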


Figure 5: Memorization of training data by LSTM neural network (left) and generalization of dynamics by Lie transform–based neural network (right). Plots contain the number of infectious (I) generated by equations (green) and predictive models (red).

To compare the proposed approach with traditional architectures, the LSTM and Lie transform–based neural networks were fitted only with the training solution. Then the predictions for testing initial conditions that were not presented during fitting were examined. The neural network configurations can be found via the link provided at the end of the article.

As shown in Fig. 5, the LSTM neural network simply memorized the training data. It tends to predict the same (training) solution regardless of initial conditions, while the Lie transform–based neural network is able to correctly predict the dynamics for previously unseen initial conditions. Moreover, it preserves the physical properties of the system and recognizes the fixed point that corresponds to the absence of an epidemic.

4.2 iPad and iPhone Sales

In the article [14], the authors investigate the dynamics of iPhone and iPad sales with differential equations. They suggest analytic formulas for systems of nonlinear ODEs and fit the parameters based on time-series data. This is a traditional approach to system identification. On the other hand, using the technique described above, one can identify the dynamics with a Lie transform–based neural network without knowledge of the appropriate equations.

Figure 6: Identification of iPhone and iPad sales with Lie transform–based network of the fifth order of nonlinearity (dots for training data, lines for prediction).

In this example, we extracted data for iPad and iPhone sales from the plots presented in the article [14]. Then we fitted a fifth-order Lie transform–based network with this data. Note that in this case, we did not use any specific assumptions about the possible form of the equations, as was done in the original article. The order of nonlinearity was chosen experimentally.

After fitting the neural network, one can extract a Lie map in the form of (4) that propagates the pair (p_t, q_t) one time step forward, where p_t and q_t are the sales of the iPhone and iPad, respectively, at time t.

5 Applications

This section is devoted to practical applications of the developed technique. We initially developed the proposed method for high-performance simulation of charged particle dynamics. In this article, we briefly mention the key concepts of applying Lie maps to the modeling of particle accelerators and storage rings. The second application concerns learning a production dynamics from only 10 data points and providing a model interpretation; the example is taken from the cosmetic industry.

5.1 Charged Particle Accelerators

A charged particle accelerator consists of a large number of physical elements (e.g., quadrupoles, bending magnets, and others). The design of accelerators and the investigation of nonlinear dynamics require an accurate computer model of such a complicated system.

The particle dynamics in each physical control element can be described by a system of ODEs with a complex nonlinear form. For instance, the equation of particle motion depends on electromagnetic fields and has a 9-dimensional state vector for spin-orbit dynamics. For long-term dynamics investigation, traditional step-by-step numerical methods are not suitable because of performance limitations. Instead of solving the differential equations directly, one can estimate a matrix Lie map for each control element in the accelerator. By composing such maps sequentially, one obtains a deep polynomial neural network that represents the whole accelerator ring (see Fig. 7).
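A one-turn model of the ring can be sketched as a composition of per-element maps. The two linear elements below (a drift and a thin focusing kick) are textbook toy examples, not the 9-dimensional spin-orbit maps used in the project:

```python
import numpy as np

def one_turn(element_maps, x):
    """Propagate a particle state through consecutive element maps."""
    for apply_element in element_maps:
        x = apply_element(x)
    return x

def drift(length):
    # (position, angle) -> (position + length * angle, angle)
    return lambda x: np.array([x[0] + length * x[1], x[1]])

def thin_kick(strength):
    # thin quadrupole kick: the angle changes in proportion to the offset
    return lambda x: np.array([x[0], x[1] - strength * x[0]])

ring = [drift(1.0), thin_kick(0.5), drift(1.0)]
print(one_turn(ring, np.array([1.0, 0.0])))  # [ 0.5 -0.5]
```

In the full method, each element map is a truncated polynomial Lie map rather than a linear function, and the composition is what forms the deep network of Fig. 7.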

Figure 7: A neural network representation of a charged particle accelerator.

The Lie transform–based mapping approach is commonly used in accelerator physics. For example, in articles [15, 16], the proposed method is applied to the simulation of nonlinear spin-orbit dynamics in the EDM (electric dipole moment) search project.

5.2 Bath Bombs and Bath Fizzies

Consider the production of a chemical product that involves mixing 11 source components such as baking soda, olive oil, SLES, water, and others. A set of 10 data points (bath bombs) is known, each specifying component rates that lead to a stable product characteristic after one minute of mixing. Given this information, one needs to produce a new product (bath fizzies) in which the SLES component is set to zero (see table 2).

Bath bombs
                 1      2     …    10
Baking soda    0.74   0.83    …  0.65
Olive oil      0.57   0.77    …  0.60
SLES           0.76   0.60    …  0.46
Water          0.09   0.14    …  0.08

Bath fizzies
                 1      2     …    10
Baking soda    0.74   0.83    …  0.65
Olive oil      0.57   0.77    …  0.60
SLES           0.00   0.00    …  0.00
Water            ?      ?     …     ?
Table 2: Bath bombs (training) and bath fizzies data sets.

The first challenge in this problem is the limited dataset of only 10 points. The second is an extrapolation issue: one has to build a model from data that contain SLES and provide predictions where SLES is absent. This makes it almost impossible to use traditional machine learning methods.

At the same time, a Lie transform–based neural network allows building a model that provides reasonable results. To achieve this, one has to make two assumptions. First, we consider production as a continuous process that can potentially be described by a system of ODEs. We also assume that these equations are time-independent, which allows us to use a constant Lie map for each time step. Second, the initial components of bath bombs and bath fizzies are considered as different initial conditions of the same dynamical system.

Under these assumptions, one can represent the available data points as a dynamical process from 0 to 1 minute with a discrete time step. The time series representation of the first data point is shown in table 3, where the state vector collects the initial component rates (baking soda, olive oil, SLES, water, and so on) together with the product stability, and each transition between columns is one Lie transform–based layer that represents the dynamics. NA means not available and corresponds to hidden process states. Thus, for each of the 10 samples, the input of the neural network is the vector of initial component rates, and the output is the stability of the product.

Because of our assumptions, we can model the production process by a Lie transform–based neural network that consists of 199 consecutive Lie maps with weight matrices shared across time (table 3). During fitting, the neural network recovers the dynamics and estimates the data points that are marked as NA. After the neural network is fitted with bath bombs, one can use it to predict the dynamics of bath fizzies. Since the stability output at the final mixing time must be preserved, the optimal water rate under the modified (zero) SLES component can be found by varying the water input.
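The search for the water rate that preserves the final stability can be sketched as a one-dimensional root search over the network input. The forward model below is a stand-in toy function, not the fitted network, and we assume the output is monotone in the water rate:

```python
def find_input(forward, target, lo, hi, tol=1e-10):
    """Bisection for the input whose output matches target
    (assumes forward is monotonically increasing on [lo, hi])."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if forward(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# toy forward model: the output grows with the water rate
toy_model = lambda water: water ** 2
print(round(find_input(toy_model, 4.0, 0.0, 10.0), 6))  # 2.0
```

With the fitted network in place of `toy_model`, the same search returns the water rate for which the predicted stability at t = 1 matches the required value.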

Bath bombs
              t = 0    t = Δt   …   t = 1
Baking soda    0.74      NA     …     NA
Olive oil      0.57      NA     …     NA
SLES           0.76      NA     …     NA
Water          0.09      NA     …     NA
Stability      0.00      NA     …    1.00

Bath fizzies
              t = 0    t = Δt   …   t = 1
Baking soda    0.74      NA     …     NA
Olive oil      0.57      NA     …     NA
SLES           0.00      NA     …     NA
Water            ?       NA     …     NA
Stability      0.00      NA     …    1.00
Table 3: Time series representation of the first point from the training data set.

The predictions provided by traditional machine learning methods are presented in table 4. Note that the results provided by the traditional methods are expected by their design but are physically incorrect. Linear regression provides nonphysical negative rates. The decision tree predicts the same values as in the training data. Support vector regression provides an almost constant value close to the mean value of water in the training set. Only treating the data points as initial conditions of a dynamical process yields the physically explainable growth of the necessary amount of water in the absence of SLES. This result was also confirmed by chemical engineers.

                      1       2     …     10
Linear Regression   -0.23   -0.12   …   -0.13
Decision Tree        0.76    0.60   …    0.46
SVR                  0.11    0.11   …    0.11
Lie Map NN           0.64    0.84   …    0.36
Table 4: Prediction of water for bath fizzies provided by different methods.

There is also a possibility to translate the found weight matrices of the neural network into equations. To implement this, one has to find a system of ODEs whose right-hand sides yield the found weights after applying the algorithm described in section 2.1. For instance, we parameterized the right-hand sides of the ODEs up to the second order of nonlinearity and derived a system with polynomial right-hand sides and 30 parameters. This system of ODEs is approximately equivalent to the fitted neural network, so it can be used as an interpretation of the network. For instance, one can state that baking soda acts as a constant parameter and that water decreases during mixing at a rate proportional to the other components, while the stability rate has more complex dynamics and depends on more components.

Supplementary code
https://github.com/andiva/DeepLieNet/blob/master/demo/accelerator.ipynb
https://github.com/andiva/DeepLieNet/blob/master/demo/BathBombs.ipynb
The data points for bath bombs are presented in dimensionless form, and the resulting system of ODEs is not provided, due to the data protection policy.

6 Conclusion

In the article, we demonstrate a general concept for building neural network representations of dynamical systems. Although we considered only ODEs with polynomial right-hand sides, such nonlinear systems are widely used in different fields: automated control, robotics, mechanical systems, biology, chemical reactions, drug development, and molecular dynamics.

The greatest advantage of the proposed Lie transform–based neural network is its equivalence to differential equations. The weight matrices of the network correspond to certain orders of nonlinearity in the real system and have a physical explanation.

Since the proposed neural architecture agrees well with traditional modeling methods, it can be useful for the investigation of dynamics in unknown regions of parameter space. The most promising properties of the proposed technique are the ability to learn dynamics from small training data sets and the ability to interpret the data-driven model by translating it into a system of ODEs.

The questions of noisy data, truncation of the matrix Lie transform, and accuracy and convergence for larger systems are not discussed in the article. We also do not consider the optimal selection of loss functions and optimization methods for training. These questions should be investigated in further research.

The proposed Lie transform–based neural network, an algorithm for calculating the map for a given system of ODEs, and the examples are implemented in Python (Keras/TensorFlow) and available in the GitHub repository:

https://github.com/andiva/DeepLieNet.

7 Acknowledgments

The authors gratefully thank Prof. Dr. Yurij Senichev for his support in the development of Lie transform–based mapping methods for accelerator physics. Also, we appreciate the efforts of Roman Konoplev-Esgenburg for the explanation of the production process of bath bombs and for providing data.

References

  • [1] Lagaris, I.E., Likas, A., Fotiadis, D.I.: Artificial neural networks for solving ordinary and partial differential equations. Tech. rep. (1997), https://arxiv.org/pdf/physics/9705023.pdf, last accessed 2019/03/03.
  • [2] Mall, S., Chakraverty, S.: Comparison of artificial neural network architecture in solving ordinary differential equations. Advances in Artif. Neural Syst. (2013).
  • [3] Lapedes, A., Farber, R.: Nonlinear signal processing using neural networks: Prediction and system modelling. In: Langley, P. (ed.) IEEE International Conference on Neural Networks, San Diego, CA, USA (1987).
  • [4] Lewis, F.L., Ge, S.S.: Neural networks in feedback control systems. Mechanical Engineers’ Handbook: Instrumentation, Systems, Controls, and MEMS, vol. 2. John Wiley and Sons, Hoboken, NJ, USA, third edition (2005).
  • [5] Chen, S., Billings, S.A.: Neural networks for nonlinear dynamic system modelling and identification. International Journal of Control 56(2), 319–346 (1992).
  • [6] Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics (2018).
  • [7] Chen, R., Rubanova, Y., Bettencourt, J., Duvenaud, D.: Neural ordinary differential equations, https://arxiv.org/pdf/1806.07366.pdf, last accessed 2019/03/03.
  • [8] Ivanov, A., Andrianov, S.: Matrix Lie maps and polynomial neural networks for solving differential equations. Submitted to this Proceedings (2019).
  • [9] Dragt, A.: Lie methods for nonlinear dynamics with applications to accelerator physics (2011), inspirehep.net/record/955313/files/TOC28Nov2011.pdf, last accessed 2019/03/03.
  • [10] Andrianov, S.: A role of symbolic computations in beam physics. Computer Algebra in Sc. Comp., Lecture Notes in Computer Science, 6244, 19–30 (2010).
  • [11] Andrianov, S.: The convergence and accuracy of the matrix formalism approximation. In: Proceedings of ICAP2012, Rostock, Germany, 93–95 (2012).
  • [12] Gilbert, D., Heiner, M.: From Petri nets to differential equations - an integrative approach for biochemical network analysis. Lecture Notes in Computer Science, 181–200 (2006).
  • [13] Martcheva, M.: An introduction to mathematical epidemiology, https://www.springer.com/gb/book/9781489976116, last accessed 2019/03/03.
  • [14] Guidolin, M., Guseo, R.: On product cannibalization: a new Lotka-Volterra model for asymmetric competition in the ICTs. Tech. rep. (2016), http://paduaresearch.cab.unipd.it/9738/1/GUIDOLINGUSEO_SETT2016.pdf, last accessed 2019/03/03.
  • [15] Senichev, Y., Lehrach, A., Maier, R., Zyuzin, D., Berz, M., Makino, K., Andrianov, S., Ivanov, A.: Storage ring EDM simulation: methods and results. In: Proceedings of ICAP2012, Rostock, Germany, 99–103 (2012).
  • [16] Ivanov, A., Andrianov, S., Senichev, Y.: Simulation of Spin-orbit Dynamics in Storage Rings. Journal of Physics: Conference Series, Volume 747, N. 1 (2016).