Learning ergodic averages in chaotic systems

01/09/2020 ∙ by Francisco Huhn, et al.

We propose a physics-informed machine learning method to predict the time average of a chaotic attractor. The method is based on the hybrid echo state network (hESN). We assume that the system is ergodic, so the time average is equal to the ergodic average. Compared to conventional echo state networks (ESN), which are purely data-driven, the hESN uses additional information from an incomplete, or imperfect, physical model. We evaluate the performance of the hESN and compare it to that of an ESN. This approach is demonstrated on a chaotic time-delayed thermoacoustic system, where the inclusion of a physical model significantly improves the accuracy of the prediction, reducing the relative error from 48% to 7% at the low extra cost of solving two ordinary differential equations. This framework shows the potential of using machine learning techniques combined with prior physical knowledge to improve the prediction of time-averaged quantities in chaotic systems.


1 Introduction

In the past decade, there has been a proliferation of machine learning techniques applied in various fields, from spam filtering [7] to self-driving cars [3], including the more recent physical applications in fluid dynamics [6, 4]. However, a major hurdle in applying machine learning to complex physical systems, such as those in fluid dynamics, is the high cost of generating data for training [6]. This can be mitigated by leveraging prior knowledge (e.g. physical laws), which can compensate for the small amount of training data. These approaches, called physics-informed machine learning, have been applied to various problems in fluid dynamics [6, 4]. For example, [13, 5] improve the predictability horizon of echo state networks by leveraging physical knowledge, which is enforced as a hard constraint in [5], without needing more data or neurons. In this study, we use a hybrid echo state network (hESN) [13], originally proposed to time-accurately forecast the evolution of chaotic dynamical systems, to predict the long-term time-averaged quantities, i.e., the ergodic averages. This is motivated by recent research in optimization of chaotic multi-physics fluid dynamics problems with applications to thermoacoustic instabilities [8]. The hESN is based on reservoir computing [10], in particular, conventional Echo State Networks (ESNs). ESNs have been shown to predict nonlinear and chaotic dynamics more accurately and for a longer time horizon than other deep learning algorithms [10]. However, we stress that the present study is not focused on the accurate prediction of the time evolution of the system, but rather of its ergodic averages, which are obtained by the time averaging of a long time series (we implicitly assume that the system is ergodic, thus, the infinite time average is equal to the ergodic average [2]). Here, the physical system under study is a prototypical time-delayed thermoacoustic system, whose chaotic dynamics have been analysed and optimized in [8].

2 Echo State Networks

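The ESN approach presented in [11] is used here. The ESN is given an input signal $\mathbf{u}(n) \in \mathbb{R}^{N_u}$, from which it produces a prediction signal $\hat{\mathbf{y}}(n) \in \mathbb{R}^{N_y}$ that should match the target signal $\mathbf{y}(n) \in \mathbb{R}^{N_y}$, where $n$ is the discrete time index. The ESN is composed of a reservoir, which can be represented as a directed weighted graph with $N_x$ nodes, called neurons, whose state at time $n$ is given by the vector $\mathbf{x}(n) \in \mathbb{R}^{N_x}$. The reservoir is coupled to the input via an input-to-reservoir matrix, $\mathbf{W}_{in} \in \mathbb{R}^{N_x \times N_u}$, such that its state evolves according to

$$\mathbf{x}(n) = \tanh\left(\mathbf{W}_{in}\,\mathbf{u}(n) + \mathbf{W}\,\mathbf{x}(n-1)\right), \tag{1}$$

where $\mathbf{W} \in \mathbb{R}^{N_x \times N_x}$ is the weighted adjacency matrix of the reservoir, i.e. $W_{ij}$ is the weight of the edge from node $j$ to node $i$. In Eq. 1, the hyperbolic tangent is used as the activation function. Finally, the prediction is produced by a linear combination of the states of the neurons

$$\hat{\mathbf{y}}(n) = \mathbf{W}_{out}\,\mathbf{x}(n), \tag{2}$$

where $\mathbf{W}_{out} \in \mathbb{R}^{N_y \times N_x}$. In this work, we are interested in dynamical system prediction. Thus, the target at time step $n$ is the input at time step $n+1$, i.e. $\mathbf{y}(n) = \mathbf{u}(n+1)$ [13]. We wish to learn ergodic averages, given by

$$\langle \mathcal{J} \rangle = \lim_{T \to \infty} \frac{1}{T} \int_0^T \mathcal{J}(\mathbf{u}(t))\,dt, \tag{3}$$

where $\mathcal{J}$ is a cost functional, of a dynamical system governed by

$$\dot{\mathbf{u}} = \mathbf{F}(\mathbf{u}), \tag{4}$$

where $\mathbf{u} \in \mathbb{R}^{N_u}$ is the state vector and $\mathbf{F}$ is a nonlinear operator. The training data is obtained via numerical integration of Eq. 4, resulting in the time series $\{\mathbf{u}(1), \ldots, \mathbf{u}(N_t)\}$, where the different samples are taken at equally spaced time intervals $\Delta t$, and $N_t$ is the length of the training data set. In the conventional ESN approach, $\mathbf{W}_{in}$ and $\mathbf{W}$ are generated once and fixed to satisfy the Echo State Property [9]. Only $\mathbf{W}_{out}$ is trained to minimize the mean-squared-error

$$E = \frac{1}{N_y} \sum_{i=1}^{N_y} \frac{1}{N_t} \sum_{n=1}^{N_t} \left(\hat{y}_i(n) - y_i(n)\right)^2. \tag{5}$$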
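As an illustration of Eqs. 1, 2 and 5, the following NumPy sketch drives a small reservoir with an input series and evaluates the mean-squared error of a linear readout. It assumes the standard ESN update, x(n) = tanh(W_in u(n) + W x(n-1)). The toy signal, the matrix sizes and the function names (`reservoir_step`, `run_open_loop`, `mean_squared_error`) are placeholders chosen for this example and are not taken from the study.

```python
import numpy as np

def reservoir_step(x_prev, u, W_in, W):
    """One reservoir update: x(n) = tanh(W_in u(n) + W x(n-1))."""
    return np.tanh(W_in @ u + W @ x_prev)

def run_open_loop(U, W_in, W):
    """Drive the reservoir with inputs U (N_t x N_u); return the states (N_t x N_x)."""
    x = np.zeros(W.shape[0])
    X = np.empty((U.shape[0], W.shape[0]))
    for n, u in enumerate(U):
        x = reservoir_step(x, u, W_in, W)
        X[n] = x
    return X

def mean_squared_error(Y_hat, Y):
    """Squared error averaged over time steps and output components."""
    return float(np.mean((Y_hat - Y) ** 2))

rng = np.random.default_rng(0)
N_u, N_x, N_t = 2, 20, 200                      # toy sizes, for illustration only
W_in = rng.uniform(-0.5, 0.5, (N_x, N_u))
W = rng.uniform(-1.0, 1.0, (N_x, N_x))
W *= 0.5 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale the spectral radius to 0.5

t = 0.1 * np.arange(N_t + 1)
series = np.column_stack([np.sin(t), np.cos(t)])  # stand-in signal, not thermoacoustic data
U, Y = series[:-1], series[1:]                    # the target is the next input

X = run_open_loop(U, W_in, W)
W_out = rng.uniform(-0.1, 0.1, (N_u, N_x))        # untrained readout, for shape checking only
Y_hat = X @ W_out.T                               # linear readout applied at every time step
print(X.shape, mean_squared_error(Y_hat, Y))
```

The readout here is random on purpose: only the output weights are trained, so the reservoir states can be collected once and reused for the regression.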

To avoid overfitting, we use ridge regularization, so the optimization problem is $\min_{\mathbf{W}_{out}} E + \gamma \|\mathbf{W}_{out}\|^2$, where $\gamma$ is the regularization factor. Because the prediction is a linear combination of the reservoir state $\mathbf{x}$, the optimal $\mathbf{W}_{out}$ can be explicitly obtained with $\mathbf{W}_{out} = \mathbf{Y}\mathbf{X}^T(\mathbf{X}\mathbf{X}^T + \gamma \mathbf{I})^{-1}$, where $\mathbf{I}$ is the identity matrix and $\mathbf{Y}$ and $\mathbf{X}$ are the column-concatenation of the various time instants of the output data, $\mathbf{y}$, and corresponding reservoir states, $\mathbf{x}$, respectively. After the optimal $\mathbf{W}_{out}$ is found, the ESN can be used to predict the time evolution of the system. This is done by looping back its output to its input, i.e. $\mathbf{u}(n) = \hat{\mathbf{y}}(n-1) = \mathbf{W}_{out}\,\mathbf{x}(n-1)$, which, on substitution into Eq. 1, results in

$$\mathbf{x}(n) = \tanh\left(\tilde{\mathbf{W}}\,\mathbf{x}(n-1)\right), \tag{6}$$

with $\tilde{\mathbf{W}} = \mathbf{W} + \mathbf{W}_{in}\mathbf{W}_{out}$. Interestingly, Eq. 6 shows that if the reservoir follows an evolution of states $\mathbf{x}(N_t+1), \ldots, \mathbf{x}(N_t+N_p)$, where $N_p$ is the number of prediction steps, then $-\mathbf{x}(N_t+1), \ldots, -\mathbf{x}(N_t+N_p)$ is also possible, because $-\mathbf{x}$ is a solution of Eq. 6. This implies that either the attractor of the ESN (if any) is symmetric, i.e. if some $\mathbf{x}$ is in the ESN's attractor, then so is $-\mathbf{x}$; or the ESN has a co-existing symmetric attractor. While this does not seem to have been an issue in short-term prediction, such as in [13, 5], it does pose a problem in the long-term prediction of statistical quantities. This is because the ESN, in its present form, cannot generate non-symmetric attractors. This symmetry needs to be broken to work with a general non-symmetric dynamical system. This can be done by including biases [11]. However, the addition of a bias can make the reservoir prone to saturation (results not shown), i.e. $x_i \to \pm 1$, and thus great care needs to be taken in the choice of hyperparameters. In this paper, we break the symmetry by exploiting prior knowledge on the physics of the problem under investigation with a hybrid ESN.
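The closed-form ridge solution and the symmetry of the closed-loop reservoir can be checked numerically. The sketch below is a minimal illustration under the assumptions of a bias-free tanh reservoir and the ridge formula W_out = Y X^T (X X^T + gamma I)^-1. The sizes, the stand-in training signal, the value of `gamma` and the helper names (`train_readout`, `closed_loop`) are hypothetical choices for this example.

```python
import numpy as np

def train_readout(X, Y, gamma):
    """Ridge solution W_out = Y X^T (X X^T + gamma I)^-1 (columns are time instants)."""
    A = X @ X.T + gamma * np.eye(X.shape[0])
    # A is symmetric, so W_out A = Y X^T is equivalent to A W_out^T = X Y^T.
    return np.linalg.solve(A, X @ Y.T).T

def closed_loop(x0, W_tilde, n_steps):
    """Iterate x(n) = tanh(W_tilde x(n-1)), i.e. the ESN fed with its own output."""
    states = np.empty((n_steps, x0.size))
    x = x0
    for n in range(n_steps):
        x = np.tanh(W_tilde @ x)
        states[n] = x
    return states

rng = np.random.default_rng(1)
N_u, N_x, N_t, gamma = 2, 30, 400, 1e-4           # illustrative values only
W_in = rng.uniform(-0.5, 0.5, (N_x, N_u))
W = rng.uniform(-1.0, 1.0, (N_x, N_x))
W *= 0.5 / np.max(np.abs(np.linalg.eigvals(W)))

t = 0.1 * np.arange(N_t + 1)
series = np.column_stack([np.sin(t), np.cos(t)])   # stand-in training signal
U, Y = series[:-1].T, series[1:].T                 # shapes (N_u, N_t)

# Open-loop pass: collect the reservoir states as columns of X.
X = np.empty((N_x, N_t))
x = np.zeros(N_x)
for n in range(N_t):
    x = np.tanh(W_in @ U[:, n] + W @ x)
    X[:, n] = x

W_out = train_readout(X, Y, gamma)
W_tilde = W + W_in @ W_out

# Without a bias, negating the initial state negates the whole trajectory.
traj_pos = closed_loop(x, W_tilde, 50)
traj_neg = closed_loop(-x, W_tilde, 50)
print(np.allclose(traj_neg, -traj_pos))
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit matrix inverse, which is a common numerical preference rather than something prescribed by the text.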

3 Physics-informed and hybrid Echo State Network

The ESN’s performance can be increased by incorporating physical knowledge during training [5] and/or prediction [13]. This physical knowledge is usually present in the form of a reduced-order model (ROM) that can generate (imperfect) predictions. The authors of [5] introduced a physics-informed ESN (PI-ESN), which enforces the physics as a hard constraint through a physics loss term. The prediction is consistent with the physics, but the training requires nonlinear optimization. The authors of [13] introduced a hybrid echo state network (hESN), which incorporates incomplete physical knowledge by feeding the prediction of the physical model into the reservoir and into the output. Its training requires only ridge regression. Here, we use an hESN (Figure 1) because we are not interested in enforcing the physics as a hard constraint for an accurate short-term prediction [5].

Figure 1: Schematic of the hybrid echo state network. ROM: reduced-order model. R: reservoir. Superscript R: reduced-order model. Superscript I: traditional ESN.
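A single step of the hybrid architecture can be sketched as follows. This is only an illustration of the wiring described above, in which the ROM prediction is passed both to the reservoir and to the output layer. The function `rom_step` is a hypothetical placeholder for an imperfect physical model, the ROM output is assumed to have the same dimension as the input for simplicity, and all sizes and weights are arbitrary.

```python
import numpy as np

def rom_step(u):
    """Hypothetical imperfect model: a crude one-step predictor standing in for the ROM."""
    return 0.9 * u

def hesn_step(x_prev, u, W_in, W, W_out):
    """One hybrid step: the ROM prediction enters both the reservoir and the readout."""
    u_rom = rom_step(u)
    x = np.tanh(W_in @ np.concatenate([u, u_rom]) + W @ x_prev)
    y_hat = W_out @ np.concatenate([x, u_rom])
    return x, y_hat

rng = np.random.default_rng(2)
N_u, N_x = 3, 10                                   # toy sizes, for illustration only
W_in = rng.uniform(-0.5, 0.5, (N_x, 2 * N_u))      # acts on [u, ROM(u)]
W = rng.uniform(-1.0, 1.0, (N_x, N_x))
W *= 0.3 / np.max(np.abs(np.linalg.eigvals(W)))
W_out = rng.uniform(-0.1, 0.1, (N_u, N_x + N_u))   # untrained; acts on [x, ROM(u)]

x, u = np.zeros(N_x), np.ones(N_u)
for _ in range(5):
    x, u = hesn_step(x, u, W_in, W, W_out)         # closed loop: the output is the next input
print(x.shape, u.shape)
```

Because the readout remains linear in the concatenated vector of reservoir state and ROM output, the output weights can still be obtained by ridge regression.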

4 Learning the ergodic average of an energy

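We use a prototypical time-delayed thermoacoustic system composed of a longitudinal acoustic cavity and a heat source modelled with a nonlinear time-delayed model [14, 8], which has been used to optimize ergodic averages in [8] with a dynamical systems approach. The non-dimensional governing equations are

$$\frac{\partial u}{\partial t} + \frac{\partial p}{\partial x} = 0, \qquad \frac{\partial p}{\partial t} + \frac{\partial u}{\partial x} + \zeta p - \dot{q}\,\delta(x - x_f) = 0, \tag{7}$$

where $u$, $p$, $\zeta$ and $\dot{q}$ are the non-dimensionalized acoustic velocity, pressure, damping and heat-release rate, respectively, and $\delta$ is the Dirac delta. These equations are discretized by using $N_g$ Galerkin modes

$$u(x,t) = \sum_{j=1}^{N_g} \eta_j(t)\cos(j\pi x), \qquad p(x,t) = -\sum_{j=1}^{N_g} \mu_j(t)\sin(j\pi x), \tag{8}$$

which results in a system of oscillators, which are nonlinearly coupled through the heat released by the heat source

$$\dot{\eta}_j - j\pi\mu_j = 0, \qquad \dot{\mu}_j + j\pi\eta_j + \zeta_j\mu_j + 2\dot{q}\sin(j\pi x_f) = 0, \tag{9}$$

where $x_f$ is the heat source location and $\zeta_j$ is the modal damping [8]. The heat release rate, $\dot{q}$, is given by the modified King’s law [8], in which $\beta$ and $\tau$ are the heat release intensity parameter and the time delay, respectively. With the nomenclature of Section 2, $\mathbf{u} = (\eta_1, \ldots, \eta_{N_g}, \mu_1, \ldots, \mu_{N_g})$. Using 10 Galerkin modes ($N_g = 10$) with the values of $\beta$ and $\tau$ considered here results in a chaotic motion (Fig. 2), with a positive leading Lyapunov exponent, $\lambda_1$ [8]. (The leading Lyapunov exponent measures the rate of (exponential) separation of two close initial conditions, i.e. an initial separation $\|\delta\mathbf{u}_0\|$ grows asymptotically like $\|\delta\mathbf{u}_0\|e^{\lambda_1 t}$.) However, for the same choice of parameter values, the solution with $N_g = 1$ is a limit cycle (i.e. a periodic solution).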
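To make the modal system concrete, the sketch below integrates a set of Galerkin oscillators coupled through a time-delayed heat release term. Several ingredients are assumptions made for this example and are not taken from the text above: the modal equations are taken as eta_j' = j pi mu_j and mu_j' = -(j pi eta_j + zeta_j mu_j + 2 q sin(j pi x_f)), the modified King's law is taken as q = beta (sqrt(|1 + u_f(t - tau)|) - 1), the damping as zeta_j = c1 j^2 + c2 sqrt(j), and the state is held at its initial value for t < tau. The semi-implicit Euler scheme and every parameter value are placeholders.

```python
import numpy as np

def heat_release(u_f_delayed, beta):
    """Assumed form of the modified King's law: beta * (sqrt(|1 + u_f(t - tau)|) - 1)."""
    return beta * (np.sqrt(np.abs(1.0 + u_f_delayed)) - 1.0)

def simulate(N_g, beta, tau, x_f, c1, c2, dt, n_steps, eta0):
    """Integrate the modal equations with a semi-implicit Euler step and a delay buffer."""
    j = np.arange(1, N_g + 1)
    jpi = j * np.pi
    zeta = c1 * j**2 + c2 * np.sqrt(j)             # assumed modal damping law
    cos_f, sin_f = np.cos(jpi * x_f), np.sin(jpi * x_f)
    delay = int(round(tau / dt))                   # delay expressed in time steps
    eta = np.zeros((n_steps + 1, N_g))
    mu = np.zeros((n_steps + 1, N_g))
    eta[0] = eta0
    for n in range(n_steps):
        # Velocity at the heat source at t - tau; the initial state is held for t < tau.
        u_f_delayed = eta[max(n - delay, 0)] @ cos_f
        q = heat_release(u_f_delayed, beta)
        mu[n + 1] = mu[n] - dt * (jpi * eta[n] + zeta * mu[n] + 2.0 * q * sin_f)
        eta[n + 1] = eta[n] + dt * jpi * mu[n + 1]
    return eta, mu

# Placeholder parameter values, chosen only to exercise the code.
eta, mu = simulate(N_g=1, beta=1.0, tau=0.2, x_f=0.2, c1=0.1, c2=0.06,
                   dt=1e-3, n_steps=5000, eta0=np.array([0.1]))
print(eta.shape, bool(np.isfinite(eta).all()))
```

With one mode, the state has two components (eta_1 and mu_1), which matches the "two ordinary differential equations" of a single-mode reduced-order model.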

Figure 2: Acoustic velocity at the flame location.

The echo state network is trained on data generated with $N_g = 10$, while the physical knowledge (ROM in Fig. 1) is generated with $N_g = 1$ only. As relevant to optimization of chaotic acoustic oscillations [8], we wish to predict the time average of the instantaneous acoustic energy, $E_{ac}(t) = \int_0^1 \frac{1}{2}(u^2 + p^2)\,dx$. The reservoir is composed of 100 units, a modest size, half of which receive their input from $\mathbf{u}$, while the other half receive it from the output of the ROM. The entries of $\mathbf{W}_{in}$ are randomly generated from a uniform distribution. The matrix $\mathbf{W}$ is highly sparse, with only 3% of its entries being non-zero and drawn from a uniform distribution between -1 and 1. Finally, $\mathbf{W}$ is scaled such that its spectral radius is 0.1 and 0.3 for the ESN and the hESN, respectively. The network is trained on a time series sampled at a constant time step, $\Delta t$, over a window that corresponds to 6 Lyapunov times, i.e. $6\lambda_1^{-1}$. The data is generated by integrating Eq. 9 in time with $N_g = 10$, resulting in $N_u = 20$. In the hESN, the ROM is obtained by integrating the same equations, but with $N_g = 1$ (one Galerkin mode only) unless otherwise stated. Ridge regression is performed with regularization factor $\gamma$. The hyperparameters’ values are taken from the literature [13, 5] and a grid search.
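The reservoir construction described above can be sketched in NumPy as follows: a sparse random matrix with about 3% non-zero entries drawn uniformly between -1 and 1 is rescaled to a target spectral radius, and the input matrix is split so that half of the units see the data and the other half see the ROM output. The bound `sigma_in` of the input weights, the dimensions `N_u=20` and `N_rom=2` (two variables per Galerkin mode), and the function names are assumptions made for this example.

```python
import numpy as np

def make_reservoir(N_x, density, spectral_radius, rng):
    """Sparse random matrix rescaled to a prescribed spectral radius."""
    W = rng.uniform(-1.0, 1.0, (N_x, N_x))
    W = W * (rng.random((N_x, N_x)) < density)     # keep about `density` of the entries
    rho = np.max(np.abs(np.linalg.eigvals(W)))
    return W * (spectral_radius / rho)

def make_hybrid_input(N_x, N_u, N_rom, sigma_in, rng):
    """Input matrix acting on [u, ROM output]: half of the units see u, the rest see the ROM."""
    W_in = np.zeros((N_x, N_u + N_rom))
    half = N_x // 2
    W_in[:half, :N_u] = rng.uniform(-sigma_in, sigma_in, (half, N_u))
    W_in[half:, N_u:] = rng.uniform(-sigma_in, sigma_in, (N_x - half, N_rom))
    return W_in

rng = np.random.default_rng(3)
W = make_reservoir(N_x=100, density=0.03, spectral_radius=0.3, rng=rng)
# sigma_in=0.5 is a placeholder value, not the one used in the study.
W_in = make_hybrid_input(N_x=100, N_u=20, N_rom=2, sigma_in=0.5, rng=rng)
print(np.max(np.abs(np.linalg.eigvals(W))), np.mean(W != 0))
```

Rescaling by the ratio of the target to the current spectral radius works because the eigenvalues of a matrix scale linearly with a scalar multiplier.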

On the one hand, Fig. 3(a) shows the instantaneous error of the first modes of the acoustic velocity and pressure for the ESN, hESN and ROM. None of these can accurately predict the instantaneous state of the system. On the other hand, Fig. 3(b) shows the error of the prediction of the average acoustic energy. Once again, the ROM alone does a poor job at predicting the statistics of the system, with an error of 50%. This should not come as a surprise since, as discussed previously, the ROM does not even produce a chaotic solution. The ESN, trained on data only, performs marginally better, with an error of 48%. In contrast, the hESN predicts the time-averaged acoustic energy satisfactorily, with an error of about 7%. This is remarkable, since both the ESN and the ROM do a poor job at predicting the average acoustic energy. However, when the ESN is combined with prior knowledge from the ROM, the prediction becomes significantly better. Moreover, while the hESN’s error still decreases at the end of the prediction period, which is 5 times the training data time, the ESN and the ROM stabilize much earlier, at a time similar to that of the training data. This result shows that complementing the ESN with a cheap physical model (only 10% the number of degrees of freedom of the full system) can greatly improve the accuracy of the predictions, with no need for more data or neurons. With a slightly more accurate ROM, the error further decreases to 3% (result not shown).

(a) Absolute error of prediction.
(b) Relative error of prediction.
Figure 3: Errors on the prediction from ESN (blue), hESN (red) and ROM (green).
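The quantity compared in panel (b), the relative error of the time-averaged acoustic energy, can be computed along the following lines. The sketch assumes that the acoustic energy is the integral over the cavity of (u^2 + p^2)/2 and that the velocity and pressure are expanded in cosine and sine Galerkin modes with amplitudes eta_j and mu_j, in which case the integral reduces to 0.25 * sum_j (eta_j^2 + mu_j^2). The signals below are synthetic stand-ins, not results from the study.

```python
import numpy as np

def acoustic_energy(eta, mu):
    """Modal form of the acoustic energy: 0.25 * sum_j (eta_j^2 + mu_j^2)."""
    return 0.25 * np.sum(eta**2 + mu**2, axis=-1)

def running_average(values):
    """Cumulative time average of equally spaced samples."""
    values = np.asarray(values, dtype=float)
    return np.cumsum(values) / np.arange(1, values.size + 1)

def relative_error(predicted, reference):
    """Relative error of a predicted scalar with respect to a reference value."""
    return abs(predicted - reference) / abs(reference)

# Synthetic single-mode signals standing in for reference and predicted modal amplitudes.
t = 0.01 * np.arange(20000)
eta_ref, mu_ref = np.sin(np.pi * t)[:, None], np.cos(np.pi * t)[:, None]
eta_pred, mu_pred = 0.9 * eta_ref, 0.9 * mu_ref

E_ref = running_average(acoustic_energy(eta_ref, mu_ref))
E_pred = running_average(acoustic_energy(eta_pred, mu_pred))
print(relative_error(E_pred[-1], E_ref[-1]))
```

Because the samples are equally spaced in time, the arithmetic mean of the samples approximates the finite-time average, and the running average shows how quickly that estimate settles.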

The performance of the network is sensitive to the hyperparameters, and values that yield good performance for one set of physical parameters may perform poorly for another set. Figure 4 shows the phase plots of the full model and of the hESN when the physical parameters are varied. The hyperparameters we selected perform well with the system under investigation (Fig. 4, middle panel). However, if no further fine tuning is carried out, the same hyperparameters perform poorly for a different set of physical parameters (Fig. 4, left and right panels). This well-known drawback in machine learning techniques [1] calls for robust methods for the automatic selection of the optimal hyperparameters. This is the scope of other current studies.

Figure 4: Phase plots for different values of the physical parameters for the system (black) and the hESN (red).

5 Conclusion and future directions

We propose the use of echo state networks informed with incomplete prior physical knowledge for the prediction of time-averaged cost functionals in chaotic dynamical systems. We apply this to chaotic acoustic oscillations, which is relevant to aeronautical propulsion. The inclusion of physical knowledge comes at a low cost and significantly improves the performance of conventional echo state networks, reducing the error from 48% to 7%, without requiring additional data or neurons. The ability of the proposed ESN can be exploited in the optimization of chaotic systems by accelerating computationally expensive shadowing methods [12]. For future work, (i) the performance of the hybrid echo state network should be compared against those of other physics-informed machine learning techniques; (ii) robust methods for hyperparameter search are needed; and (iii) the technique is currently being applied to larger-scale problems. In summary, the proposed framework is able to learn the ergodic average of a fluid dynamics system, which opens up new possibilities for the optimization of highly unsteady problems.

References

  • [1] Y. Bengio (2012) Practical recommendations for gradient-based training of deep architectures. In Neural Networks: Tricks of the Trade: Second Edition, pp. 437–478. ISBN 978-3-642-35289-8.
  • [2] G. D. Birkhoff (1931) Proof of the ergodic theorem. Proceedings of the National Academy of Sciences 17 (12), pp. 656–660. ISSN 0027-8424. https://www.pnas.org/content/17/12/656.full.pdf
  • [3] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba (2016) End to End Learning for Self-Driving Cars. arXiv e-prints, arXiv:1604.07316.
  • [4] S. L. Brunton, B. R. Noack, and P. Koumoutsakos (2020) Machine learning for fluid mechanics. Annual Review of Fluid Mechanics 52 (1). https://doi.org/10.1146/annurev-fluid-010719-060214
  • [5] N. A. K. Doan, W. Polifke, and L. Magri (2019) Physics-informed echo state networks for chaotic systems forecasting. In Computational Science – ICCS 2019, Cham, pp. 192–198. ISBN 978-3-030-22747-0.
  • [6] K. Duraisamy, G. Iaccarino, and H. Xiao (2019) Turbulence modeling in the age of data. Annual Review of Fluid Mechanics 51 (1), pp. 357–377. https://doi.org/10.1146/annurev-fluid-010518-040547
  • [7] T. S. Guzella and W. M. Caminhas (2009) A review of machine learning approaches to spam filtering. Expert Systems with Applications 36 (7), pp. 10206–10222. ISSN 0957-4174.
  • [8] F. Huhn and L. Magri (2020) Stability, sensitivity and optimisation of chaotic acoustic oscillations. Journal of Fluid Mechanics 882, pp. A24.
  • [9] H. Jaeger and H. Haas (2004) Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science 304 (5667), pp. 78–80. ISSN 0036-8075. https://science.sciencemag.org/content/304/5667/78.full.pdf
  • [10] M. Lukoševičius and H. Jaeger (2009) Reservoir computing approaches to recurrent neural network training. Computer Science Review 3 (3), pp. 127–149. ISSN 1574-0137.
  • [11] M. Lukoševičius (2012) A practical guide to applying echo state networks. pp. 659–686. ISBN 978-3-642-35289-8.
  • [12] A. Ni and Q. Wang (2017) Sensitivity analysis on chaotic dynamical systems by non-intrusive least squares shadowing (NILSS). Journal of Computational Physics 347, pp. 56–77. ISSN 0021-9991.
  • [13] J. Pathak, A. Wikner, R. Fussell, S. Chandra, B. R. Hunt, M. Girvan, and E. Ott (2018) Hybrid forecasting of chaotic processes: using machine learning in conjunction with a knowledge-based model. Chaos: An Interdisciplinary Journal of Nonlinear Science 28 (4), pp. 041101. https://doi.org/10.1063/1.5028373
  • [14] T. Traverso and L. Magri (2019) Data assimilation in a nonlinear time-delayed dynamical system with lagrangian optimization. In Computational Science – ICCS 2019, Cham, pp. 156–168. ISBN 978-3-030-22747-0.