Nonlinear system identification is a wide and intensely researched topic, aiming at the estimation of dynamical systems directly form data. Multiple methods have been developed, where the commonly used model structures are: NARX, nonlinear state space, block-oriented, see for an overview. However, even if the resulting models have good simulation or prediction performance, the representation of the system remains confined in the nonlinear system class. While several nonlinear control methods have been developed (e.g. feedback linearization, backstepping, sliding mode control, to name a few ), they are often complicated to apply and there is no systematic approach for shaping the performance of the closed-loop system in contrast to the approaches of the linear time invariant (LTI) framework. While LTI control is well advanced, designs are limited to operate in the neighbourhood of given linearization points. Hence, pressing needs in engineering led to the idea of developing various linear embeddings of nonlinear systems to apply the powerful LTI control methods with global stability and performance guarantees.
One such embedding technique is based on the Koopman framework, where the concept is to lift the nonlinear state space to a (possibly) infinite dimensional space through so-called observable functions. The dynamics of the original system are preserved and governed by the linear Koopman operator , . In practice, if a dictionary of a finite number of observables is chosen a priori and used to construct time shifted data matrices, the linear Koopman-based model can be calculated via least squares . One such approach, called Dynamic Mode Decomposition (DMD) , is based on constructing the time shifted data matrices using the original states of the system. If the dictionary consists of nonlinear functions of the state, this technique is known as Extended DMD (EDMD) . Besides issues that arise with the presence of noise and biasedness of the estimates, the main difficulty lays with choosing the lifting functions such that, on the lifted state-space, an LTI model exists that can well capture the dynamic behavior of the original nonlinear system.
Learning the lifting functions from data has been addressed recently using Artificial Neural Networks (ANNs). A common approach is to construct autoencoders, to represent the lifting function and its inverse,-, and enforce the linearity condition of the lifted state transition. Alternatively,  proposes to specify the entire dictionary of observables as the outputs of an ANN, and perform an EDMD type of regression for model estimation. While  addresses partial state observations, availability of full state measurements is a common assumption in the Koopman identification literature. Moreover, only a few papers present examples where measurement noise is present (e.g. ) and often only the robustness of the methods is analysed instead of ensuring stochastic consistency of the estimators. Furthermore, in this research area, the treatment of inputs has only recently been addressed, either through a nonlinear lift  or by using state and input dependent observables, together with input increments .
To address these issues, we introduce a Koopman-based state-space encoder model and a corresponding estimation method, implemented based on the deepSI toolbox111deepSI toolbox available at https://github.com/GerbenBeintema/deepSI in Python . We summarize the main characteristics and contributions as follows:
Constructability-based formulation of a lifted space encoder (nonlinear mapping) based on deep-ANN
Formulation of an identification approach for estimating Koopman models with Output Error (OE) noise structure in state-space using -step ahead prediction
Incorporation of lifted-state dependent linear effect of the input for general representation of nonlinear systems
Construction that allows for both full and partial state availability
The paper is structured as follows. Section II details the general Koopman framework and we discuss the notions of observability and state constructability in the Koopman form together with the role of inputs. Section III describes the proposed Koopman-based encoder and the associated model structure together with the used optimization method. In Section IV, the approach is tested using a Van der Pol oscillator and the Silverbox benchmark , followed by a discussion of the results. The conclusions are presented in Section V.
Ii Behavior of Koopman embeddings
This section details the Koopman framework focusing on a finite dimensional lifted form. Next, we briefly discuss observability and constructability notions in the original and lifted forms. We show that, while the system behavior can be represented by a linear form, a nonlinear constraint remains on the initial conditions to ensure one-to-one connection between the solution sets.
Consider a discrete-time nonlinear autonomous system:
with the state variable, is a bounded nonlinear function and is the discrete time step. The Koopman framework proposes an alternative representation of system (1) by introducing so-called observable functions , with a Banach function space and . As described in , the Koopman operator associated with (1) and is defined through:
where denotes function composition and (2) is equal to:
An important property of the Koopman operator is that it is linear when
is a vector space of functions. Considering two observables and scalars , if (2) holds, then it implies:
proving the linearity property. While often the existence of the Koopman operator requires to be spanned by an infinite number of basis functions, for practical purposes, an -dimensional linear subspace is considered, with . As detailed in , with a projection operator , the finite-dimensional approximation of the Koopman operator is given by:
In practice, the Koopman matrix representation is used, such that the element-wise relation is satisfied:
However, the main difficulty of the Koopman framework is the choice of the lifting functions such that the resulting observables generate a Koopman invariant subspace . Furthermore, it is often not clearly stated in the literature on the subject that a linear system whose dynamics are driven by the Koopman matrix is only equivalent in terms of behavior (collections of all solution trajectories) with the original nonlinear system (1) if explicit nonlinear constraints are defined for the initial condition of the lifted state. We explore this next using a simple example.
Ii-B Linear representations subject to nonlinear constraints
To illustrate the concept, consider a nonlinear system represented by (1) with a polynomial , similar to the one described in . Denote to be the element of . In this notation, the system dynamics are described as follows:
with constant parameters . By considering solutions of (8) only on with initial condition , the feasible trajectories are given by:
To represent the system in the Koopman form, the following observables are chosen: , and . Then, the dynamics are represented by:
Based on (10), consider the system of dimension , with ; the solution set is described as:
Note that (11) represents an unrestricted LTI behavior. It is easy to show that , as any with for which will not correspond to a solution of (8), i.e. . By introducing the constraint , the solution set (11) with constraint is:
Then, it is possible to show that . Our example shows that, to have a bijective relation between the solution sets, additional constraints need to be imposed on the Koopman form, or, as we call it now, embedding of (1).
Ii-C Observability, constructability and extension to inputs
Ii-C1 Autonomous case
Consider the system (1) having the output defined as:
with and is the composition of with itself times. As described in , the system satisfies the observability rank condition at if the rank of the analytical map222The rank of the Jacobian matrix of at . is equal to . If this condition is met, the system is strongly locally observable at , where is a neighbourhood of and there exists an analytic function , such that .
In the Koopman form, assuming that the output function is in the span of the lifted states, i.e., , with , the observability map can be defined as follows:
where, as observed in section II-B, it is also necessary to consider the nonlinear constraints . The map should be locally invertible to uniquely determine , i.e. there exists , such that . This is a different point of view than in the work , where the observability notions are discussed based on an explicit definition of the lifting map. Similar to (15), the notion of constructability refers to uniquely determining using current and past measurements, that is, with . is the constructability map and denotes its inverse. In the proposed ANN implementation, we formulate the encoder as a nonlinear function in order to reconstruct a state that can be associated with the Koopman form of the nonlinear system.
Ii-C2 Systems with input
To extend the considerations to the non-autonomous case, we consider the class of nonlinear control-affine systems:
with a potential nonlinear function and input . The treatment of the inputs in the lifted form is a topic of debate with many different approaches present in the literature. In general, an LTI form is assumed due to its ease of use with existing linear control methods . However, this form may be insufficient to capture the nonlinear behavior of (16). Based on the results of  for continuous time, we consider an input-affine Koopman form:
Let and . The observability map is described as:
where, for ease of readability, , represents a nonlinear function containing all the remaining terms in the expansion of and . As can be seen, for the lifted form (17), to determine for future input and output data requires the inversion of an even more complex nonlinear map. The same holds for constructability, where the aim is to determine based on past input-output data. Similar to the autonomous case, the encoder is formulated as a nonlinear function that estimates the inverse of the constructability map to determine the lifted state using past measurements.
Iii Identification framework
Based on the the constructability map and the state-dependent affine mapping for the input discussed in Section II, we have now all the ingredients to develop an identification method for a data-driven Koopman embedding of a nonlinear system without prior selection of the observables. Due to the shown nonlinearity of the constructability map, a deep-ANN-based function estimator is needed to determine the state basis .
Iii-a Data generating system
Similar to (16), the data generating system is considered to be a nonlinear control affine system:
with being an additive zero-mean (possibly coloured) noise. The stochastic system in (18) corresponds to an OE noise setting where our objective is to estimate a model of the deterministic (process) part. Next, we define the chosen model structure for the lifted Koopman form.
Iii-B Model structure
We apply the estimation concept from ,  and develop an implementation of the resulting method using deepSI. To represent the input contribution in the lifted model, we consider an input-affine formulation. The chosen model structure is:
where is the lifted state, is the input, is the model output and is the measured system output. In the proposed model (19), is the Koopman matrix and is a nonlinear function of the lifted state. We construct the model using a linear output map such that the outputs of the original model are spanned by the lifting functions. If state measurements are available, we can also enforce that the states can be recovered by a linear mapping. The neural network is constructed using the encoder function and the nonlinear map (both implemented using feedforward neural nets), and the linear maps and . The subscript represents the parameters (weights and biases) of the neural network. The main advantage of using the Koopman model structure (19a) is that it can be viewed as a Linear Parameter Varying (LPV) system, for which numerous control techniques have been developed (see ).
The orders and and their selection corresponds to the classical problem of model structure selection in system identification and hence it is out of the scope of the current paper. Furthermore, the proposed network architecture can also handle full state measurements, in this case the output function being an identity function.
Iii-C Cost function and estimation procedure
The computation of the simulated response and the corresponding gradient for ANN optimization has a heavy computational cost, which can become intractable when large data sets are used. To deal with this, a trade-off proposed in ,  is to construct a cost function using subsections of the data set and, starting from an index , perform a -step ahead prediction. The cost function is formulated as follows:
where is computed through recursive iterations of (19a) starting from . The initial lifting to is performed through the encoder function as follows:
where estimates the inverse of the constructability map, using past input-output data to determine the lifted state . The notation represents the sets of past outputs and inputs. In the case of full state availability, for numerical reasons, the initial lifted state is computed via the encoder function based on the previous time step of the original state and input (21a).
Iii-D Batch optimization
As detailed in , eq. (20a) allows for the parallelization of the computations, allowing for the cost of each section to be computed individually. As such, the computation time is greatly decreased and, moreover, a batch cost function can be utilized, summing over only a subset of sections:
with . This reformulation offers the possibility of using powerful optimization algorithms such as Adam .
Iv Experiments and results
We demonstrate the performance of the proposed method on the simulation example of an autonomous Van der Pol oscillator with full state measurements and the Silverbox benchmark system, which is a real-world setup with only input-output data available.
Iv-a Van der Pol
We consider the dynamics of an unforced Van der Pol oscillator :
with . The continuous time system is discretized using the Runge-Kutta 4 numerical formula with a sampling frequency of
Hz. Training, validation and test trajectories are generated starting from initial conditions that are uniformly distributed, each trajectory having a length of 501 data points. White Gaussian noise
is added to the simulated state trajectories such that a Signal-to-Noise Ratio (SNR) of 20 dB is achieved per each individual channel (note that the test data is noiseless). For training, 80 sets of trajectories, for validation, 20 sets and, for testing, 10 sets are generated.
The lifting function given by (21a) (without the input term) is implemented as a feedforward neural network, with 1 hidden layer, 100 nodes and tanh activation. The parameters are initialized by sampling from a uniform distribution , with , and represents the number of inputs to the layer. We consider a lifting dimension , a prediction horizon value and a batch size of 256. For training, Adam batch optimization  is used, with a learning rate of and the exponential decay rates set to: and . We utilize early stopping, as described in 
, by computing the simulation error on the validation data set after each epoch. We then select the parameters for which the validation cost is minimal. After the training phase, the model along the epochs with the lowest simulation error is used for analysis.
Fig. 2 shows a set of noisy time-domain trajectories used as training data. Fig. 3 depicts one realization of the noiseless test set and the simulated state trajectories of the identified Koopman model, alongside the residuals. We observe that the proposed Koopman-based encoder is able to capture the oscillating dynamics of the test system with acceptable simulation error. This behavior can also be observed in the phase portrait depicted in Fig. 4. The quality of the model is assessed in terms of the Normalized Root Mean Square (NRMS) and RMS errors:
where the total RMS error is computed as the mean of the RMS error per section of data and
is the standard deviation of all test outputs.is the total length of a section of test data and is the starting point () for the input-output case and for the full state availability case). In terms of this error measure, the following results are obtained on the test data by the estimated Koopman model:
where the given error measures are the mean NRMS and RMS over the two state trajectories. These errors can be mostly attributed to the mismatch at the peaks of the sharp rising slopes, as seen in Fig. 2. However, the identified linear system based on the full state Koopman encoder is able to represent the nonlinear dynamics of the Van der Pol oscillator, successfully recovering the limit cycle. The results are quite satisfactory given that noisy training and validation data sets with noise (in terms of power) are used.
To illustrate the capabilities of the proposed Koopman encoder structure when no state measurements are available, we use measured data from the Silverbox benchmark , which is a real-world electrical implementation of a mass-spring-damper system with a cubic spring, similar to the forced Duffing oscillator. The first part of the input signal is a filtered Gaussian noise sequence with linearly increasing amplitude, and the rest is generated as a multisine signal. Fig. 5 shows the separation of data, with the note that both the multisine and filtered Gaussian (arrowhead part) are used for testing and assessing the quality of the identified model.
For this experiment, both encoder (21b) and nonlinear input function
are implemented through feedforward neural networks, the former with 2 hidden layers and the latter with 1, each having 40 neurons per layer. The encoder settings are:, , and a batch size of 256 is used. The initial parameters are sampled from a uniform distribution as detailed in the Van der Pol example and the same Adam batch optimization algorithm is used. The learning rate is set to and the exponential decay rates are chosen as and .
As illustrated in Fig. 6, the identified model can accurately represent the dynamics of the original system, when a multisine test signal is applied. However, when the arrowhead test input data is applied, the error highly increases towards the end of the simulation, as Fig. 6 depicts. The problem in the extrapolation region is due to a mismatch between the representation of the nonlinearity in the model and the true polynomial nonlinearity. Methods which explicitly use polynomial basis may perform better in this regard . Table I presents the NRMS and RMS error values. If we discard the extrapolation region of the arrowhead test signal, the obtained errors are comparable to the state of the art .
|Arrowhead - no extrapol.||0.00811||0.00033|
The present paper formulates a Koopman identification method as a nonlinear identification problem, using a neural network estimator consistent with the inverse of the constructability map. Furthermore, the effect of the input is accounted for in the Koopman model through an input-affine description. We have shown that this approach can successfully capture the dynamics of the underlying nonlinear system through motivating examples, for both full and partial state availability.
-  Schoukens, J. and Ljung, L., “Nonlinear System Identification: A User-Oriented Roadmap,” IEEE Control Systems Magazine, Volume 39, no. 6, pp. 28-99, 2019
-  Beintema, G., Tóth, R. and Schoukens., M., “Nonlinear state-space identification using deep encoder networks,” Proceedings of Learning for Dynamics and Control (L4DC), 2021
-  Beintema, G., Tóth, R. and Schoukens., M., “Non-linear State-space Model Identification from Video Data using Deep Encoders,” 19th IFAC Symposium on System Identification (SYSID), 2021
-  Mauroy, A., Mezić, I. and Susuki, Y., The Koopman Operator in Systems and Control, Springer International Publishing, 2020
-  Bonnert, M. and Konigorski, U., ”Estimating Koopman Invariant Subspaces of Excited Systems Using Artificial Neural Networks,” IFAC-PapersOnLine, Volume 53, Issue 2, pp. 1156-1162, 2020
Lusch, B., Kutz, J.N. and Brunton, S.L., ”Deep learning for universal linear embeddings of nonlinear dynamics,”Nature Communications, Volume 9, Article no. 4950, 2018
-  Heijden, B.V.D., Ferranti, L., Kober, J. and Babuška, R., “DeepKoCo: Efficient latent planning with an invariant Koopman representation,” arXiv preprint, arXiv:2011.12690, 2020
-  Otto, S.E. and Rowley, C.W., ”Linearly Recurrent Autoencoder Networks for Learning Dynamics”, SIAM Journal on Applied Dynamical Systems, Volume 18, Issue 1, pp. 558-593, 2019
-  Yeung, E., Kundu, S. and Hodas, N.O., “Learning Deep Neural Network Representations for Koopman Operators of Nonlinear Dynamical Systems,” American Control Conference (ACC), pp. 2832-4839, 2019
-  Wigren, T. and Schoukens, J., ”Three free data sets for development and benchmarking in nonlinear system identification,” European Control Conference (ECC), pp. 2933-2938, 2013
-  Mauroy, A. and Goncalves, J., ”Koopman-Based Lifting Techniques for Nonlinear Systems Identification,” IEEE Transactions on Automatic Control, Volume 65, no. 6, pp. 2550-2565, 2020
-  Rowley, C., Mezić, I., Bagheri, S., Schlatter, P. and Henningson, D., ”Spectral analysis of nonlinear flows,” Journal of Fluid Mechanics, Volume 641, pp. 115-127, 2009
-  Williams, M.O., Kevrekidis, I.G. and Rowley, C.W., ”A Data–Driven Approximation of the Koopman Operator: Extending Dynamic Mode Decomposition,” Journal of Nonlinear Science, Volume 25, Issue 6, pp. 1307-1346, 2015
-  Brunton, S.L., Brunton, B.W., Proctor, J.L. and Kutz, J.N., ”Koopman Invariant Subspaces and Finite Linear Representations of Nonlinear Dynamical Systems for Control,” The Public Library of Science ONE, Volume 11, Issue 2, 2016
-  Brunton, S., Budisic, M., Kaiser, E. and Kutz, J., “Modern Koopman Theory for Dynamical Systems,” arXiv preprint, arXiv:2102.12086, 2021
-  Korda, M. and Mezic, I., “Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control,” Automatica, Volume 93, pp. 149-160, 2018
-  Khalil, H.K., Nonlinear Systems, Third Edition, Prentice Hall, 2002
-  Nijmeijer, H., ”Observability of autonomous discrete time non-linear systems: a geometric approach,” International Journal of Control, Volume 36, no. 5, pp. 867-874, 1982
-  Kingma, D.P. and Ba, J., ”Adam: A Method for Stochastic Optimization,” International Conference on Learning Representations (ICLR), 2015
-  Takeishi, N., Kawahara, Y. and Yairi, T., ”Learning Koopman Invariant Subspaces for Dynamic Mode Decomposition,” International Conference on Neural Information Processing Systems (NIPS), 2017
-  Mohammadpour, J. Scherer and C.W., Control of Linear Parameter Varying Systems with Applications, Springer-Verlag New York, 2012
-  Surana, A., ”Koopman Operator Based Observer Synthesis for Control-Affine Nonlinear Systems,” IEEE 55th Conference on Decision and Control (CDC), pp.6492-6499, 2016