1 Introduction
Machine learning tools such as neural networks (NNs) are becoming standard tools for modeling control systems. However, as a general problem, system properties such as stability, controllability, and stabilizability are not inherited by learned models. In other words, a question remains: how to incorporate known system properties as prior information for learning. In this context, there are recent approaches to learning stable autonomous dynamics by NNs [3, 7]. In [3], a term enforcing stability has been included in the loss function, and states not satisfying a stability condition have been added to the training data in each iteration. In [7], an NN parametrization of stable system dynamics has been proposed by simultaneously modeling the system dynamics and a Lyapunov function. By this approach, it is explicitly guaranteed that the origin of the modeled system dynamics is globally exponentially stable for all possible NN parameters.

In this paper, beyond autonomous dynamics, we consider control
dynamics and provide an NN parametrization of a stabilizable drift vector field when the input vector field is given. The proposed parametrization theoretically guarantees that the learned drift vector field is stabilizable. Moreover, as a byproduct of the proposed approach, we can learn not only a drift vector field but also a stabilizing controller and a Lyapunov function of the closed-loop system, i.e., a control Lyapunov function
[13]. These are utilized to analyze when the true system is stabilized by the learned controller.

The proposed learning method has freedom in its tuning parameters, which can be exploited to solve various control problems in addition to learning a stabilizable drift vector field. This is illustrated by applying our method to solving nonlinear $H^\infty$ control problems and Hamilton-Jacobi inequalities (HJIs). This further suggests a way to modify the loss function for learning in order to solve Hamilton-Jacobi equations. In addition, establishing a bridge between our approach and HJIs gives a new look at the conventional method [7] for stable autonomous dynamics. From an inverse optimal control perspective [8], it is possible to show that the learning formula of [7] is optimal in some sense. This is also true for our method.
In a related attempt, modeling based on the Hammerstein-Wiener model has been done in [10]. However, its internal dynamics are limited to be linear, and the model is not structurally guaranteed to be stabilizable. Variants of [3, 7] are found for studying different classes of stable autonomous dynamics such as stochastic systems [9, 15], monotone systems [16], time-delay systems [12], and systems admitting positively invariant sets [14]. However, none of them considers control design. Differently from data-driven control for nonlinear systems, e.g., [1, 11], a stabilizable drift vector field, a stabilizing controller, and a Lyapunov function are learned at once, simply by representing them as NNs, which have high flexibility to describe nonlinearity.

The remainder of this paper is organized as follows. In Section 2, the learning problem of stabilizable unknown drift vector fields is formally stated. Then, as a preliminary step, we review the conventional work [7] for learning stable autonomous systems. In Section 3, as the main result, we present a novel method for simultaneously learning stabilizable dynamics, a stabilizing controller, and a Lyapunov function of the closed-loop system, which are further exemplified in Section 4. Concluding remarks are given in Section 5.
Notation: Let $\mathbb{R}$ and $\mathbb{R}_{\geq 0}$ be the field of real numbers and the set of nonnegative real numbers, respectively. For a vector or a matrix, $\|\cdot\|$ denotes its Euclidean norm or its induced Euclidean norm, respectively. For a continuously differentiable function $V : \mathbb{R}^n \to \mathbb{R}$, the row-vector-valued function consisting of its partial derivatives is denoted by $\partial V/\partial x$. Similarly, the column-vector-valued function is denoted by $\nabla V := (\partial V/\partial x)^\top$. Furthermore, its Lie derivative along a vector field $f : \mathbb{R}^n \to \mathbb{R}^n$ is denoted by $L_f V := (\partial V/\partial x) f$. More generally, $L_g V := (\partial V/\partial x) g$ for a matrix-valued function $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$.
2 Preliminaries
2.1 Problem Formulation
Consider input-affine nonlinear systems, described by

(1) $\dot{x} = f(x) + g(x)u,$

where $f : \mathbb{R}^n \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ are locally Lipschitz continuous, and $f(0) = 0$.
Our goal in this paper is to design a stabilizing controller for the system (1) when the drift vector field $f$ is unknown, as stated below.
Problem 1
For the system (1), suppose that

1) the drift vector field $f$ is unknown, and $g$ is known;

2) for some input data, the corresponding output data are available.

From those available data, learn a stabilizing controller together with the drift vector field $f$.
In Problem 1, we assume that $\dot{x}$ is measurable, which is not an essential requirement. If $x$ is sampled evenly, one can apply a numerical differentiation method for computing $\dot{x}$ [4]. In the uneven case, one can utilize the adjoint method [5].
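For evenly sampled data, a minimal central-difference sketch is shown below (our own illustration; the method cited as [4] is a regularized scheme that is more robust to measurement noise):

```python
# Central-difference approximation of the state derivative from evenly
# sampled scalar trajectory data (a minimal sketch, not the regularized
# method of [4]).
def central_differences(xs, dt):
    """xs: samples x(0), x(dt), x(2*dt), ...; returns dx/dt estimates
    at the interior sample points."""
    return [(xs[i + 1] - xs[i - 1]) / (2.0 * dt) for i in range(1, len(xs) - 1)]

# Example: x(t) = t**2 sampled on a uniform grid, so dx/dt = 2*t.
dt = 0.01
xs = [(k * dt) ** 2 for k in range(101)]
dxs = central_differences(xs, dt)   # dxs[j] estimates dx/dt at t = (j + 1)*dt
```

For polynomial data of degree two, the central difference is exact up to floating-point rounding, which makes the sketch easy to sanity-check.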
To guarantee the solvability of the problem, we suppose that the system (1) is stabilizable in the following sense.
Definition 2
The system (1) is said to be (globally) stabilizable if there exist a scalar-valued function $V : \mathbb{R}^n \to \mathbb{R}$ and a locally Lipschitz continuous function $k : \mathbb{R}^n \to \mathbb{R}^m$ such that

1) $V$ is continuously differentiable;

2) $V$ is positive definite on $\mathbb{R}^n$, i.e., $V(x) \geq 0$ for all $x \in \mathbb{R}^n$, and $V(x) = 0$ if and only if $x = 0$;

3) $V$ is radially unbounded, i.e., $V(x) \to \infty$ as $\|x\| \to \infty$;

4) it follows that

(2) $L_f V(x) + L_g V(x) k(x) < 0$ for all $x \neq 0$.
The function $V$ is nothing but a Lyapunov function of the closed-loop system $\dot{x} = f(x) + g(x)k(x)$, which guarantees the global asymptotic stability (GAS) of the origin. In other words, $V$ is a control Lyapunov function (CLF) [13]. If a CLF is found, it is known that one can construct the following Sontag-type stabilizing controller [13]:
(3) $k_s(x) = \begin{cases} -\dfrac{L_f V(x) + \sqrt{(L_f V(x))^2 + \|L_g V(x)\|^4}}{\|L_g V(x)\|^2}\,\big(L_g V(x)\big)^\top, & L_g V(x) \neq 0, \\ 0, & L_g V(x) = 0. \end{cases}$
This is one of the well-known controllers for nonlinear control and has been investigated from various aspects such as inverse optimality; see, e.g., [8].
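A minimal sketch of the Sontag-type formula (3), assuming its standard form (the function name and the test values below are ours, for illustration only):

```python
import math

def sontag_controller(LfV, LgV):
    """Sontag-type stabilizing feedback: given the scalar L_f V(x) and the
    row vector L_g V(x) (as a list), return the input u as a list.
    Off the set {L_g V = 0}, the closed loop satisfies
    L_f V + L_g V . u = -sqrt((L_f V)^2 + ||L_g V||^4) < 0."""
    b2 = sum(v * v for v in LgV)          # ||L_g V||^2
    if b2 == 0.0:
        return [0.0 for _ in LgV]
    gain = -(LfV + math.sqrt(LfV ** 2 + b2 ** 2)) / b2
    return [gain * v for v in LgV]

u = sontag_controller(1.0, [2.0])         # here L_f V = 1 > 0: feedback must push V down
```

The closed-loop decrease can be checked directly: with `LfV = 1.0` and `LgV = [2.0]`, the quantity `1.0 + 2.0 * u[0]` is negative.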
In this paper, we simultaneously learn a drift vector field $f$, a CLF $V$, and a stabilizing controller $k$. One can further construct the Sontag-type controller (3) from the learned CLF and employ it instead of the learned $k$.
2.2 Learning stable autonomous dynamics
A neural network (NN) algorithm for learning stable autonomous dynamics has been proposed in [7]. An important feature of this algorithm is that the global exponential stability (GES) of the learned dynamics is theoretically guaranteed. In this subsection, we summarize this algorithm as a preliminary step toward solving Problem 1.
Consider the following autonomous system:

(4) $\dot{x} = f(x).$
Suppose that the origin is GES. Then, it is expected that there exists a Lyapunov function $V$ satisfying the following three conditions:

1) $V$ is continuously differentiable;

2) there exist $c_1, c_2 > 0$ such that $c_1 \|x\|^2 \leq V(x) \leq c_2 \|x\|^2$ for all $x \in \mathbb{R}^n$;

3) there exists $c_3 > 0$ such that $L_f V(x) \leq -c_3 V(x)$ for all $x \in \mathbb{R}^n$.
This is true if the Jacobian $\partial f/\partial x$ is continuous and bounded on $\mathbb{R}^n$; see, e.g., [6, Theorem 4.14].
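As a concrete toy instance of these three conditions (our own illustrative example, not from the paper), consider the scalar GES system $\dot{x} = -x$ with $V(x) = x^2$, for which items 1)-3) hold with $c_1 = c_2 = 1$ and $c_3 = 2$:

```python
# Toy numerical check of items 2) and 3) for xdot = f(x) = -x with
# V(x) = x**2 (illustrative example): item 2) holds with c1 = c2 = 1,
# and item 3) holds with c3 = 2 since dV/dx * f(x) = -2*x**2 = -2*V(x).
def V(x):
    return x * x

def dV_along_f(x):
    return 2.0 * x * (-x)   # L_f V(x) = (dV/dx) * f(x)

samples = [i / 10.0 for i in range(-50, 51)]
item2 = all(1.0 * x * x <= V(x) <= 1.0 * x * x for x in samples)
item3 = all(dV_along_f(x) <= -2.0 * V(x) for x in samples)
```

Of course, such a check on a finite grid is only a sanity test; the point of the parametrization below is that the conditions hold by construction for all parameters.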
To learn the unknown stable dynamics (4) by deep learning, we introduce two NNs. Let $\hat{f}$ and $\hat{V}$ denote the NNs corresponding to a nominal drift vector field and a Lyapunov function, respectively. By nominal, we emphasize that $\hat{f}$ itself does not represent the learned stable dynamics; instead, the stable drift $f$ is learned as a projection of $\hat{f}$ onto a set of stable dynamics.
First, we specify the structure of $V$ such that items 1) and 2) hold for arbitrary parameters of the NN. Define

(5) $V(x) = \sigma_{k+1}\big(\hat{V}(x) - \hat{V}(0)\big) + \varepsilon \|x\|^2,$

where $\varepsilon > 0$ is given. The function $\hat{V}$ is an input-convex neural network (ICNN) [2], described by
(6) $z_1 = \sigma_0(W_0 x + b_0), \quad z_{i+1} = \sigma_i(U_i z_i + W_i x + b_i), \ i = 1, \dots, k, \quad \hat{V}(x) = z_{k+1},$

where $U_i$ (with elementwise nonnegative entries) and $W_i$ represent the weights of the mappings from $z_i$ to the $(i+1)$-th layer and from $x$ to the $(i+1)$-th layer, respectively, and $b_i$ represent the biases of the $(i+1)$-th layer. Finally, the activation functions $\sigma_i$, $i = 0, \dots, k+1$, are the following smooth ReLU functions:
(7) $\sigma_i(x) = \begin{cases} 0, & x \leq 0, \\ x^2/(2d), & 0 < x < d, \\ x - d/2, & x \geq d, \end{cases}$
for some fixed $d > 0$. It has been shown by [7, Theorem 1] that $V$ constructed by (5)-(7) satisfies items 1) and 2) on $\mathbb{R}^n$ for arbitrary parameters $U_i$, $W_i$, $b_i$, and $\varepsilon > 0$.
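The smooth ReLU (7) can be sketched as a stand-alone function (the threshold `d` is the fixed parameter in (7); the default value `0.1` below is an arbitrary choice for illustration):

```python
def smooth_relu(x, d=0.1):
    """Smooth ReLU activation: zero for x <= 0, quadratic blend x^2/(2d)
    on (0, d), and linear x - d/2 for x >= d. The pieces match in value
    and first derivative, so the function is C^1 and convex."""
    if x <= 0.0:
        return 0.0
    if x < d:
        return x * x / (2.0 * d)
    return x - d / 2.0

# The branch boundaries join smoothly: at x = d the quadratic piece gives
# d/2, equal to the linear piece d - d/2.
boundary_value = smooth_relu(0.1, d=0.1)   # equals 0.05
```

Convexity of the activations (together with elementwise nonnegative weights on the hidden path) is what makes the ICNN output convex in its input.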
Next, we consider item 3). Let $\hat{f}$ be locally Lipschitz continuous on $\mathbb{R}^n$ and satisfy $\hat{f}(0) = 0$. One can confirm that the following $f$ satisfies item 3) for arbitrary $\hat{f}$ and $\alpha > 0$:

(8) $f(x) = \hat{f}(x) - \nabla V(x)\, \dfrac{\mathrm{ReLU}\big(L_{\hat{f}} V(x) + \alpha V(x)\big)}{\|\nabla V(x)\|^2},$

where $\mathrm{ReLU}(y) := \max\{0, y\}$. Since $\hat{f}$ is locally Lipschitz continuous, and $\nabla V(x) = 0$ if and only if $x = 0$, the projected $f$ is locally Lipschitz continuous on $\mathbb{R}^n \setminus \{0\}$ and satisfies $f(0) = 0$; this has not been explicitly mentioned by [7, Theorem 1].
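A pointwise sketch of this projection, assuming the ReLU form used in [7] (the function name and the concrete numbers are ours, for illustration):

```python
def project_stable(fhat, gradV, alphaV):
    """Stability projection at a fixed state x, in the form used by [7]:
    f = fhat - gradV * ReLU(gradV . fhat + alpha*V) / ||gradV||^2,
    so that gradV . f <= -alpha*V pointwise.
    fhat, gradV: lists (nominal drift and gradient of V at x);
    alphaV: the scalar alpha*V(x), positive for x != 0."""
    s = sum(a * b for a, b in zip(gradV, fhat)) + alphaV  # decrease-condition violation
    if s <= 0.0:                                          # already satisfied: keep fhat
        return list(fhat)
    n2 = sum(v * v for v in gradV)
    return [a - v * s / n2 for a, v in zip(fhat, gradV)]

# When the condition is violated, the projection enforces gradV . f = -alpha*V.
f = project_stable([1.0, 0.0], [1.0, 0.0], 0.5)
```

In the violated case the correction subtracts exactly the excess along the gradient direction, which is why the decrease condition holds with equality there.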
3 Learning stabilizing controllers
3.1 Main results
Inspired by Algorithm 1, we present an algorithm for solving Problem 1. Our approach is to learn a drift vector field $f$ and a controller $k$ such that the GAS of the closed-loop system is theoretically guaranteed.

To this end, we again employ $\hat{f}$ and $V$ and newly introduce an NN representing a controller, denoted by $k$. Recall that $V$ constructed by (5)-(7) satisfies items 1)-3) of Definition 2. Therefore, the remaining requirement is item 4), which holds for arbitrary $\hat{f}$, $V$, and $k$ if $f$ is learned by
(12) $f(x) = \hat{f}(x) - \nabla V(x)\, \dfrac{\mathrm{ReLU}\big(L_{\hat{f}} V(x) + L_g V(x) k(x) + \rho(x)\big)}{\|\nabla V(x)\|^2},$

where $\rho$ is a given locally Lipschitz continuous positive definite function. The formula (12) can be viewed as a projection of $\hat{f}$ onto a set of drift vector fields stabilizable by the given $k$. Indeed, $k$ stabilizes the learned $f$, as stated below.
Theorem 3
As mentioned above, $V$ satisfies items 1)-3) of Definition 2. By a reasoning similar to that of the previous subsection, $f$ is locally Lipschitz continuous on $\mathbb{R}^n \setminus \{0\}$ and satisfies $f(0) = 0$.

Next, it follows from (12) that

(13) $L_f V(x) + L_g V(x) k(x) \leq -\rho(x) < 0 \quad \text{for all } x \neq 0.$

Therefore, the closed-loop system $\dot{x} = f(x) + g(x)k(x)$ is GAS at the origin. Finally, the origin is GES if $\rho(x) = \alpha V(x)$ with $\alpha > 0$, since there exist $c_1, c_2 > 0$ such that $c_1\|x\|^2 \leq V(x) \leq c_2\|x\|^2$, as mentioned above.
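Analogously to the autonomous case, the controlled projection can be sketched pointwise under the same assumed ReLU form (the function name and the test values are ours, not from the paper):

```python
def project_stabilizable(fhat, gk, gradV, rho):
    """Sketch of the controlled projection at a fixed state x: given the
    nominal drift fhat(x), the controlled direction g(x)k(x), grad V(x),
    and a positive definite rho(x), return f(x) such that
    gradV . (f + g k) <= -rho  (assumed ReLU form, mirroring (8)).
    fhat, gk, gradV: lists; rho: positive scalar."""
    s = sum(v * (a + b) for v, a, b in zip(gradV, fhat, gk)) + rho
    if s <= 0.0:
        return list(fhat)
    n2 = sum(v * v for v in gradV)
    return [a - v * s / n2 for a, v in zip(fhat, gradV)]

# Violated case: the projected closed loop satisfies gradV . (f + gk) = -rho.
f = project_stabilizable([1.0, 0.0], [0.5, 0.0], [1.0, 0.0], 0.25)
```

Note that the correction is applied to the drift only; the controller term enters the violation test but is left untouched, matching the idea that $k$ stabilizes the projected $f$.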
Remark 4
If one only requires the GAS of the closed-loop system, the activation functions do not need to be the smooth ReLU functions (7). Because of the quadratic term in (5), $V$ satisfies items 1)-3) of Definition 2 as long as the activation functions are continuously differentiable and the first term of (5) is positive semidefinite. Moreover, the activation functions can be selected as vector-valued functions, which is also true in the GES case.
One notices that the learning formula (12) of $f$ does not depend on the input data. Therefore, the training data can be generated from the system trajectories. In other words, we only have to choose the data in Problem 1 and employ the loss function (9). The proposed learning algorithm is summarized in Algorithm 2 below, where we again use the following compact description of (12):
(14) 
At the end of this subsection, we take the learning error of $f$ into account. Let $f + \Delta f$ denote the true drift vector field, where $f$ is the one learned by Algorithm 2. As expected, if the learning error $\Delta f$ is small, then the learned controller stabilizes the true system as well, as stated below.
Theorem 5
Let us use the same notation as in Theorem 3, and let $\Omega_c$ denote a level set of $V$, i.e., $\Omega_c := \{x \in \mathbb{R}^n : V(x) \leq c\}$. Also, define the following set:

If there exists $c > 0$ such that the above condition holds, then $\Omega_c$ is a region of attraction for the true closed-loop system.
3.2 Applications to $H^\infty$ control
In Algorithm 2, there is freedom in the structures of the involved functions. Utilizing this freedom, one can impose control performance requirements in addition to the closed-loop stability. To illustrate this, we apply Algorithm 2 to designing an $H^\infty$ controller.
Consider the following system:
(15) 
where $w$ and $z$ denote the disturbance and the performance output, respectively. The functions in (15) are locally Lipschitz continuous.
We consider designing a feedback controller such that for a given $\gamma > 0$, the closed-loop system satisfies

(16) $\displaystyle \int_0^T \|z(t)\|^2\, dt \leq \gamma^2 \int_0^T \|w(t)\|^2\, dt \quad \text{for all } T \geq 0$

when $x(0) = 0$.
when . According to [8], this control problem is solvable if there exists a continuously differentiable positive definite function such that
Then, from (3.1), one only has to choose $\rho$ in Algorithm 2 such that

(17) 

If $\rho$ is positive definite, the closed-loop stability is also guaranteed when $w = 0$.
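As a toy numerical check of a gain condition of the form (16), consider the scalar system $\dot{x} = -x + w$, $z = x$ (our illustrative example, not the paper's; its $L_2$ gain equals 1), so (16) should hold for any $\gamma > 1$ when $x(0) = 0$:

```python
# Toy check of an L2-gain bound of the form (16) for the scalar system
#   xdot = -x + w,  z = x,  x(0) = 0   (illustrative example, L2 gain = 1).
def l2_gain_ratio(dt=1e-3, steps=10000):
    """Forward-Euler simulation over [0, steps*dt]; returns the ratio
    of the integrals int ||z||^2 dt / int ||w||^2 dt."""
    x, int_z2, int_w2 = 0.0, 0.0, 0.0
    for i in range(steps):
        w = 1.0 if i * dt < 5.0 else 0.0   # a square-pulse disturbance
        int_z2 += dt * x * x
        int_w2 += dt * w * w
        x += dt * (-x + w)
    return int_z2 / int_w2

ratio = l2_gain_ratio()   # strictly below 1 for this finite-energy input
```

A single disturbance trajectory can of course only falsify, not certify, the gain bound; the point of (17) is to enforce the bound by construction.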
3.3 Applications to Hamilton-Jacobi inequalities
As a byproduct of Algorithm 2, a Lyapunov function of the closed-loop system is also learned. We apply this fact to solving the following Hamilton-Jacobi inequality (HJI):
(18) $L_f V(x) - \dfrac{1}{4} L_g V(x) R^{-1}(x) \big(L_g V(x)\big)^\top + q(x) \leq 0$

with respect to $V$ for given $q$ and $R$, where $R(x)$ is symmetric and positive definite for all $x$.
One can solve the HJI by specifying the structure of $k$ in Algorithm 2 into

(19) $k(x) = -\dfrac{1}{2} R^{-1}(x) \big(L_g V(x)\big)^\top.$
Indeed, it follows from (3.1) and (19) that
From the above, one may also notice that the Hamilton-Jacobi equation (HJE), i.e., (18) with equality, can be solved approximately by making the residual of (18) small. This, for instance, can be done by replacing the loss function (9) with

(20) 

where the HJE residual term is weighted by a positive constant. From standard arguments of optimal control [8], this further implies that the controller $k$ in (19) is an approximate solution to the following optimal control problem:
(21) $\displaystyle \min_{u} \int_0^\infty \Big( q(x(t)) + u(t)^\top R(x(t))\, u(t) \Big)\, dt \quad \text{subject to (1)}.$
That is, Algorithm 2 can also be employed for solving the optimal control problem (21) approximately.
3.4 Revisiting learning stable autonomous dynamics
In the previous subsection, we have established a bridge between Algorithm 2 and optimal control. In fact, an optimal control perspective gives a new look at the formula (8) for learning stable autonomous dynamics.
Theorem 6
For arbitrary of locally Lipschitz on and of class , it follows that
for
(22) 
where is arbitrary. Moreover, in (8) satisfies when .
The above theorem and discussion in the previous subsection imply that when , the controller with in (8) is an optimal controller of
for in (22) and defined by
That is, the learning formula (8) is optimal in this sense. A similar remark holds for the learning formula (12) in stabilizing control design.
4 Examples
In this section, we illustrate Algorithm 2. As the system dynamics, we consider the following van der Pol oscillator:

(23) $\dot{x}_1 = x_2, \quad \dot{x}_2 = \mu(1 - x_1^2)\,x_2 - x_1 + u.$

When $\mu > 0$, this system has a stable limit cycle, and thus the origin is unstable. If the drift vector field is known, this system is stabilizable by a suitable choice of feedback.
For stabilizing control design, training data points are equally distributed over the considered region. We choose the weighting term of the Lyapunov candidate (5) and the positive definite function in (12), together with the remaining parameter of Algorithm 2. For optimization, the adaptive moment estimation (Adam) is employed.
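For reference, a minimal Euler simulation consistent with a standard controlled van der Pol model can be sketched as follows (the parameter `mu = 1.0` and the damping feedback `u = -2*x2` are our illustrative choices, not the paper's data or learned controller):

```python
# Euler simulation of a standard controlled van der Pol oscillator:
#   x1' = x2,  x2' = mu*(1 - x1**2)*x2 - x1 + u,   g(x) = [0, 1]^T.
# With u = 0, a small initial condition spirals away from the unstable
# origin toward the limit cycle; the simple damping feedback u = -2*x2
# (one stabilizing choice when the drift is known) drives the state to 0.
def simulate(u_fn, x0, dt=1e-3, steps=20000, mu=1.0):
    x1, x2 = x0
    for _ in range(steps):
        u = u_fn(x1, x2)
        x1, x2 = x1 + dt * x2, x2 + dt * (mu * (1 - x1 ** 2) * x2 - x1 + u)
    return x1, x2

open_loop = simulate(lambda x1, x2: 0.0, (0.1, 0.0))      # converges to the limit cycle
closed_loop = simulate(lambda x1, x2: -2.0 * x2, (0.1, 0.0))  # converges to the origin
```

For the closed loop, $V = (x_1^2 + x_2^2)/2$ gives $\dot{V} = -(1 + x_1^2) x_2^2 \leq 0$, which is consistent with the observed convergence.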
Figure 2 shows the phase portrait of the learned dynamics. As shown in Fig. 2, the learned dynamics have a stable limit cycle. Thus, Algorithm 2 preserves a topological property of the true dynamics. Also, we plot the learned Lyapunov function and controller in Figs. 3 and 4, respectively. It can be confirmed that the learned Lyapunov function is positive definite, which implies that the learned controller is a stabilizing controller for the learned drift vector field. Since the learned Lyapunov function is a CLF, the Sontag-type controller (3) can also be constructed, which is plotted in Fig. 5.
As confirmed by Fig. 6, the learned controller stabilizes the learned dynamics. We also apply this controller to the true dynamics. According to Fig. 7, the true system is also stabilized. However, the conditions in Theorem 5 do not hold at some of the data points.
To make the conditions in Theorem 5 hold, we reselect the parameters of Algorithm 2. In this case, the learned dynamics, controller, and Lyapunov function obtained by Algorithm 2 satisfy the conditions. Thus, it is guaranteed that the learned controller is a stabilizing controller of the true dynamics.
We plot the phase portrait of the newly learned dynamics in Fig. 8, which looks similar to Fig. 2. Also, the learned dynamics again preserve the limit cycle, as confirmed by Fig. 9. Therefore, the learned drift vector fields are not sensitive with respect to the parameters of Algorithm 2, at least in this example. Next, we plot the learned Lyapunov function and controller in Figs. 10 and 11, respectively. The new controller in Fig. 11 takes much larger values than the previous one in Fig. 4. According to Figs. 4 and 11, the reason may be to increase the convergence speed in a particular direction, which can be attributed to the conservativeness of Theorem 5. Future work includes deriving a less conservative condition for the closed-loop stability of the true dynamics.
5 Conclusion
In this paper, we have developed an algorithm for learning stabilizable dynamics. We have theoretically guaranteed that the learned dynamics are stabilizable by simultaneously learning stabilizing controllers and Lyapunov functions of the closed-loop systems. It is expected that the proposed algorithm can be applied to various control problems, as partly illustrated by $H^\infty$ control and optimal control. Furthermore, the proposed method can be extended to learning dynamics that can be made dissipative with respect to an arbitrary supply rate by control design and to finding control barrier functions for safety control, which will be reported in future publications.
References
 [1] (2021) Data-based system analysis and control of flat nonlinear systems. arXiv:2103.02892.
 [2] (2017) Input convex neural networks. In International Conference on Machine Learning, pp. 146–155.
 [3] (2019) Neural Lyapunov control. Advances in Neural Information Processing Systems 32.
 [4] (2011) Numerical differentiation of noisy, nonsmooth data. International Scholarly Research Notices.
 [5] (2018) . Advances in Neural Information Processing Systems 31.
 [6] (2002) Nonlinear Systems. Third edition, Prentice Hall.
 [7] (2019) Learning stable deep dynamics models. Advances in Neural Information Processing Systems 32, pp. 11128–11136.
 [8] (1998) Stabilization of Nonlinear Uncertain Systems. Springer.
 [9] (2020) Almost surely stable deep dynamics. Advances in Neural Information Processing Systems 33, pp. 18942–18953.
 [10] (2021) Structured Hammerstein-Wiener model learning for model predictive control. IEEE Control Systems Letters 6, pp. 397–402.
 [11] (2020) Data-driven internal model control of second-order discrete Volterra systems. Proc. 59th IEEE Conference on Decision and Control, pp. 4572–4579.
 [12] (2021) Learning stable deep dynamics models for partially observed or delayed dynamical systems. Advances in Neural Information Processing Systems 34.
 [13] (1989) A ‘universal’ construction of Artstein’s theorem on nonlinear stabilization. Systems & Control Letters 13 (2), pp. 117–123.
 [14] (2020) Learning dynamics models with stable invariant sets. arXiv:2006.08935.
 [15] (2020) ImitationFlow: learning deep stable stochastic dynamic systems by normalizing flows. Proc. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5231–5237.
 [16] (2020) Deep learning for stable monotone dynamical systems. arXiv:2006.06417.