
Learning Stabilizable Deep Dynamics Models

When neural networks are used to model dynamics, properties such as stability of the dynamics are generally not guaranteed. In contrast, there is a recent method for learning the dynamics of autonomous systems that guarantees global exponential stability using neural networks. In this paper, we propose a new method for learning the dynamics of input-affine control systems. An important feature is that a stabilizing controller and control Lyapunov function of the learned model are obtained as well. Moreover, the proposed method can also be applied to solving Hamilton-Jacobi inequalities. The usefulness of the proposed method is examined through numerical examples.


1 Introduction

Machine learning tools such as neural networks (NNs) are becoming one of the standard tools for modeling control systems. However, as a general problem, system properties such as stability, controllability, and stabilizability are not inherited by learned models. In other words, there still remains a question: how to implement pre-known system properties as prior information for learning. In this context, there are recent approaches to learning stable autonomous dynamics by NNs [3, 7]. In [3], a term forcing stability has been included in the loss function, and states not satisfying a stability condition have been added to the learning data in each iteration. In [7], an NN parametrization of stable system dynamics has been proposed by simultaneously modeling the system dynamics and a Lyapunov function. By this approach, it is explicitly guaranteed that the origin of the modeled system dynamics is globally exponentially stable for all possible NN parameters.

In this paper, beyond autonomous dynamics, we consider control dynamics and provide an NN parametrization of a stabilizable drift vector field when an input vector field is given. The proposed parametrization theoretically guarantees that the learned drift vector field is stabilizable. Moreover, as a byproduct of the proposed approach, we can learn not only a drift vector field but also a stabilizing controller and a Lyapunov function of the closed-loop system, i.e., a control Lyapunov function [13]. These are utilized to analyze when the true system is stabilized by the learned controller.

The proposed learning method has freedom in its tuning parameters, which can be exploited to solve various control problems in addition to learning a stabilizable drift vector field. This is illustrated by applying our method to solving nonlinear H∞-control problems and Hamilton-Jacobi inequalities (HJIs). This further suggests a way to modify the loss function for learning in order to solve Hamilton-Jacobi equations. In addition, establishing a bridge between our approach and HJIs gives a new look at the conventional method [7] for stable autonomous dynamics. From an inverse optimal control perspective [8], it is possible to show that the learning formula of [7] is optimal in some sense. This is also true for our method.

In a related attempt, modeling based on the Hammerstein-Wiener model has been done in [10]. However, its internal dynamics are limited to be linear, and the model is not structurally guaranteed to be stabilizable. Variants of [3, 7] are found for studying different classes of stable autonomous dynamics such as stochastic systems [9, 15], monotone systems [16], time-delay systems [12], and systems admitting positively invariant sets [14]. However, none of them considers control design. Differently from data-driven control for nonlinear systems, e.g., [1, 11], a stabilizable drift vector field, a stabilizing controller, and a Lyapunov function are learned at once, simply by specifying them as NNs, which have great flexibility in describing nonlinearity.

The remainder of this paper is organized as follows. In Section 2, the learning problem of stabilizable unknown drift vector fields is formally stated. Then, as a preliminary step, we review the conventional work [7] for learning stable autonomous systems. In Section 3, as the main result, we present a novel method for simultaneously learning stabilizable dynamics, a stabilizing controller, and a Lyapunov function of the closed-loop system, which are further exemplified in Section 4. Concluding remarks are given in Section 5.

Notation: Let $\mathbb{R}$ and $\mathbb{R}_{\geq 0}$ be the field of real numbers and the set of non-negative real numbers, respectively. For a vector or a matrix, $\|\cdot\|$ denotes its Euclidean norm or its induced Euclidean norm, respectively. For a continuously differentiable function $V : \mathbb{R}^n \to \mathbb{R}$, the row vector-valued function consisting of its partial derivatives is denoted by $\partial V/\partial x$. Similarly, the column vector-valued function is denoted by $\partial^\top V/\partial x$. Furthermore, the Lie derivative of $V$ along a vector field $f : \mathbb{R}^n \to \mathbb{R}^n$ is denoted by $L_f V := (\partial V/\partial x) f$. More generally, $L_g V := (\partial V/\partial x) g$ for a matrix-valued function $g$.
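For implementation, these Lie derivatives can be evaluated by automatic differentiation. A minimal sketch, assuming PyTorch and batched callables V and f (the function name and batching convention are ours, not the paper's):

```python
import torch

def lie_derivative(V, f, x):
    """Evaluate L_f V(x) = (dV/dx) f(x) on a batch of states x of shape (batch, n)."""
    x = x.detach().requires_grad_(True)
    # per-sample gradients dV/dx; samples are independent, so sum() is safe
    dVdx = torch.autograd.grad(V(x).sum(), x, create_graph=True)[0]
    return (dVdx * f(x)).sum(dim=1)  # one scalar per sample
```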

2 Preliminaries

2.1 Problem Formulation

Consider input-affine nonlinear systems, described by

$\dot{x} = f(x) + g(x)u$    (1)

where $f : \mathbb{R}^n \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ are locally Lipschitz continuous, and $f(0) = 0$.

Our goal in this paper is to design a stabilizing controller for the system (1) when the drift vector field $f$ is unknown, as stated below.

Problem 1

For the system (1), suppose that

  • $f$ is unknown, and $g$ is known;

  • for some input data $u$, the corresponding state data $x$ and their time derivatives $\dot{x}$ are available.

From those available data, learn a stabilizing controller together with the drift vector field $f$.

In Problem 1, we assume that $\dot{x}$ is measurable, which is not an essential requirement. If $x$ is sampled evenly in time, one can apply a numerical differentiation method for computing $\dot{x}$ [4]. In the unevenly sampled case, one can utilize the adjoint method [5].
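For evenly sampled data, a simple central-difference sketch is shown below; np.gradient is a basic stand-in here, not the total-variation method of [4] itself, and the file name and sampling period are assumptions:

```python
import numpy as np

dt = 0.01                                # assumed sampling period
xs = np.load("trajectory.npy")           # hypothetical data file, shape (T, n)
xdots = np.gradient(xs, dt, axis=0)      # central differences along time
data = list(zip(xs, xdots))              # pairs (x_s, dot-x_s) used for learning
```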

To guarantee the solvability of the problem, we suppose that the system (1) is stabilizable in the following sense.

Definition 2

The system (1) is said to be (globally) stabilizable if there exist a scalar-valued function $V : \mathbb{R}^n \to \mathbb{R}_{\geq 0}$ and a locally Lipschitz continuous controller $k : \mathbb{R}^n \to \mathbb{R}^m$ such that

  1. $V$ is continuously differentiable;

  2. $V$ is positive definite on $\mathbb{R}^n$, i.e., $V(x) \geq 0$ for all $x \in \mathbb{R}^n$, and $V(x) = 0$ if and only if $x = 0$;

  3. $V$ is radially unbounded, i.e., $V(x) \to \infty$ as $\|x\| \to \infty$;

  4. it follows that

     $\dfrac{\partial V}{\partial x}(x)\,(f(x) + g(x)k(x)) < 0$    (2)

     for all $x \neq 0$.

The function $V$ is nothing but a Lyapunov function of the closed-loop system $\dot{x} = f(x) + g(x)k(x)$, which guarantees global asymptotic stability (GAS) of the origin. In other words, $V$ is a control Lyapunov function (CLF) [13]. If a CLF is found, it is known that one can construct the following Sontag-type stabilizing controller [13]:

$k(x) = \begin{cases} -\dfrac{L_f V(x) + \sqrt{(L_f V(x))^2 + \|L_g V(x)\|^4}}{\|L_g V(x)\|^2}\,(L_g V(x))^\top, & L_g V(x) \neq 0,\\ 0, & L_g V(x) = 0. \end{cases}$    (3)

This is one of the well-known controllers for nonlinear control and has been investigated from various aspects, such as inverse optimality; see, e.g., [8].

In this paper, we simultaneously learn a drift vector field $f$, a CLF $V$, and a stabilizing controller $k$. One can further construct the Sontag-type controller (3) from the CLF and employ it instead of the learned $k$.
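For reference, a minimal numpy sketch of (3), assuming callables returning the scalar $L_f V(x)$ and the row vector $L_g V(x)$; the tolerance is an implementation choice:

```python
import numpy as np

def sontag_controller(LfV, LgV, x, tol=1e-9):
    """Sontag-type controller (3): u = k(x) from a CLF V."""
    a = LfV(x)                    # scalar L_f V(x)
    b = LgV(x)                    # shape (m,), row vector L_g V(x)
    b2 = float(b @ b)             # ||L_g V(x)||^2
    if b2 < tol:                  # the L_g V(x) = 0 branch of (3)
        return np.zeros_like(b)
    # note b2**2 = ||L_g V(x)||^4
    return -((a + np.sqrt(a**2 + b2**2)) / b2) * b
```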

2.2 Learning stable autonomous dynamics

A neural network (NN) algorithm for learning stable autonomous dynamics has been proposed by [7]. An important feature of this algorithm is that global exponential stability (GES) of the learned dynamics is guaranteed theoretically. In this subsection, we summarize this algorithm as a preliminary step toward solving Problem 1.

Consider the following autonomous system:

$\dot{x} = f(x)$    (4)

Suppose that the origin is GES. Then, it is expected that there exists a Lyapunov function $V$ satisfying the following three conditions:

  1. $V$ is continuously differentiable;

  2. there exist $c_1, c_2 > 0$ such that $c_1\|x\|^2 \leq V(x) \leq c_2\|x\|^2$ for all $x \in \mathbb{R}^n$;

  3. there exists $\alpha > 0$ such that $L_f V(x) \leq -\alpha V(x)$ for all $x \in \mathbb{R}^n$.

This is true if $\partial f/\partial x$ is continuous and bounded on $\mathbb{R}^n$; see, e.g., [6, Theorem 4.14].

To learn the unknown stable dynamics (4) by deep learning, we introduce two NNs. Let $\hat{f}$ and $V$ denote NNs corresponding to a nominal drift vector field and a Lyapunov function, respectively. By nominal, we emphasize that $\hat{f}$ itself does not represent the learned stable dynamics; the stable dynamics $f$ is learned as a projection of $\hat{f}$ onto a set of stable dynamics.

First, we specify the structure of $V$ such that items 1) and 2) hold for arbitrary parameters of the NN. Define

$V(x) = \sigma_k(\psi(x) - \psi(0)) + \varepsilon\|x\|^2$    (5)

where $\varepsilon > 0$ is given. The function $\psi$ is an input-convex neural network (ICNN) [2], described by

$z_{i+1} = \sigma_i(U_i z_i + W_i x + b_i), \quad i = 0, 1, \ldots, k-1, \qquad \psi(x) := z_k,$    (6)

with $z_0 := 0$ and $U_0 := 0$, where $U_i$ and $W_i$ represent the weights of the mappings from $z_i$ to the $(i+1)$st layer and from $x$ to the $(i+1)$st layer, respectively, and $b_i$ represent the biases of the $(i+1)$st layer; the entries of $U_i$ are restricted to be non-negative so that $\psi$ is convex in $x$. Finally, the activation functions $\sigma_i$, $i = 0, \ldots, k$, are the following smooth ReLU functions:

$\sigma_i(x) = \begin{cases} 0, & x \leq 0,\\ x^2/(2d), & 0 < x < d,\\ x - d/2, & x \geq d, \end{cases}$    (7)

for some fixed $d > 0$. It has been shown by [7, Theorem 1] that $V$ constructed by (5)–(7) satisfies items 1) and 2) on $\mathbb{R}^n$ for arbitrary parameters $U_i$, $W_i$, $b_i$, and $\varepsilon > 0$.
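A PyTorch sketch of the candidate (5)–(7) is given below. The layer widths, $\varepsilon$, $d$, and the softplus reparametrization that keeps the $U$-weights non-negative are illustrative choices, not the authors' exact settings:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmoothReLU(nn.Module):
    """Smooth ReLU (7): 0 for x <= 0, x^2/(2d) on (0, d), x - d/2 for x >= d."""
    def __init__(self, d=0.1):
        super().__init__()
        self.d = d
    def forward(self, x):
        return torch.where(x <= 0, torch.zeros_like(x),
               torch.where(x < self.d, x**2 / (2 * self.d), x - self.d / 2))

class LyapunovICNN(nn.Module):
    """Candidate (5): V(x) = sigma(psi(x) - psi(0)) + eps * ||x||^2,
    where psi is the ICNN (6)."""
    def __init__(self, n, widths=(64, 64), eps=1e-3, d=0.1):
        super().__init__()
        dims = list(widths) + [1]
        self.W = nn.ModuleList(nn.Linear(n, m) for m in dims)   # x -> layer maps
        # raw U-weights; softplus keeps the effective weights non-negative
        self.U = nn.ParameterList(
            nn.Parameter(0.1 * torch.randn(dims[i + 1], dims[i]))
            for i in range(len(dims) - 1))
        self.act, self.eps = SmoothReLU(d), eps
    def psi(self, x):
        z = self.act(self.W[0](x))
        for W, U in zip(self.W[1:], self.U):
            z = self.act(W(x) + z @ F.softplus(U).t())
        return z                                                # shape (batch, 1)
    def forward(self, x):
        v0 = self.psi(torch.zeros_like(x))                      # psi(0)
        return self.act(self.psi(x) - v0).squeeze(-1) + self.eps * (x**2).sum(dim=1)
```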

Next, we consider item 3). Let $\hat{f}$ be locally Lipschitz continuous on $\mathbb{R}^n$ and satisfy $\hat{f}(0) = 0$. One can confirm that the following $f$ satisfies item 3) for arbitrary $\hat{f}$ and $\alpha > 0$:

$f(x) = \begin{cases} \hat{f}(x) - \dfrac{\partial^\top V}{\partial x}(x)\,\dfrac{L_{\hat{f}} V(x) + \alpha V(x)}{\|\partial V/\partial x(x)\|^2}, & L_{\hat{f}} V(x) + \alpha V(x) > 0,\\ \hat{f}(x), & \text{otherwise}. \end{cases}$    (8)

Since $\hat{f}$ is locally Lipschitz continuous, and $\partial V/\partial x(x) = 0$ if and only if $x = 0$, the above $f$ is locally Lipschitz continuous and satisfies $f(0) = 0$; this has not been explicitly mentioned by [7, Theorem 1].
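A PyTorch sketch of this projection in its ReLU form (the compact description (10) below); the denominator clamp is an implementation detail:

```python
import torch

def project_stable(f_hat, V, x, alpha=0.1):
    """Project the nominal drift f_hat onto stable dynamics, cf. (8)/(10):
    f(x) = f_hat(x) - dV/dx^T * relu(L_fhat V(x) + alpha V(x)) / ||dV/dx||^2."""
    x = x.detach().requires_grad_(True)
    v = V(x)                                               # (batch,)
    dVdx = torch.autograd.grad(v.sum(), x, create_graph=True)[0]
    fh = f_hat(x)
    viol = torch.relu((dVdx * fh).sum(dim=1) + alpha * v)  # stability violation
    denom = (dVdx**2).sum(dim=1).clamp_min(1e-12)
    return fh - dVdx * (viol / denom).unsqueeze(1)
```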

Finally, to learn $f$ that fits the data $D = \{(x_s, y_s)\}$, we use the following loss function:

$\ell = \dfrac{1}{|D|} \sum_{(x_s, y_s) \in D} \|f(x_s) - y_s\|^2$    (9)

where $|D|$ is the cardinality of $D$. The learning algorithm is summarized in Algorithm 1, where we use the following compact description of (8), although the definition at the origin becomes vague:

$f(x) = \hat{f}(x) - \dfrac{\partial^\top V}{\partial x}(x)\,\dfrac{\mathrm{ReLU}\big(L_{\hat{f}} V(x) + \alpha V(x)\big)}{\|\partial V/\partial x(x)\|^2}$    (10)

where

$\mathrm{ReLU}(z) := \max\{0, z\}.$    (11)

Input: data $D$, parameters $\alpha, \varepsilon, d > 0$, and a learning rate
Initialize: the NN parameters of $\hat{f}$ and $V$
repeat
     select a minibatch $B \subset D$
     compute $f$ on $B$ by (10) and the loss (9)
     update the parameters of $\hat{f}$ and $V$ by a gradient step
until convergence
return $f$ and $V$
Algorithm 1 Learning Deep Stable Dynamics
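A condensed sketch of Algorithm 1, reusing LyapunovICNN and project_stable from above. The MLP for the nominal drift, the batch size, and the iteration count are assumptions, and the constraint $\hat{f}(0) = 0$ is omitted for brevity:

```python
import torch

n = 2                                            # assumed state dimension
f_hat = torch.nn.Sequential(torch.nn.Linear(n, 64), torch.nn.Tanh(),
                            torch.nn.Linear(64, n))
V = LyapunovICNN(n)
opt = torch.optim.Adam(list(f_hat.parameters()) + list(V.parameters()), lr=1e-3)

# data D = {(x_s, y_s)}; random placeholders stand in for real measurements
xs, ys = torch.randn(1000, n), torch.randn(1000, n)

for step in range(10_000):                       # "until convergence"
    idx = torch.randint(0, len(xs), (256,))      # select a minibatch
    f = project_stable(f_hat, V, xs[idx], alpha=0.1)   # stable f via (10)
    loss = ((f - ys[idx])**2).sum(dim=1).mean()        # loss (9)
    opt.zero_grad(); loss.backward(); opt.step()
```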

3 Learning stabilizing controllers

3.1 Main results

Inspired by Algorithm 1, we present an algorithm for solving Problem 1. Our approach is to learn a drift vector field $f$ and a controller $\hat{k}$ such that GAS of the closed-loop system is guaranteed theoretically.

To this end, we again employ $\hat{f}$ and $V$ and newly introduce an NN representing a controller, $\hat{k}$. Recall that $V$ constructed by (5)–(7) satisfies items 1) – 3) of Definition 2. Therefore, the remaining requirement is item 4), which holds for arbitrary $\hat{f}$, $V$, and $\hat{k}$ if $f$ is learned by

$f(x) = \begin{cases} \hat{f}(x) - \dfrac{\partial^\top V}{\partial x}(x)\,\dfrac{L_{\hat{f}} V(x) + L_g V(x)\hat{k}(x) + \rho(x)}{\|\partial V/\partial x(x)\|^2}, & L_{\hat{f}} V(x) + L_g V(x)\hat{k}(x) + \rho(x) > 0,\\ \hat{f}(x), & \text{otherwise}, \end{cases}$    (12)

where $\rho$ is a given locally Lipschitz continuous positive definite function. The formula (12) can be viewed as a projection of $\hat{f}$ onto a set of stabilizable drift vector fields for given $g$. Indeed, $\hat{k}$ stabilizes the learned $f$, as stated below.

Theorem 3

Consider $V$ constructed by (5)–(7), and locally Lipschitz continuous $\hat{f}$ and $\hat{k}$ such that $\hat{f}(0) = 0$ and $\hat{k}(0) = 0$. Also, let $\rho$ be locally Lipschitz continuous and positive definite. Then, for $f$ in (12), the closed-loop system $\dot{x} = f(x) + g(x)\hat{k}(x)$ is GAS (GES if $\rho(x) = \alpha V(x)$, $\alpha > 0$) at the origin.

As mentioned above, $V$ satisfies items 1) – 3) of Definition 2. By a reasoning similar to that in the previous subsection, $f$ is locally Lipschitz continuous and satisfies $f(0) = 0$.

Next, it follows from (12) that

$\dfrac{\partial V}{\partial x}(x)\,(f(x) + g(x)\hat{k}(x)) \leq -\rho(x), \quad \forall x \in \mathbb{R}^n.$    (13)

Therefore, the closed-loop system is GAS at the origin. Finally, the origin is GES if $\rho(x) = \alpha V(x)$, $\alpha > 0$, since there exist $c_1, c_2 > 0$ such that $c_1\|x\|^2 \leq V(x) \leq c_2\|x\|^2$ as mentioned above.
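A PyTorch sketch of the compact projection (14) below, assuming g returns a batch of $n \times m$ matrices and rho is a batched positive definite callable:

```python
import torch

def project_stabilizable(f_hat, k_hat, g, V, rho, x):
    """Projection (12)/(14): shift f_hat so that u = k_hat(x) renders
    dV/dx (f(x) + g(x) k_hat(x)) <= -rho(x) by construction."""
    x = x.detach().requires_grad_(True)
    dVdx = torch.autograd.grad(V(x).sum(), x, create_graph=True)[0]
    fh, u = f_hat(x), k_hat(x)
    gu = torch.einsum('bij,bj->bi', g(x), u)          # g(x) k_hat(x), (batch, n)
    viol = torch.relu((dVdx * (fh + gu)).sum(dim=1) + rho(x))
    denom = (dVdx**2).sum(dim=1).clamp_min(1e-12)
    return fh - dVdx * (viol / denom).unsqueeze(1)
```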

Remark 4

If one only requires GAS of the closed-loop system, the activation functions $\sigma_i$ need not be the smooth ReLU functions (7). Because of the term $\varepsilon\|x\|^2$ in (5), $V$ satisfies items 1) – 3) of Definition 2 if the $\sigma_i$ are continuously differentiable and $\sigma_k(\psi(x) - \psi(0))$ is positive semi-definite. Moreover, the $\sigma_i$ can be selected as vector-valued functions, which is also true in the GES case.

One notices that the learning formula (12) of $f$ does not depend on the input $u$. Therefore, training data for $f$ can be generated from a trajectory of the unforced system. In other words, we only have to choose $u = 0$ in Problem 1 and employ the loss function (9). The proposed learning algorithm is summarized in Algorithm 2 below, where we again use the following compact description of (12):

$f(x) = \hat{f}(x) - \dfrac{\partial^\top V}{\partial x}(x)\,\dfrac{\mathrm{ReLU}\big(L_{\hat{f}} V(x) + L_g V(x)\hat{k}(x) + \rho(x)\big)}{\|\partial V/\partial x(x)\|^2}$    (14)
Input: data $D$, input vector field $g$, parameters $\rho$ and $\varepsilon, d > 0$, and a learning rate
Initialize: the NN parameters of $\hat{f}$, $\hat{k}$, and $V$
repeat
     select a minibatch $B \subset D$
     compute $f$ on $B$ by (14) and the loss (9)
     update the parameters of $\hat{f}$, $\hat{k}$, and $V$ by a gradient step
until convergence
return $f$, $\hat{k}$, and $V$
Algorithm 2 Learning Deep Stabilizing Controllers
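Algorithm 2 then differs from the Algorithm 1 sketch only in the projection call and the extra controller network. A condensed sketch, reusing f_hat, V, xs, ys, and project_stabilizable from above; the input vector field $g(x) = (0, 1)^\top$, the choice $\rho = \alpha V$, and the omission of the constraint $\hat{k}(0) = 0$ are simplifying assumptions:

```python
import torch

n, m = 2, 1                                   # assumed state/input dimensions
k_hat = torch.nn.Sequential(torch.nn.Linear(n, 64), torch.nn.Tanh(),
                            torch.nn.Linear(64, m))
g = lambda x: (torch.tensor([[0.0], [1.0]])   # assumed constant g(x) = (0, 1)^T
               .unsqueeze(0).expand(x.shape[0], -1, -1))
rho = lambda x: 0.1 * V(x)                    # rho = alpha * V gives GES
params = (list(f_hat.parameters()) + list(k_hat.parameters())
          + list(V.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(10_000):
    idx = torch.randint(0, len(xs), (256,))
    f = project_stabilizable(f_hat, k_hat, g, V, rho, xs[idx])  # (14)
    loss = ((f - ys[idx])**2).sum(dim=1).mean()                 # loss (9)
    opt.zero_grad(); loss.backward(); opt.step()
```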

At the end of this subsection, we take the learning error of $f$ into account. Let $f + \Delta$ denote the true drift vector field, where $f$ is the one learned by Algorithm 2. As expected, if the learning error $\Delta$ is small, then the learned controller stabilizes the true system as well, as stated below.

Theorem 5

Let us use the same notation as in Theorem 3, and let $\Omega_c$ denote a level set of $V$, i.e., $\Omega_c := \{x \in \mathbb{R}^n : V(x) \leq c\}$. Also, define the following set $S$:

$S := \left\{x \in \mathbb{R}^n : \dfrac{\partial V}{\partial x}(x)\,\Delta(x) < \rho(x)\right\} \cup \{0\}.$

If there exists $c > 0$ such that $\Omega_c \subseteq S$, then $\Omega_c$ is a region of attraction for the true closed-loop system $\dot{x} = f(x) + \Delta(x) + g(x)\hat{k}(x)$.

It follows from Theorem 3 that

$\dfrac{\partial V}{\partial x}(x)\,(f(x) + g(x)\hat{k}(x)) \leq -\rho(x).$

From the representation (5)–(7) of $V$, $\partial V/\partial x$ is continuous, and along the true closed-loop dynamics this yields

$\dot{V}(x) \leq -\rho(x) + \dfrac{\partial V}{\partial x}(x)\,\Delta(x).$

Therefore, $\dot{V} < 0$ on $\Omega_c \setminus \{0\}$. The statement of the theorem follows from the fact that $\Omega_c$ is a level set of $V$.

3.2 Applications to H∞-control

In Algorithm 2, there are degrees of freedom in the choice of $\rho$ and the structure of $\hat{k}$. Utilizing them, one can impose control performance requirements in addition to closed-loop stability. To illustrate this, we apply Algorithm 2 to designing an H∞-controller.

Consider the following system:

$\dot{x} = f(x) + g(x)u + g_w(x)w, \quad z = h(x),$    (15)

where $w$ and $z$ denote the disturbance and the performance output, respectively. The functions $g_w$ and $h$ are locally Lipschitz continuous, and $h(0) = 0$.

We consider designing a feedback controller such that for a given $\gamma > 0$, the closed-loop system satisfies

$\int_0^T \|z(t)\|^2\, dt \leq \gamma^2 \int_0^T \|w(t)\|^2\, dt, \quad \forall T \geq 0,$    (16)

when $x(0) = 0$. According to [8], this H∞-control problem is solvable if there exists a continuously differentiable positive definite function $V$ such that

$\dfrac{\partial V}{\partial x}(x)\,(f(x) + g(x)k(x)) + \dfrac{1}{4\gamma^2}\|L_{g_w} V(x)\|^2 + \|h(x)\|^2 \leq 0.$

Then, from (3.1), one only has to choose $\rho$ in Algorithm 2 such that

$\rho(x) \geq \dfrac{1}{4\gamma^2}\|L_{g_w} V(x)\|^2 + \|h(x)\|^2.$    (17)

If $\rho$ is positive definite, closed-loop stability is also guaranteed when $w = 0$.
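Under the choice (17) as reconstructed above, the corresponding ρ can be assembled as follows (a sketch; g_w and h are assumed batched callables, and the function name is ours):

```python
import torch

def make_rho_hinf(V, g_w, h, gamma):
    """rho(x) = ||L_{g_w} V(x)||^2 / (4 gamma^2) + ||h(x)||^2, cf. (17)."""
    def rho(x):
        x = x.detach().requires_grad_(True)
        dVdx = torch.autograd.grad(V(x).sum(), x, create_graph=True)[0]
        LgwV = torch.einsum('bi,bij->bj', dVdx, g_w(x))   # row vector L_{g_w} V
        return (LgwV**2).sum(dim=1) / (4 * gamma**2) + (h(x)**2).sum(dim=1)
    return rho
```

The returned rho can then be passed to project_stabilizable in place of the default choice in the Algorithm 2 sketch.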

3.3 Applications to Hamilton-Jacobi inequalities

As a byproduct of Algorithm 2, a Lyapunov function of the closed-loop system is also learned. We apply this fact to solving the following Hamilton-Jacobi inequality (HJI):

$\dfrac{\partial V}{\partial x}(x)f(x) - \dfrac{1}{4}L_g V(x)\, R^{-1}(x)\, (L_g V(x))^\top + q(x) \leq 0$    (18)

with respect to $V$ for given $q$ and $R$, where $R(x)$ is symmetric and positive definite for all $x \in \mathbb{R}^n$.

One can solve the HJI by specifying the structure of $\hat{k}$ in Algorithm 2 into

$\hat{k}(x) = -\dfrac{1}{2}R^{-1}(x)\,g^\top(x)\,\dfrac{\partial^\top V}{\partial x}(x).$    (19)

Indeed, it follows from (3.1) and (19) that

$\dfrac{\partial V}{\partial x}f - \dfrac{1}{4}L_g V\, R^{-1}(L_g V)^\top + q = \dfrac{\partial V}{\partial x}(f + g\hat{k}) + \rho \leq 0$

for the choice $\rho(x) = q(x) + \frac{1}{4}L_g V(x)\,R^{-1}(x)(L_g V(x))^\top$, i.e., the learned $V$ satisfies the HJI (18).

From the above, one may also notice that the Hamilton-Jacobi equation (HJE), i.e., (18) with equality, can be solved approximately by making the left-hand side of (18) small. This, for instance, can be done by replacing the loss function (9) with

$\ell + \lambda\,\dfrac{1}{|D|}\sum_{(x_s, y_s) \in D}\Big|\dfrac{\partial V}{\partial x}(x_s)\big(f(x_s) + g(x_s)\hat{k}(x_s)\big) + \rho(x_s)\Big|$    (20)

where $\lambda > 0$ is the weight. From standard arguments of optimal control [8], this further implies that $\hat{k}$ in (19) is an approximate solution to the following optimal control problem:

$\min_u \int_0^\infty \big( q(x(t)) + u^\top(t)\,R(x(t))\,u(t) \big)\, dt.$    (21)

That is, Algorithm 2 can also be employed for solving the optimal control problem (21) approximately.
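A sketch of the residual term in the augmented loss (20), using the identity above (the residual $\partial V/\partial x\,(f + g\hat{k}) + \rho$ vanishes exactly when the HJE holds). Here f is a callable, e.g. the wrapped projection (14):

```python
import torch

def hje_residual(V, f, k_hat, g, rho, x):
    """|dV/dx (f(x) + g(x) k_hat(x)) + rho(x)|, the residual term in (20)."""
    x = x.detach().requires_grad_(True)
    dVdx = torch.autograd.grad(V(x).sum(), x, create_graph=True)[0]
    gu = torch.einsum('bij,bj->bi', g(x), k_hat(x))
    return ((dVdx * (f(x) + gu)).sum(dim=1) + rho(x)).abs()

# inside the training loop of Algorithm 2, with lam the weight in (20), e.g.:
#   f = lambda x: project_stabilizable(f_hat, k_hat, g, V, rho, x)
#   loss = fit_loss + lam * hje_residual(V, f, k_hat, g, rho, x).mean()
```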

3.4 Revisiting learning stable autonomous dynamics

In the previous subsection, we established a bridge between Algorithm 2 and optimal control. In fact, an optimal control perspective gives a new look at the formula (8) for learning stable autonomous dynamics.

Inspired by inverse optimal control [8], we relate the formula (8) to an HJI of the form (18).

Theorem 6

For an arbitrary $\hat{f}$ that is locally Lipschitz continuous on $\mathbb{R}^n$ and a $V$ of class $C^1$, the HJI (18) holds for the pair $(q, R)$ given by

(22)

where $\alpha > 0$ is arbitrary. Moreover, $f$ in (8) satisfies the corresponding HJE when $L_{\hat{f}} V(x) + \alpha V(x) \geq 0$.

It follows from (22) that the left-hand side of the HJI (18) is non-positive for arbitrary $\hat{f}$ and $V$. Next, it holds that the projection in (8) is active when $L_{\hat{f}} V(x) + \alpha V(x) \geq 0$, in which case $L_f V(x) = -\alpha V(x)$. Thus, we have the HJE when $L_{\hat{f}} V(x) + \alpha V(x) \geq 0$.

The above theorem and the discussion in the previous subsection imply that when $L_{\hat{f}} V(x) + \alpha V(x) \geq 0$, the correction term in (8) acts as an optimal controller of the cost (21) for $(q, R)$ in (22) and a suitably defined input vector field $g$. That is, the learning formula (8) is optimal in this sense. A similar remark holds for the learning formula (12) of stabilizing control design.

4 Examples

In this section, we illustrate Algorithm 2. As the system dynamics, we consider the following van der Pol oscillator:

$\dot{x}_1 = x_2, \quad \dot{x}_2 = \mu(1 - x_1^2)x_2 - x_1 + u.$    (23)

When $\mu > 0$, this system has a stable limit cycle, and thus the origin is unstable. If the drift vector field $f$ were known, this system would be stabilizable by a suitable specification of $u$.
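For concreteness, a sketch of generating grid training data for (23); the parameter value, the input channel $g(x) = (0, 1)^\top$, and the sampling region are assumptions, since the paper's exact settings are not reproduced here:

```python
import numpy as np

mu = 1.0                                           # assumed parameter value

def drift(x):
    """Drift f(x) of the van der Pol oscillator (23) with u = 0."""
    x1, x2 = x[..., 0], x[..., 1]
    return np.stack([x2, mu * (1 - x1**2) * x2 - x1], axis=-1)

# training data equally distributed on an assumed region [-3, 3]^2
g1, g2 = np.meshgrid(np.linspace(-3, 3, 41), np.linspace(-3, 3, 41))
xs = np.stack([g1.ravel(), g2.ravel()], axis=1)    # states x_s
ys = drift(xs)                                     # targets y_s = f(x_s)
```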

For stabilizing control design, training data points are equally distributed over the sampling region. The parameter $\varepsilon$ of the Lyapunov candidate (5), the function $\rho$ in (12), and the remaining parameter of Algorithm 2 are fixed a priori. For optimization, adaptive moment estimation (Adam) is employed.

Figure 1 shows the phase portrait of the learned dynamics $f$. As shown in Fig. 2, the learned dynamics have a stable limit cycle. Thus, Algorithm 2 preserves a topological property of the true dynamics. Also, we plot the learned Lyapunov function $V$ and controller $\hat{k}$ in Figs. 3 and 4, respectively. It can be confirmed that $V$ is positive definite. This and (13) imply that the learned controller $\hat{k}$ is a stabilizing controller for the learned drift vector field $f$. Since $V$ is a CLF, the Sontag-type controller (3) can also be constructed, which is plotted in Fig. 5.

As confirmed by Fig. 6, the learned controller $\hat{k}$ stabilizes the learned dynamics $f$. We also apply this controller to the true dynamics. According to Fig. 7, the true system is also stabilized. However, the conditions in Theorem 5 do not hold at some of the data points.

Figure 1: Phase portrait of $f$ learned by Algorithm 2
Figure 2: Limit cycle of $f$ learned by Algorithm 2
Figure 3: Lyapunov function $V$ learned by Algorithm 2
Figure 4: Stabilizing controller $\hat{k}$ learned by Algorithm 2

Figure 5: Sontag-type controller constructed from $V$ learned by Algorithm 2
Figure 6: Phase portrait of the closed-loop system $\dot{x} = f(x) + g(x)\hat{k}(x)$
Figure 7: Phase portrait of the true closed-loop system $\dot{x} = f(x) + \Delta(x) + g(x)\hat{k}(x)$

To make the conditions in Theorem 5 hold, we select the parameters of Algorithm 2 differently from above. In this case, the learned $f$, $\hat{k}$, and $V$ satisfy the conditions of Theorem 5. Thus, it is guaranteed that $\hat{k}$ is a stabilizing controller for the true dynamics.

We plot the phase portrait of the learned dynamics in Fig. 8, which looks similar to Fig. 1. Also, the learned dynamics again preserve the limit cycle, as confirmed by Fig. 9. Therefore, the learned drift vector fields are not sensitive with respect to the parameters of Algorithm 2, at least in this example. Next, we plot the learned Lyapunov function and controller in Figs. 10 and 11, respectively. The new $\hat{k}$ in Fig. 11 is much larger than the previous one in Fig. 4. According to Figs. 4 and 11, the reason may be to increase the convergence speed in one of the state directions, which can be attributed to the conservativeness of Theorem 5. Future work includes deriving a less conservative condition for the closed-loop stability of the true dynamics.

Figure 8: Phase portrait of $f$ learned by Algorithm 2 such that the conditions in Theorem 5 hold
Figure 9: Limit cycle of $f$ learned by Algorithm 2 such that the conditions in Theorem 5 hold
Figure 10: Lyapunov function $V$ learned by Algorithm 2 such that the conditions in Theorem 5 hold
Figure 11: Stabilizing controller $\hat{k}$ learned by Algorithm 2 such that the conditions in Theorem 5 hold

5 Conclusion

In this paper, we have developed an algorithm for learning stabilizable dynamics. We have theoretically guaranteed that the learned dynamics are stabilizable by simultaneously learning stabilizing controllers and Lyapunov functions of the closed-loop systems. It is expected that the proposed algorithm can be applied to various control problems, as partly illustrated by H∞-control and optimal control. Furthermore, the proposed method can be extended to learning dynamics that can be made dissipative with respect to an arbitrary supply rate by control design, and to finding control barrier functions for safety control, which will be reported in future publications.

References

  • [1] M. Alsalti, J. Berberich, V. G. Lopez, F. Allgöwer, and M. A. Müller (2021) Data-based system analysis and control of flat nonlinear systems. arXiv:2103.02892. Cited by: §1.
  • [2] B. Amos, L. Xu, and J. Z. Kolter (2017) Input convex neural networks. In International Conference on Machine Learning, pp. 146–155. Cited by: §2.2.
  • [3] Y. Chang, N. Roohi, and S. Gao (2019) Neural Lyapunov control. Advances in Neural Information Processing Systems 32. Cited by: §1, §1.
  • [4] R. Chartrand (2011) Numerical differentiation of noisy, nonsmooth data. International Scholarly Research Notices. Cited by: §2.1.
  • [5] R. T. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud (2018) Neural ordinary differential equations. Advances in Neural Information Processing Systems 31. Cited by: §2.1.
  • [6] H.K. Khalil (2002) Nonlinear systems. third edition, Prentice Hall. Cited by: §2.2.
  • [7] J. Z. Kolter and G. Manek (2019) Learning stable deep dynamics models. Advances in Neural Information Processing Systems 32, pp. 11128–11136. Cited by: §1, §1, §1, §1, §2.2, §2.2, §2.2.
  • [8] M. Krstic, H. Deng, et al. (1998) Stabilization of Nonlinear Uncertain Systems. Springer. Cited by: §1, §2.1, §3.2, §3.3, §3.4.
  • [9] N. Lawrence, P. Loewen, M. Forbes, J. Backstrom, and B. Gopaluni (2020) Almost surely stable deep dynamics. Advances in Neural Information Processing Systems 33, pp. 18942–18953. Cited by: §1.
  • [10] R. Moriyasu, T. Ikeda, S. Kawaguchi, and K. Kashima (2021) Structured Hammerstein-Wiener model learning for model predictive control. IEEE Control Systems Letters 6, pp. 397–402. Cited by: §1.
  • [11] J. G. Rueda-Escobedo and J. Schiffer (2020) Data-driven internal model control of second-order discrete volterra systems. Proc. 59th IEEE Conference on Decision and Control, pp. 4572–4579. Cited by: §1.
  • [12] A. Schlaginhaufen, P. Wenk, A. Krause, and F. Dorfler (2021) Learning stable deep dynamics models for partially observed or delayed dynamical systems. Advances in Neural Information Processing Systems 34. Cited by: §1.
  • [13] E. D. Sontag (1989) A ‘universal’ construction of Artstein’s theorem on nonlinear stabilization. Systems & Control Letters 13 (2), pp. 117–123. Cited by: §1, §2.1.
  • [14] N. Takeishi and Y. Kawahara (2020) Learning dynamics models with stable invariant sets. arXiv:2006.08935. Cited by: §1.
  • [15] J. Urain, M. Ginesi, D. Tateo, and J. Peters (2020) Imitationflow: learning deep stable stochastic dynamic systems by normalizing flows. Proc. 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5231–5237. Cited by: §1.
  • [16] Y. Wang, Q. Gao, and M. Pajic (2020) Deep learning for stable monotone dynamical systems. arXiv:2006.06417. Cited by: §1.