Repeatibility Evaluation for "Reachability of weakly nonlinear systems using Carleman linearization" (RP'21)
In this article we introduce a solution method for a special class of nonlinear initial-value problems using set-based propagation techniques. The novelty of the approach is that we employ a particular embedding (Carleman linearization) to leverage recent advances of high-dimensional reachability solvers for linear ordinary differential equations based on the support function. Using a global error bound for the Carleman linearization abstraction, we are able to describe the full set of behaviors of the system for sets of initial conditions and in dense time.READ FULL TEXT VIEW PDF
Repeatibility Evaluation for "Reachability of weakly nonlinear systems using Carleman linearization" (RP'21)
We consider the problem of solving a system of nonlinear ordinary differential equations (ODEs) for a set of initial states. This is better known as reachability analysis. While for linear systems there exist very efficient algorithms [GirardLGM06, LeGuernicG10, althoff2016combining, BogomolovFFVPS18], reachability analysis for nonlinear systems remains a challenging problem.
Traditional approaches [althoff2020set] include those based on Taylor models [ChenAS13], simulation [Donze10], or hybridization [LiBB20]. In this paper we present a new approach to this problem by transforming the nonlinear system into an infinite-dimensional linear system, which we then truncate. This truncated model approximates the original system.
More specifically, our approach is based on Carleman linearization, which is an established method in mathematical nonlinear control but differs from the above-mentioned approaches. The Taylor-model approach truncates an infinite Taylor polynomial, while we truncate a linear system. Hybridization approaches linearize smalls regions in the state space, while we linearize the whole system.
To achieve good accuracy, the truncation results in a high-dimensional linear system. To solve such systems, we leverage efficient reachability solvers based on the support function that have recently been developed.
Our approach can be used to obtain approximate solutions very quickly, but in an unsound way. Alternatively, using an error estimate, one can obtain a sound overapproximation. Under certain conditions (essentially weak nonlinearity), the error estimate converges, resulting in a precise approximation.
This paper makes the following contributions:
We revisit Carleman linearization and explain how it can be used as a fast but unsound way to propagate sets through a nonlinear dynamical system.
We extend the approach to a sound and practical reachability algorithm for dissipative nonlinear dynamical systems.
We evaluate the algorithm in two case studies and discuss its strengths.
The original idea by Carleman [carleman1932application, kowalski1991nonlinear] did not receive much attention for several decades. Steeb showed that, while the nonlinear system and its infinite-dimensional embedding share the same analytic solutions, the embedding may admit additional non-analytic solutions [Steeb89]. Carleman linearization has since been applied successfully in control theory [germani2005filtering, collado2008modified, mozyrska2008carleman, rauh2009carleman] and physics and chemistry [gaude2001solving, HashemianA15].
Several works provide bounds on the approximation error of the truncated linearized system [forets2017explicit, liu2021efficient]. In this paper we use the error bound derived in [liu2021efficient].
An approach that is related to ours transforms a nonlinear system into a linear or polynomial system via a “change of bases,” using polynomials instead of Kronecker powers, and derives conditions under which this transformation preserves invariants [Sankaranarayanan16].
The next section recalls the mathematical basis used in this paper. Section 3 introduces the classic Carleman linearization. In Section 4 we describe how to propagate sets using Carleman linearization. In Section 5 we extend this approach to a reachability algorithm for dissipative nonlinear dynamical systems. We evaluate the algorithm in Section 6 and conclude in Section 7.
In this section we summarize the mathematical prerequisites to make this paper self-contained. For a detailed derivation of the Carleman linearization procedure we refer to [forets2017explicit].
Let be the set of positive integers and the set of real numbers, and for any we let .
-dimensional vectorsare understood as column vectors with components , . Transposition is written . For any and , denotes the vector -norm of , with notable special cases (Euclidean norm) and (supremum norm). If is a matrix, denotes the matrix norm induced by the vector -norm, with notable special cases
(spectral norm: the largest singular value) and(supremum norm: the maximum absolute row sum). We may abbreviate for . See [horn2012matrix] for precise definitions of these concepts.
For a compact, i.e. bounded and closed set , denotes the maximum of over all . If is polytopic (i.e. admits a representation as the finite intersection of half-spaces), its norm can be computed by a finite number of vector -norm evaluations. Indeed, the map is convex and the maximum of a convex function over a polytope is attained at one of its vertices. However, computing the vertex representation of a polytope initially given by its half-space representation can be computationally expensive in dimensions higher than two (see [kaibel2003]). A simpler rule applies if is hyperrectangular (i.e., can be represented as an axis-aligned box with center and radius vector ). Then where is diagonal with matrix elements if and otherwise, . We write for the -dimensional infinity-norm ball with radius centered in the origin. The projection of a set to the first dimensions is denoted by .
A standard approach to operate with compact and convex sets in is to use the support function [LeGuernic09]. The support function of along direction , , is the maximum of over all . In particular, if
is a polytope in half-space representation, its support function can be computed by solving a linear program (LP), and for certain classes of sets analytic formulas exist, which can be numerically evaluated in an efficient way. Such cases include hyperrectangular sets. Since the support function distributes over Minkowski sums, i.e.for any pair of sets and , and since it holds that for any matrix , the support function has been successfully applied to solve linear set-based recurrences of the form , either explicitly or implicitly by solving the recurrence only along a predefined number of directions [LeGuernicG10, FrehseGDCRLRGDM11, BogomolovFFVPS18]. It is well-known that such recurrences are prevalent in reachability analysis of linear initial-value problems (IVPs), or nonlinear ones after some form of conservative linearization; see for example [althoff2020set] and references therein.
For any pair of vectors , , their Kronecker product is , and the dimension is . This product is not commutative. For matrices the definition is analogous: if and , then and
The Kronecker power of is a convenient notation to express all possible products of elements of a vector up to a given order:
Note that , and each component of is of the form for some multi-index , . For example, if , the first two Kronecker powers are and . Further properties of Kronecker products can be found in [zhang2011matrix] and [steeb2011matrix].
In this section we recall the classic Carleman linearization approach [carleman1932application, kowalski1991nonlinear].
Polynomial differential equations are an important class of nonlinear systems , , such that the coordinate functions are multivariate polynomials. Many systems can be rewritten as polynomial vector fields by introducing auxiliary variables, and any polynomial system is formally equivalent to a second-order system, possibly in higher dimensions – for a proof of this statement and an algorithm to compute such transformation see [forets2017explicit]. We can thus focus on quadratic ODEs without loss of generality. Consider the IVP for an -dimensional quadratic ODE,
with initial condition . Each , , is a function of over the interval where is the time horizon. We assume that the matrices and are independent of . Intuitively, (resp. ) is associated with the linear (resp. nonlinear) behavior of the dynamical system; thus being small corresponds to weak nonlinearity – a concept we will use in a later section.
The Carleman linearization (or Carleman embedding) procedure begins by introducing a sequence of auxiliary variables , . Differentiating such variables with respect to time, and repeatedly substituting (1) into the derivatives of each gives a formal equivalence with an infinite-dimensional linear system of ODEs [kowalski1991nonlinear]. Truncation to order leads to a finite linear IVP in the lifted variables , namely
with initial condition and coefficients matrix , which has the bi-diagonal block structure
where the and , which we call transfer matrices, have dimensions and respectively, and are defined by the formula
for all and where is either ( is placed on the main diagonal) or ( is placed on the upper diagonal) and where
is the identity matrix of order. Note also that and . The dimension of (2) is .
We illustrate the concepts described above in the simplest possible scenario. Consider the logistic equation (a special case of (1) for )
This equation and related generalizations arise naturally in the context of population dynamics, where controls the initial rate of exponential growth, and is the asymptotic equilibrium (the other equilibrium being ). We transform (4) into the canonical scalar form (1), namely , via and . Defining the auxiliary variables , , we see that their first-order derivatives satisfy , , etc. Hence the nonlinear ODE (4) is equivalent to the (infinite) linear ODE
If we now fix the truncation order , say, to , we obtain
To estimate the quality of the approximation by finite truncation, we plot the solutions of (4) from over a time horizon of and also plot the solution of (2) for several choices of in Figure ((a))(a). The model parameters are and . As can be seen, for increasing the solutions converge to the analytic solution, which in this case is known and given as
Solving Eq. (2) requires computing the matrix exponential acting on the initial states at all times, which may be expensive for higher-dimensional systems. In the next section we introduce a method to propagate sets of initial conditions in dense time making use of the particular structure of the matrix (3) using support function techniques. Theoretical estimates of the truncation error are considered in Section 5.
In the previous section we saw how to transform a nonlinear IVP into an approximate linear IVP by Carleman linearization and truncation at a chosen order . In this section we describe how this approach generalizes to IVPs whose initial condition is a set of states described by a hyperrectangle. This is a common case, and hyperrectangular approximations can be computed efficiently. We need to discuss two steps: how to transform to the linear system and how to propagate sets of states for a linear IVP.
For the transformation of we generalize the Kronecker product to sets with . For a hyperrectangle we approximate by applying the rules of interval arithmetic to each dimension [moore2009introduction]. We note that one needs to carefully arrange the variables in order to obtain a tight solution. The arrangement consists of grouping the same variables of each monomial; for example, is evaluated using interval arithmetic as to avoid the dependency problem. To illustrate, consider the extension of the example in Section 2.3 to the hyperrectangle . Then and .
There exist many algorithms to propagate a set through an IVP in a conservative way, i.e., the result overapproximates the true solution, in particular for linear IVPs [Girard05, GirardLGM06, HanK06, LeGuernicG10, kaynama2011complexity, althoff2016combining, bak2017simulation, BogomolovFFVPS18, althoff2020set]. Most of these approaches first discretize the continuous-time system, for which the error can be made arbitrarily small by choosing a small discretization step , and then propagate the sets in discrete time, which in certain cases can be done in an error-free way. We refer the reader to the above works for details about the discretization. Below we explain the second step because it is relevant for the later discussion.
Given a discretized linear IVP with discretized matrix and discretized initial condition ,
the set of reachable states is described by the flowpipe where the is the reach set for the time span . In other words, a flowpipe is a sequence of reach sets given by the matrix powers of applied to . This computation scales to systems with hundreds of dimensions.
Consider again the logistic system. In Figure ((b))(b) we plot the flowpipes obtained for the different truncated approximations with an initial condition .
In this section we discuss an error estimation that allows us to obtain a sound overapproximation of the states reachable by the original nonlinear system.
We have yet to determine how the solutions of the truncated linear IVP (2) are related to those of the original nonlinear IVP (1). To formulate this relation precisely, we introduce some notation. The error of the -th block of variables is defined as , which is the difference between the Kronecker power of the solution of (1) and the projection of the solution of (2) onto the corresponding block of variables of the lifted . We are mostly interested in the first block, i.e., , since , and the truncation error corresponds to upper bounding the quantity for some error function to be determined. Ideally, for fixed the error function should decrease sufficiently fast for increasing order , so we can use low orders in practice, typically 2 to 6. In [forets2017explicit] the authors derived explicit error bounds for the linearization, i.e., a function that only depends on the initial condition and the norms of the matrices and . However, that approach is too conservative since diverges in finite time – even in cases when the solution of the linearized system (2) is converging.
Crucial to the present article, the authors in [liu2021efficient] discovered that, by imposing an assumption on the class of quadratic problems considered, an arbitrary-time and exponentially convergent error formula holds. There are two main assumptions: 1) linear terms dominate over nonlinear ones (weak nonlinearity) and 2) nonlinear effects play a prominent role during a finite time span, after which only linear terms matter (linear dissipation
). These definitions are formalized below. In the following we assume that the eigenvalues ofin (1) are sorted (counting multiplicities) such that , where is the real part of .
System (1) is said to be weakly nonlinear if the ratio
System (1) is said to be dissipative if (i.e., the real part of all eigenvalues is negative).
The conditions and ensure arbitrary-time convergence.
The interesting aspect of (6) is that we can enclose all possible behaviors of a nonlinear problem for a hyperrectangular initial condition in two steps: first, propagating the solutions of the high-dimensional linear system (2) forward in time using a suitable linear reachability technique; in a second step, enlarging the solution (a sequence of reach sets with associated time span for some ) by taking the Minkowski sum with a ball of radius where is the interval-arithmetic evaluation of . Moreover, the truncation error converges to zero for increasing and, as we will see in the experiments, typical values of do not have to be prohibitively large to obtain reasonable approximation bounds.
Given a flowpipe, consider any -dimensional reach set , , and its associated time span . Let be the true set of reachable states in the time span and as defined above. Then we have .
This allows us to present a sound reachability method as shown in Algorithm 1. Crucially, we see that in Line 1 we only require the reach sets in the first dimensions. Thus it suffices to compute these sets in a “sparse” way in Line 1. We can use an algorithm based on the support function for to achieve that. In our implementation we use the algorithm from [LeGuernicG10], which takes as input a set of direction vectors in which the reach sets are evaluated.
For dissipative systems, while the solution of the linear system may converge to zero, the corrected term including the error estimate may not. This observation leads to the idea of reevaluating the error estimate after some time , since for fixed and , a decreasing leads to a smaller value which, in turn, reduces the error estimate . This is, however, nontrivial because by the time one reevaluates, the past error estimate must be taken into account and thus the new state estimate at may already be too pessimistic. In the evaluation we apply such a reevaluation manually.
In this section we study two models that have also been used in [liu2021efficient], but we repeat them here to make this article self-contained. In the first model we evaluate all aspects outlined in the present article including the error bounds. In the second model we demonstrate that even if the assumptions for the error bounds do not apply, we can still obtain solutions of useful accuracy.
For comparison we compute an overapproximation of the reachable states for the original nonlinear systems using a Taylor-model (TM) approach implemented in JuliaReach [BogomolovFFPS19, TaylorModels.jl, TaylorIntegration.jl, TaylorSeriesJOSS], with the default parameters (Taylor polynomials with spatial and temporal expansions of orders and respectively), which generally has high precision. To evaluate the flowpipe for the linear system in Algorithm 1, we use the directions for , where is the unit vector in dimension , which corresponds to the outer hyperrectangular approximation of . Note that the number of directions, , is independent of the truncation order . Interval-arithmetic computations are performed using the Julia library IntervalArithmetic.jl [IntervalArithmetic.jl], and for set-based computations we use LazySets.jl [LazySets.jl]. The code and scripts to run these problems is available online. 111github.com/JuliaReach/RP21_RE
There exist several widely used models of population dynamics that generalize the logistic model from Section 3 [brauer2012mathematical]. We consider the popular SEIR epidemic model with data on the early spread of the COVID-19 disease from [seir20]. A population is divided into four compartments: susceptible (), exposed (), infectious (), and recovered (). An individual is initially susceptible and becomes exposed/infected with rate . The latent time before an exposed individual becomes infectious themselves is . Finally, an infectious individual recovers after time . New individuals are added to the population with rate . We also consider a vaccination with rate [ZamanKJ08]. The system of ODEs is:
In this model we assume that remains constant, so we need not model . The corresponding matrices thus simplify to
Since is triangular, . We also have and . Thus we can estimate
The time scale is measured in days. We use the same parameters as in [liu2021efficient]: a population of , is small (here: ), hence the constant term is disregarded in the analysis, , , days, and days. We choose , which results in and and thus Theorem 5.1 is applicable.
|SEIR model||Burgers model|
|no error bound||incl. error bound||initial point||initial set|
|Carleman||: s||: s||: s||: s|
|: s||: s|
The analysis results without and with conservative error estimate are plotted in Figure 2, where we used the discretization step . We can see that the non-conservative Carleman approximation is precise even for the small value . However, the error estimate is too conservative for such small value of thus it is not plotted. However, using the error estimate improves significantly, but only until around time ; this is due to the large values in . At we reevaluate the estimate. Since the state and with it the norm has changed, the new error estimate is more optimistic and converges quickly. The run times are given in Table 1.
We study a model arising from the discretization of a partial differential equation (PDE). Consider the viscous Burgers equation to model convective flow[Burgers48]
We use the following model parameters: viscosity , domain length , and . We consider the initial condition on the domain and Dirichlet conditions at the boundaries. We distribute this initial condition to a set by keeping the end points fixed and enlarging the initial point to some width . For the PDE discretization we use central differences obtaining the coupled differential equations
We use points and . Eq. (7) has the form of (1) that we need to apply Carleman linearization. We obtain but , i.e., as defined in Eq. 5 is not smaller than one. Although the theoretical error bounds from Theorem 5.1 are not applicable here, it is interesting to observe that the set-based solution is reasonably accurate with respect to the solution obtained for the original nonlinear system. In Figure 3 we plot the results at . For the linear reachability algorithm we used the step size . We can see that we still obtain good approximations that decrease exponentially by incrementing the truncation . The run times are given in Table 1.
In this paper we have presented a reachability method that abstracts nonlinear terms into a higher-dimensional space such that the evolution is approximately linear. The main advantage of the method is that we can leverage recent set propagation techniques that are specialized to high-dimensional linear ODEs. However, the method does not apply to general nonlinear systems but requires weak nonlinearity, i.e., the relative norm of the nonlinear term should be smaller than that of the linear term. Under such limitations, the presented method outperforms other reachability methods because linear reachability in high dimension can be solved efficiently.
This work can be extended in several ways. First, we can consider time-dependent terms; an error bound is derived in [liu2021efficient]. Second, in our experimental evaluation we observed that manually reevaluating the error bound can improve the precision if the norm of the states shrinks (which should generally happen for dissipative systems). It would be interesting to automate this process. Third, the reachability analysis can be accelerated, e.g., using Krylov methods to work more efficiently in high dimensions. A more challenging direction is to devise a new reachability algorithm that exploits the structure of the linearized system.
The first author acknowledges fruitful discussions with Amaury Pouly and Goran Frehse. We are grateful to the anonymous reviewers for helpful comments.