# Distance problems for dissipative Hamiltonian systems and related matrix polynomials

We study the characterization of several distance problems for linear differential-algebraic systems with dissipative Hamiltonian structure. Since all models are only approximations of reality and data are always inaccurate, it is an important question whether a given model is close to a 'bad' model that could be considered ill-posed or singular. This is usually done by computing a distance to the nearest model with such properties. We will discuss the distance to singularity and the distance to the nearest high index problem for dissipative Hamiltonian systems. While for general unstructured differential-algebraic systems the characterization of these distances is still partially an open problem, we will show that for dissipative Hamiltonian systems and related matrix polynomials there exist explicit characterizations that can be implemented numerically.


## 1 Introduction

We study several distance problems for linear systems of differential-algebraic equations (DAEs) of the form

 E˙x=(J−R)Qx, (1)

with constant coefficient matrices E, J, R, Q ∈ ℝn,n and a differentiable state function x, see also [3, 16, 22, 30, 31, 32, 33, 34] for definitions and a detailed analysis of such systems in different generality and their relation to the more general port-Hamiltonian systems. A system of the above form is called a linear time-invariant dissipative Hamiltonian (dH) differential-algebraic equation (dHDAE) system if

 E⊤Q≥0,J=−J⊤,R=R⊤≥0, (2)

where ⊤ denotes the transpose of a matrix, and for a symmetric matrix A we write A > 0 (A ≥ 0) to denote that A is positive definite (positive semidefinite). Such dHDAE systems generalize linear time-invariant ordinary dissipative Hamiltonian systems (the case where E is the identity) and linear Hamiltonian systems, the case that E = I and R = 0. The associated quadratic Hamiltonian is given by H(x) = ½ x⊤E⊤Qx and satisfies the dissipation inequality d/dt H(x(t)) ≤ 0 along solutions x. In many applications the matrix Q can be chosen to be the identity matrix, i.e., Q = I, see [3, 16, 31, 32, 35], and this is the case that we study in this paper. In Section 6.3 we analyze how the general case (1) can be transformed to this situation.

The system properties of (1) (with Q = I) can be analyzed by investigating the corresponding dH matrix pencil

 L(λ):=λE−(J−R). (3)

In our analysis we focus on systems with real coefficients. Some of our results can also easily be extended to the case of complex coefficients, but in some occasions we make explicit or implicit use of the fact that skew-symmetric matrices have a zero diagonal which is not true for skew-Hermitian matrices.

In many practical cases, see, e.g., [3, 20], the underlying system is of second order form with the underlying quadratic matrix polynomial

 P(λ):=λ2M−λ(G−D)+K (4)

where M, G, D, K ∈ ℝn,n satisfy M = M⊤ ≥ 0, D = D⊤ ≥ 0, K = K⊤ ≥ 0, and G = −G⊤. As we will see below, this can be viewed as a generalization of the dH structure to second order systems. The second order case can easily be rewritten in first order dH form, but we will treat the problem directly in second order form, and we will also discuss appropriate higher degree matrix polynomials with an analogous structure.

Linear time-invariant systems with the described structure are very common in all areas of science and engineering [3, 31, 35] and typically arise via linearization around a stationary solution. However, since all mathematical models of physical systems are usually only approximations of reality and data are typically inaccurate, it is an important question whether a given model is close to a model with 'bad properties', such as an ill-posed model without solution or with a non-unique solution. Answering such questions for dH and port-Hamiltonian systems has been an important research topic in recent years, see, e.g., [1, 2, 16, 17, 18, 27, 28].

To classify whether a model is close to a 'bad model', one usually computes the distance to the nearest model with the 'bad' property. In this paper we will discuss the distance to the set of singular matrix polynomials, i.e., those with a determinant that is identically zero, and the distance to the nearest high-index problem, i.e., a problem with Jordan blocks of size bigger than one associated with the eigenvalue ∞.

While for general unstructured DAE systems the characterization of these two distances is very difficult and partially open [4, 5, 7, 21, 29], the picture changes if one considers structured distances, i.e., distances within the set of linear constant coefficient dHDAE systems. In this paper, we will make use of previous results from [16, 30] to derive explicit characterizations for computing these distances in terms of null-spaces of several matrices.

We use the following notation. By ∥A∥F we denote the Frobenius norm of a (possibly rectangular) matrix A; we extend this norm to matrix polynomials by setting ∥∑ki=0 λiAi∥2F := ∑ki=0 ∥Ai∥2F. By λmin(A) we denote the smallest eigenvalue of a positive semidefinite matrix A.

The paper is organized as follows. In Section 2 we recall a few basic results about linear time-invariant dH systems. In Section 3 we present the different distances and state the main results for first order systems. Instead of immediately presenting the corresponding proofs, we first consider related distance problems for a more general polynomial structure in Section 4. These distance characterizations are then specialized in Section 5 to prove the main results for the first order case. In Section 6 we consider corresponding distances for analogous quadratic matrix polynomials and also show how different representations of the first order case can be related.

## 2 Preliminaries

We will make use of the Kronecker canonical form of a matrix pencil λE−A. Let us denote by Jρ(λ0) the standard upper triangular Jordan block of size ρ×ρ associated with the eigenvalue λ0 and set Jλ0ρ := λIρ−Jρ(λ0), let Nσ := λJσ(0)−Iσ denote the block of size σ×σ associated with the eigenvalue ∞, and let Lϵ denote the standard right Kronecker block of size ϵ×(ϵ+1), i.e., Lϵ := λ[Iϵ 0]−[0 Iϵ].

###### Theorem 1 (Kronecker canonical form)

Let E, A ∈ ℂm,n. Then there exist nonsingular matrices S ∈ ℂm,m and T ∈ ℂn,n such that

 S(λE−A)T=\rm diag(Lϵ1,…,Lϵp,L⊤η1,…,L⊤ηq,Jλ1ρ1,…,Jλrρr,Nσ1,…,Nσs), (5)

where p, q, r, s ∈ ℕ0 and ϵ1,…,ϵp, η1,…,ηq ∈ ℕ0, as well as λi ∈ ℂ and ρi ∈ ℕ for i = 1,…,r and σj ∈ ℕ for j = 1,…,s. This form is unique up to permutation of the blocks.

For real matrices (the case we discuss), a real version of the Kronecker canonical form is obtained under real transformation matrices S and T. In this case the blocks Jλiρi with λi ∈ ℂ∖ℝ have to be replaced with corresponding blocks in real Jordan canonical form associated with the corresponding pair of complex conjugate eigenvalues, but the other blocks have the same structure as in the complex case. An eigenvalue is called semisimple if the largest associated Jordan block has size one.

The sizes ϵi and ηj of the rectangular blocks are called the right and left minimal indices of λE−A, respectively. The matrix pencil λE−A, E, A ∈ ℂm,n, is called regular if m = n and det(λ0E−A) ≠ 0 for some λ0 ∈ ℂ; otherwise it is called singular. A pencil is singular if and only if it has blocks of at least one of the types Lϵ or L⊤η in the Kronecker canonical form.

The values λ1,…,λr ∈ ℂ are called the finite eigenvalues of λE−A. If s > 0, then λ = ∞ is said to be an eigenvalue of λE−A. (Equivalently, zero is then an eigenvalue of the reversal λA−E of the pencil.) The sum of the sizes of all blocks that are associated with a fixed eigenvalue λ0 is called the algebraic multiplicity of λ0. The size ν of the largest block Nσ is called the index of the pencil λE−A, where, by convention, ν = 0 if E is invertible. The pencil λE−A is called stable if it is regular, all its eigenvalues are in the closed left half plane, and those lying on the imaginary axis (including infinity) have largest associated block of size at most one. Otherwise the pencil is called unstable.

The following result was shown in [30]. We state the result in full generality, but clearly all statements also hold for the special case that the coefficient matrices are real and that Q = I, which is the case considered in this paper.

###### Theorem 2

Let E, Q ∈ ℂn,n satisfy E∗Q = Q∗E ≥ 0 and let all left minimal indices of the pencil λE−Q be equal to zero (if there are any). Furthermore, let J, R ∈ ℂn,n be such that J∗ = −J and R∗ = R ≥ 0. Then the following statements hold for the pencil L(λ) = λE−(J−R)Q.

1. If λ0 ∈ ℂ is an eigenvalue of L(λ) then Re(λ0) ≤ 0.

2. If ω ∈ ℝ and λ0 = iω is an eigenvalue of L(λ), then λ0 is semisimple. Moreover, if the columns of V form a basis of a regular deflating subspace of L(λ) associated with λ0, then RQV = 0.

If, additionally, Q is nonsingular then the previous statement holds for the eigenvalue ∞ as well. If Q is singular then ∞ need not be semisimple, but if L(λ) is regular, then Jordan blocks associated with ∞ have size at most two.

3. The index of L(λ) is at most two.

4. All right minimal indices of L(λ) are at most one (if there are any).

5. If in addition λE−Q is regular, then all left minimal indices of L(λ) are zero (if there are any).

Proof. For the proof see [30]. The additional statement in (ii) on the eigenvalue ∞ was not presented in [30], but it follows in a straightforward manner from [30, Theorem 6.1] and the proof of [30, Corollary 6.2]. □

Theorem 2 illustrates that the special structure of dH systems imposes many restrictions on the spectral data, and this is also an advantage when determining the distances to the nearest 'bad' problem. In particular, Theorem 2 implies that the distance to instability and the distance to higher index coincide for a pencil λE−(J−R)Q with Q nonsingular.

The following well-known lemma, see [6, 23] (also stated for the general complex case), will be needed in order to make statements about the index of a matrix pencil in special situations.

###### Lemma 3

Let E, A ∈ ℝn,n be matrices of the form

 E=[E11000]andA=[A11A12A21A22],

where E11 is invertible.

1. If A22 is invertible, then the pencil λE−A is regular and has index one;

2. if A22 is singular, then the pencil λE−A is singular or has an index greater than or equal to two.
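Lemma 3 can be illustrated numerically. The following sketch (our own illustration with arbitrary matrix entries, not taken from the paper) recovers the coefficients of det(λE−A) by interpolation and shows both cases for a 2×2 pencil with invertible E11:

```python
import numpy as np

E = np.array([[1.0, 0.0], [0.0, 0.0]])   # E11 = [1] invertible, second block zero

def pencil_det_coeffs(E, A, deg=2):
    """Coefficients of det(lam*E - A), highest power first, via interpolation."""
    pts = np.arange(deg + 1, dtype=float)
    vals = [np.linalg.det(p * E - A) for p in pts]
    return np.polyfit(pts, vals, deg)

# Case 1: A22 = 3 invertible -> det(lam*E - A1) = -3*lam + 5 has degree 1,
# so the pencil is regular (and it has index one)
A1 = np.array([[2.0, 1.0], [1.0, 3.0]])
c = pencil_det_coeffs(E, A1)

# Case 2: A22 = 0 singular -> det(lam*E - A2) = -1 is a nonzero constant,
# so the pencil is still regular here, but all its eigenvalues are at infinity
# and it has index two
A2 = np.array([[2.0, 1.0], [1.0, 0.0]])
c2 = pencil_det_coeffs(E, A2)
```

Since three interpolation points determine a quadratic exactly, the fitted coefficients are exact up to rounding.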

## 3 Problem statement and main results for dHDAE systems

We are interested in the following distance problems for matrix pencils of the form (3) under perturbations that preserve the special structure of the pencil.

###### Definition 4

Let 𝕃 denote the class of square real matrix pencils of the form (3), i.e., with E = E⊤ ≥ 0, J = −J⊤, and R = R⊤ ≥ 0. Then

1. the structured distance to singularity is defined as

 dLsing(L(λ)):=inf{∥∥ΔL(λ)∥∥F ∣∣ L(λ)+ΔL(λ)∈L and is singular}; (6)
2. the structured distance to the nearest high-index problem is defined as

 dLhi(L(λ)):=inf{∥∥ΔL(λ)∥∥F ∣∣ L(λ)+ΔL(λ)∈L and is of index≥2}; (7)
3. the structured distance to instability is defined as

 dLinst(L(λ)):=inf{∥∥ΔL(λ)∥∥F ∣∣ L(λ)+ΔL(λ)∈L and is unstable}. (8)

Note that all defined distances are meaningful, as for each matrix the decomposition into a sum of a skew-symmetric matrix and a symmetric matrix is unique. Furthermore, we have ∥ΔJ−ΔR∥2F = ∥ΔJ∥2F + ∥ΔR∥2F due to the trace of ΔJ⊤ΔR being zero. Thus, the constraint in (6)–(8) is the same as writing

 ΔL(λ)=λΔE−(ΔJ−ΔR),

with ΔE = ΔE⊤, ΔJ = −ΔJ⊤ and ΔR = ΔR⊤ such that E+ΔE ≥ 0 and R+ΔR ≥ 0, and we have ∥ΔL(λ)∥2F = ∥ΔE∥2F + ∥ΔJ∥2F + ∥ΔR∥2F. The positivity conditions for E+ΔE and R+ΔR are crucial. Examples presented in Section 5.2 show that they can neither be omitted nor relaxed to the perturbed matrices being merely symmetric.
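The unique splitting into skew-symmetric and symmetric parts, and the resulting orthogonal splitting of the Frobenius norm, can be sketched as follows (a generic random matrix is used purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

J = 0.5 * (A - A.T)        # skew-symmetric part
S = 0.5 * (A + A.T)        # symmetric part; A = J + S is the unique such splitting

# J and S are orthogonal in the trace inner product, so the squared norms add up
ip = np.trace(J.T @ S)
nA2 = np.linalg.norm(A, "fro") ** 2
nJ2 = np.linalg.norm(J, "fro") ** 2
nS2 = np.linalg.norm(S, "fro") ** 2
```

This is exactly why the structured perturbation norm in (6)–(8) decomposes into the separate contributions of ΔE, ΔJ, and ΔR.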

###### Theorem 5

Let . Then the following statements hold.

1. The pencil L(λ) is singular if and only if ker E ∩ ker J ∩ ker R ≠ {0}. In that case there exists an orthogonal transformation matrix U ∈ ℝn,n such that

 U⊤EU=[E11000],U⊤JU=[J11000],U⊤RU=[R11000],

where the pencil λE11−(J11−R11) is regular and has size r×r with r < n. In particular, all right and left minimal indices of L(λ) in its Kronecker canonical form are zero.

2. The index of L(λ) is at most two. Furthermore, the following statements are equivalent.

1. For any ε > 0 there exists a pencil ˆL(λ) = λˆE−(ˆJ−ˆR) with ˆE = ˆE⊤ ≥ 0, ˆJ = −ˆJ⊤ and ˆR = ˆR⊤ ≥ 0 which is regular and of index two such that

 ∥L(λ)−ˆL(λ)∥F<ε, (9)

i.e., L(λ) is in the closure of the set of regular dH pencils of index two.

2. ker E ∩ ker R ≠ {0}.

To construct the perturbations for which the distance to singularity is achieved, we use the following ansatz. For a matrix Y ∈ ℝn,n and a vector u ∈ ℝn with ∥u∥ = 1 we define the matrix

 ΔuY=−uu⊤Y−Yuu⊤+uu⊤Yuu⊤, (10)

which will be used on several occasions throughout the paper. Then we obtain the following characterization of the distance to singularity.
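A quick numerical check of the ansatz (10) (with a generic positive semidefinite test matrix; our own illustration): the perturbed matrix satisfies Y + ΔuY = (I − uu⊤)Y(I − uu⊤), so it maps u to zero and stays positive semidefinite whenever Y is.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
u = rng.standard_normal(n)
u /= np.linalg.norm(u)                     # unit vector

B = rng.standard_normal((n, n))
Y = B @ B.T                                # symmetric positive semidefinite test matrix

P = np.outer(u, u)                         # rank-one projector uu^T
Delta = -P @ Y - Y @ P + P @ Y @ P         # the perturbation (10)

Yp = Y + Delta
# Yp @ u = 0: u has been moved into the kernel,
# and Yp = (I - P) Y (I - P) remains symmetric positive semidefinite
```

The identity Y + ΔuY = (I − uu⊤)Y(I − uu⊤) follows by expanding the product; it makes both the kernel property and the semidefiniteness preservation immediate.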

###### Theorem 6

Let . Then the following statements hold.

1. The distance to singularity (6) is attained with a perturbation ΔE = ΔuE, ΔJ = ΔuJ, and ΔR = ΔuR as in (10) for some u ∈ ℝn with ∥u∥ = 1. The distance is given by

 dLsing(λE−(J−R))=minu∈ℝn,∥u∥=1√(2∥Ju∥2+2∥(I−uu⊤)Eu∥2+(u⊤Eu)2+2∥(I−uu⊤)Ru∥2+(u⊤Ru)2)

and is bounded as

 √(λmin(−J2+R2+E2)) ≤ dLsing(λE−(J−R)) ≤ √(2⋅λmin(−J2+R2+E2)). (11)
2. The distance to higher index (7) and the distance to instability (8) coincide and satisfy

 dLhi(λE−(J−R))=dLinst(λE−(J−R))=minu∈ℝn,∥u∥=1√(2∥(I−uu⊤)Eu∥2+(u⊤Eu)2+2∥(I−uu⊤)Ru∥2+(u⊤Ru)2)

and are bounded as

 √(λmin(E2+R2)) ≤ dLhi(λE−(J−R)) = dLinst(λE−(J−R)) ≤ √(2⋅λmin(E2+R2)).

The proofs of Theorems 5 and 6 are given in Section 5.1, where they are obtained as simple consequences of a general theory developed in Section 4.2 for matrix polynomials with a special symmetry structure. Before we give the proofs, we will first consider a more general minimization problem in the next section.
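As a numerical sanity check, the quantities appearing in Theorem 6 can be evaluated directly. The sketch below (random data and a crude sampling search over the unit sphere; our own illustration, not a serious optimization method) compares the structured perturbation norm for the distance to singularity with the bounds (11):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
# a random dH pencil lam*E - (J - R): E, R symmetric psd, J skew-symmetric
B1, B2, B3 = rng.standard_normal((3, n, n))
E = B1 @ B1.T
R = B2 @ B2.T
J = 0.5 * (B3 - B3.T)

def sing_objective(u):
    """Norm of the structured perturbation built from (10) for a unit vector u."""
    Pu = np.outer(u, u)
    val = 2 * np.linalg.norm(J @ u) ** 2
    for X in (E, R):
        val += 2 * np.linalg.norm((np.eye(n) - Pu) @ X @ u) ** 2 + (u @ X @ u) ** 2
    return np.sqrt(val)

M = -J @ J + R @ R + E @ E                 # the matrix appearing in the bounds (11)
evals, evecs = np.linalg.eigh(M)
lower, upper = np.sqrt(evals[0]), np.sqrt(2 * evals[0])

# crude search over the unit sphere, seeded with the eigenvector for lambda_min
cands = rng.standard_normal((5000, n))
cands /= np.linalg.norm(cands, axis=1, keepdims=True)
d_est = min(min(sing_objective(u) for u in cands), sing_objective(evecs[:, 0]))
```

By construction every sampled objective value is at least the lower bound, and evaluating at the eigenvector for λmin guarantees the estimate also stays below the upper bound.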

## 4 General distance problems

In this section, we present a solution to a quite general minimization problem. This will allow us to solve the distance problems for dH pencils introduced in Section 3 as well as analogous problems for structured matrix polynomials with a dH like structure in a unified manner.

Theorem 5 shows that both singularity and higher index for a dH pencil as in (3) can be expressed via the existence of a common kernel of two or three structured matrices, so that both distance problems can be reinterpreted as distance problems to the common kernel of matrices with symmetry and positivity structures. This concept will now be extended to more than three matrices.

### 4.1 Distance to the common kernel of a tuple of structured matrices

###### Definition 7

Let Snℓ denote the following set of (ℓ+2)-tuples of real matrices

 Snℓ:={(J,X0,…,Xℓ) ∣ J⊤=−J, X⊤i=Xi≥0∈ℝn,n, i=0,…,ℓ},

where n ∈ ℕ and ℓ ∈ ℕ0 are fixed. For a given tuple (J,X0,…,Xℓ) ∈ Snℓ we define the structured distance to the common kernel dSker(J,X0,…,Xℓ) as

 inf{∥∥[ΔJ,ΔX0,…,ΔXℓ]∥∥F∣∣ (J+ΔJ,X0+ΔX0,…,Xℓ+ΔXℓ)∈Snℓ,ker(J+ΔJ)∩⋂ℓi=0ker(Xi+ΔXi)≠{0}}. (12)

In the following, we often drop the dependence on n and ℓ in the notation for simplicity, thus writing dSker(J,X0,…,Xℓ).

Observe that in determining dSker we measure the distance to a closed set.

###### Lemma 8

The set of all tuples (J,X0,…,Xℓ) ∈ Snℓ satisfying ker J ∩ ⋂ℓi=0 ker Xi ≠ {0} is a closed subset of Snℓ.

Proof. The proof follows by considering convergent sequences of tuples (J(m),X(m)0,…,X(m)ℓ) in this set and a convergent subsequence of a sequence of unit vectors um satisfying

 J(m)um=X(m)ium=0, i=0,…,ℓ. □

Before we present the solution of the minimization problem, we first develop equivalent conditions for a tuple (J,X0,…,Xℓ) ∈ Snℓ to have a nontrivial common kernel.

###### Proposition 9

Let (J,X0,…,Xℓ) ∈ Snℓ. Then

 ker J∩ker X0∩⋯∩ker Xℓ=ker(J⊤J+X20+⋯+X2ℓ)=ker(−J+X0+⋯+Xℓ). (13)

Furthermore, there exists an orthogonal matrix U ∈ ℝn,n such that

 U⊤JU=[˜J000],U⊤XiU=[˜Xi000],i=0,…,ℓ (14)

with some r ∈ {0,1,…,n}, where ˜J, ˜X0, …, ˜Xℓ ∈ ℝr,r and where the matrix −˜J+˜X0+⋯+˜Xℓ is invertible.

Proof. The inclusion '⊆' in the first equality of (13) is trivial. To prove the converse, let x ∈ ker(J⊤J+X20+⋯+X2ℓ) be nonzero. Since each summand is positive semidefinite, we obtain x⊤J⊤Jx = 0 and x⊤X2ix = 0 for i = 0,…,ℓ. Noting that ker Y2 = ker Y holds for any symmetric or skew-symmetric matrix Y finishes the proof.

The inclusion '⊆' in the second equality is again trivial. To prove the converse, let x ∈ ker(−J+X0+⋯+Xℓ) be nonzero. Since x⊤Jx = 0, we obtain that x⊤(X0+⋯+Xℓ)x = 0, and since each of the matrices Xi is positive semidefinite, we obtain Xix = 0 for i = 0,…,ℓ, which then implies Jx = 0 as well.

To prove the last assertion, let U be an orthogonal matrix whose last n−r columns span the kernel of −J+X0+⋯+Xℓ. Note that if u is one of those last columns of U, then (13) implies that Ju = 0 and Xiu = 0 for i = 0,…,ℓ, which shows the formula (14). □
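Both characterizations in (13) can be tested numerically by building a tuple with a prescribed common kernel. The construction below via an orthogonal matrix U is our own illustration with random data, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(6)
n, r = 6, 4
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
W = U[:, :r]                   # the last n - r columns of U will span the common kernel

B = rng.standard_normal((r, r))
J = W @ (0.5 * (B - B.T)) @ W.T            # skew-symmetric, vanishing on U[:, r:]
C0 = rng.standard_normal((r, r))
C1 = rng.standard_normal((r, r))
X0 = W @ (C0 @ C0.T) @ W.T                 # symmetric psd with the same kernel
X1 = W @ (C1 @ C1.T) @ W.T

M1 = J.T @ J + X0 @ X0 + X1 @ X1           # first characterization in (13)
M2 = -J + X0 + X1                          # second characterization in (13)
K = U[:, r:]                               # the prescribed common kernel
```

Both M1 and M2 annihilate the columns of K, and generically they have rank exactly r, i.e., their kernel is precisely the common kernel of the tuple.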

###### Remark 10

We highlight that the nonnegativity assumption for the matrices Xi is crucial for the two nontrivial inclusions in Proposition 9. For example, consider

 J=[01−10]andX0=[100−1].

Then −J+X0 is singular while the intersection of the kernels of J and X0 is trivial.

Also note that while an arbitrarily large number of symmetric positive semidefinite matrices can be considered, the results from Proposition 9 are no longer true if a second skew-symmetric matrix is involved. For example, consider the matrices

 J1=⎡⎢⎣010−100000⎤⎥⎦andJ2=⎡⎢⎣001000−100⎤⎥⎦

Then J1+J2 is singular (in fact, even the pencil λJ1+J2 is singular), but J1 and J2 do not have a common kernel.
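Both counterexamples from Remark 10 are easy to confirm numerically; a sketch:

```python
import numpy as np

# first example: X0 is symmetric but indefinite, so Proposition 9 fails
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
X0 = np.array([[1.0, 0.0], [0.0, -1.0]])
det_sum = np.linalg.det(-J + X0)          # -J + X0 is singular ...
ranks = (np.linalg.matrix_rank(J), np.linalg.matrix_rank(X0))
# ... yet J and X0 are both invertible, so their kernels intersect trivially

# second example: two skew-symmetric matrices
J1 = np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
J2 = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
# the pencil lam*J1 + J2 is singular for every lam (its second and third rows
# are linearly dependent), while ker J1 = span{e3} and ker J2 = span{e2}
dets = [np.linalg.det(lam * J1 + J2) for lam in (0.0, 0.3, 1.7, -2.5)]
```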

Given a tuple (J,X0,…,Xℓ) ∈ Snℓ as in Proposition 9, we aim to characterize all perturbations that produce a nontrivial common kernel of the perturbed matrices while preserving their individual structures. For this, we will use particular perturbations whose special properties are presented in the following lemma.

###### Lemma 11

Let Y ∈ ℝn,n, let u ∈ ℝn be a vector with ∥u∥ = 1 and let

 ΔuY:=−uu⊤Y−Yuu⊤+uu⊤Yuu⊤. (15)

Then the following statements hold.

1. (Y+ΔuY)u = 0; in particular, Y+ΔuY is singular.

2. rank(ΔuY) ≤ 2, and rank(ΔuY) ≤ 1 if and only if u is a right or left eigenvector of Y.

3. ∥ΔuY∥2F = ∥(I−uu⊤)Yu∥2 + ∥u⊤Y(I−uu⊤)∥2 + (u⊤Yu)2.

4. If Y = Y⊤ ≥ 0, then Y+ΔuY ≥ 0 and ∥ΔuY∥2F = 2∥(I−uu⊤)Yu∥2 + (u⊤Yu)2.

5. If Y = −Y⊤, then ΔuY = −uu⊤Y−Yuu⊤ is skew-symmetric and ∥ΔuY∥2F = 2∥Yu∥2.

Proof. (i) immediately follows from u⊤u = 1. For the proof of (ii) let U be an orthogonal matrix with last column u. Then we obtain

 U⊤(ΔuY)U=−[ 0 Y12 ; Y21 Y22 ], where U⊤YU=[ Y11 Y12 ; Y21 Y22 ], (16)

for some Y12, Y⊤21 ∈ ℝn−1 and Y22 = u⊤Yu ∈ ℝ, which immediately shows that rank(ΔuY) ≤ 2. In particular, we have rank(ΔuY) ≤ 1 if and only if Y12 = 0 or Y21 = 0, which is equivalent to u being a right or left eigenvector of Y, respectively. Moreover, (iii) immediately follows from the representation (16) using that

 (I−uu⊤)Yu=[Y12 0]⊤, u⊤Y(I−uu⊤)=[Y21 0], and Y22=u⊤Yu.

Finally, using the additional (skew-)symmetry structure of Y, we obtain (iv) and (v), where the part Y+ΔuY ≥ 0 in (iv) again follows from the representation (16), since U⊤(Y+ΔuY)U = diag(Y11, 0) and Y11 is a principal submatrix of the positive semidefinite matrix U⊤YU. □

We highlight that the first property of statement (iv) in Lemma 11 will become essential in what follows, because it allows us to make a symmetric positive semidefinite matrix singular by a perturbation that simultaneously preserves positive semidefiniteness. With these preparations, we obtain the following theorem that characterizes structure-preserving perturbations leading to matrices with a nontrivial common kernel.

###### Theorem 12

Let (J,X0,…,Xℓ) ∈ Snℓ, i.e., J⊤ = −J and X⊤i = Xi ≥ 0 for i = 0,…,ℓ. Furthermore, for any u ∈ ℝn with ∥u∥ = 1, consider the perturbation matrices

 ΔuJ:=−uu⊤J−Juu⊤andΔuXi:=−uu⊤Xi−Xiuu⊤+uu⊤Xiuu⊤,i=0,…,ℓ. (17)

Then the following statements hold.

1. For any vector u ∈ ℝn with ∥u∥ = 1, we have

 (ΔuJ)⊤=−ΔuJ,as well as(ΔuXi)⊤=ΔuXiandXi+ΔuXi≥0,i=0,…,ℓ. (18)

Furthermore, the kernels of the matrices J+ΔuJ, X0+ΔuX0, …, Xℓ+ΔuXℓ have a nontrivial intersection.

2. For any vector u ∈ ℝn with ∥u∥ = 1, we have

 ∥∥ΔuJ∥∥2F=2∥Ju∥2,and∥∥ΔuXi∥∥2F=2∥∥(I−uu⊤)Xiu∥∥2+(u⊤Xiu)2,i=0,…,ℓ.
3. Let ΔJ, ΔX0, …, ΔXℓ be any perturbation matrices satisfying

 Δ⊤J=−ΔJas well asΔ⊤Xi=ΔXi,andXi+ΔXi≥0,i=0,…,ℓ, (19)

and such that the kernels of the matrices J+ΔJ, X0+ΔX0, …, Xℓ+ΔXℓ have a nontrivial intersection. Then

 ∥ΔJ∥F≥∥ΔuJ∥F and ∥ΔXi∥F≥∥ΔuXi∥F, i=0,…,ℓ,

for some real vector u ∈ ℝn with ∥u∥ = 1.

Proof. (i) and (ii) follow immediately from Lemma 11. To prove (iii), consider any perturbation matrices ΔJ, ΔX0, …, ΔXℓ satisfying (19) such that the kernels of the matrices J+ΔJ, X0+ΔX0, …, Xℓ+ΔXℓ have a nontrivial intersection. Then by Proposition 9, there exists an orthogonal matrix U ∈ ℝn,n such that

 U⊤(J+ΔJ)U=[ ˜J 0 ; 0 0 ] and U⊤(Xi+ΔXi)U=[ ˜Xi 0 ; 0 0 ], i=0,…,ℓ, (20)

with some ˜J, ˜Xi ∈ ℝn−1,n−1, not necessarily invertible, i.e., in contrast to (14) we split off only one vector from the intersection of the kernels. Transforming and decomposing J and Xi accordingly, we have

 U⊤JU=[˜Kt−t⊤0],andU⊤XiU=[˜Sisis⊤iri],i=0,…,ℓ (21)

for some skew-symmetric matrix ˜K ∈ ℝn−1,n−1, some symmetric matrices ˜Si ∈ ℝn−1,n−1, some t, si ∈ ℝn−1 (i = 0,…,ℓ), and some ri ∈ ℝ. Subtracting (21) from (20), we obtain that

 U⊤(ΔJ)U=[ ˜J−˜K −t ; t⊤ 0 ] and U⊤(ΔXi)U=[ ˜Xi−˜Si −si ; −s⊤i −ri ], i=0,…,ℓ. (22)

Observe that for the particular choice u = Uen (the last column of U) the perturbations from (15) have, by (21), the forms

 U⊤ΔuJU=−[0−tt⊤0]andU⊤ΔuXiU=−[0sis⊤iri],i=0,…,ℓ. (23)

Since the Frobenius norm is invariant under real orthogonal transformations, we immediately obtain from (22) and (23) that ∥ΔJ∥F ≥ ∥ΔuJ∥F and ∥ΔXi∥F ≥ ∥ΔuXi∥F for i = 0,…,ℓ. □
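The norm formulas in statement 2 of Theorem 12 are easy to confirm numerically; here is a sketch with random structured matrices (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 7
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
P = np.outer(u, u)

B = rng.standard_normal((n, n))
Jm = 0.5 * (B - B.T)                      # skew-symmetric
C = rng.standard_normal((n, n))
X = C @ C.T                               # symmetric positive semidefinite

DJ = -P @ Jm - Jm @ P                     # the P @ Jm @ P term vanishes for skew Jm
DX = -P @ X - X @ P + P @ X @ P

lhs_J = np.linalg.norm(DJ, "fro") ** 2
rhs_J = 2 * np.linalg.norm(Jm @ u) ** 2   # = ||Delta_u J||_F^2

lhs_X = np.linalg.norm(DX, "fro") ** 2
rhs_X = 2 * np.linalg.norm((np.eye(n) - P) @ X @ u) ** 2 + (u @ X @ u) ** 2
```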

We now have all ingredients to state and prove the solution of our general minimization problem.

###### Theorem 13

Let (J,X0,…,Xℓ) ∈ Snℓ, i.e., J⊤ = −J and X⊤i = Xi ≥ 0 for i = 0,…,ℓ. Then the structured distance to the common kernel (12) is attained at perturbations ΔuJ, ΔuX0, …, ΔuXℓ as in (17) for some u ∈ ℝn with ∥u∥ = 1. Consequently,

 dSker(J,X0,…,Xℓ)=minu∈ℝn,∥u∥=1√(2∥Ju∥2+∑ℓi=0(2∥(I−uu⊤)Xiu∥2+(u⊤Xiu)2)),

and in addition, we have the bounds

 √(λmin(−J2+X20+⋯+X2ℓ)) ≤ dSker(J,X0,…,Xℓ) ≤ √(2⋅λmin(−J2+X20+⋯+X2ℓ)). (24)

Proof. The first two statements follow directly from Theorem 12. It remains to prove (24). To this end note that for every u ∈ ℝn with ∥u∥ = 1, we have

 λmin(−J2+X20+⋯+X2ℓ) ≤ u⊤(−J2+X20+⋯+X2ℓ)u ≤ 2∥Ju∥2+∑ℓi=0(2∥(I−uu⊤)Xiu∥2+(u⊤Xiu)2) ≤ 2u⊤(−J2+X20+⋯+X2ℓ)u.

Taking the minimum over all u ∈ ℝn with ∥u∥ = 1 and choosing u as a unit eigenvector corresponding to λmin(−J2+X20+⋯+X2ℓ) shows (24).

###### Remark 14

In the special case ℓ = 0 and J = 0 it immediately follows that dSker(0,X0) = λmin(X0). This is in line with Proposition 9, because the singularity of the matrix −J+X0+⋯+Xℓ is equivalent to the existence of a nontrivial common kernel of the matrices J, X0, …, Xℓ.
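For a single symmetric positive semidefinite matrix X0 (the tuple (0, X0)), evaluating the minimization objective from Theorem 13 at a unit eigenvector for λmin(X0) makes the first term vanish and yields exactly λmin(X0), matching the lower bound in (24). A sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
C = rng.standard_normal((n, n))
X0 = C @ C.T                              # symmetric positive semidefinite

lam, V = np.linalg.eigh(X0)               # eigenvalues in ascending order
u = V[:, 0]                               # unit eigenvector for lambda_min(X0)
P = np.outer(u, u)

# objective for the tuple (0, X0) evaluated at u:
# (I - uu^T) X0 u = 0 here, and u^T X0 u = lambda_min(X0)
obj = np.sqrt(2 * np.linalg.norm((np.eye(n) - P) @ X0 @ u) ** 2 + (u @ X0 @ u) ** 2)
```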

### 4.2 Distance problems for structured matrix polynomials

As a first application of the results from Subsection 4.1, we will consider distance problems for a particular class of structured matrix polynomials. To this end, recall that by definition a square matrix polynomial P(λ) = ∑ki=0 λiBi is singular if and only if det P(λ) ≡ 0. Also recall that the companion linearization

 C(λ):=λ diag(Bk,In,…,In)+[ Bk−1 Bk−2 … B0 ; −In 0 … 0 ; ⋱ ; 0 … −In 0 ] (25)

of P(λ) is a strong linearization. In particular, P(λ) is singular if and only if the pencil C(λ) is singular. Furthermore, the linearization preserves the spectral data of P(λ) at the eigenvalue ∞. Therefore, for the sake of simplicity, we define the notions of index and instability for the matrix polynomial P(λ) via the respective notions for the Kronecker canonical form of (25), cf. Section 2. We then extend Definition 4 as follows.
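The strong-linearization property can be checked numerically for k = 2. The sketch below uses one standard first companion form (an assumption on our part; the specific companion form used in (25) may differ by a sign or permutation convention) and verifies that det P(λ) = det C(λ) at a few sample points:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
A2, A1, A0 = rng.standard_normal((3, n, n))

# first companion linearization of P(lam) = lam^2 A2 + lam A1 + A0:
# C(lam) = lam * [[A2, 0], [0, I]] + [[A1, A0], [-I, 0]]
I = np.eye(n)
Z = np.zeros((n, n))
Ec = np.block([[A2, Z], [Z, I]])
Ac = -np.block([[A1, A0], [-I, Z]])       # so that C(lam) = lam*Ec - Ac

def detP(lam):
    return np.linalg.det(lam**2 * A2 + lam * A1 + A0)

def detC(lam):
    return np.linalg.det(lam * Ec - Ac)
```

A Schur-complement argument shows det C(λ) = det P(λ) identically, so in particular P(λ) is singular exactly when the companion pencil is.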

###### Definition 15

Consider the class of matrix polynomials

 Pnk,j:={−λjJ+k∑i=0λiAi∣∣ ∣∣J⊤=−J, A⊤i=Ai≥0∈Rn,n, i=0,…,k},

where k ∈ ℕ, j ∈ {0,…,k}, and, without loss of generality, Ak ≠ 0. Then for P(λ) ∈ Pnk,j

1. the structured distance to singularity is defined as

 dPnk,jsing(P(λ)):=inf{∥ΔP(λ)∥F ∣ P(λ)+ΔP(λ)∈Pnk,j is singular}; (26)
2. the structured distance to the nearest high index problem is defined as

 dPnk,jhi(P(λ)):=inf{∥∥ΔP(λ)∥∥F ∣∣ P(λ)+ΔP(λ)∈Pnk,j is of index≥2}; (27)
3. the structured distance to instability is defined as

 dPnk,jinst(P(λ)):=inf{∥∥ΔP(λ)∥∥F ∣∣ P(λ)+ΔP(λ)∈Pnk,j is unstable}. (28)

We often simply write dPsing, dPhi, and dPinst instead of dPnk,jsing, dPnk,jhi, and dPnk,jinst for brevity.

In other words, Pnk,j consists of the matrix polynomials of degree at most k for which all coefficients are symmetric positive semidefinite, except for the coefficient at λj, which is only assumed to have a positive semidefinite symmetric part. Particular examples of this kind of matrix polynomials are the dH pencils of the form (3), i.e., the set Pn1,0, and quadratic matrix polynomials of the form (4), i.e., the set Pn2,1. Observe that if both P(λ), P(λ)+ΔP(λ) ∈ Pnk,j then ΔP(λ) must take the form

 ΔP(λ)=−λjΔJ+k∑i=0λiΔAi,

where ΔJ⊤ = −ΔJ and ΔA⊤i = ΔAi with Ai+ΔAi ≥ 0 for i = 0,…,k. We have the following theorem characterizing the distance to the nearest singular or high index matrix polynomial.

###### Theorem 16

Let k ∈ ℕ and j ∈ {0,…,k} and consider a matrix polynomial

 P(λ)=−λjJ+k∑i=0λiAi.

with J⊤ = −J and A⊤i = Ai ≥ 0 for i = 0,…,k.

1. The following statements are equivalent:

1. the polynomial P(λ) is singular, i.e., det P(λ) ≡ 0;

2. the matrix −J+A0+⋯+Ak is singular;

3. the kernels of the matrices J, A0, …, Ak have a nontrivial intersection.

2. If P(λ) ∈ Pnk,j, then its distance to the set of singular matrix polynomials in Pnk,j equals the distance to the common kernel of the matrices J, A0, …, Ak, i.e.,

 dPsing(P(λ))=dSker(J,A0,…,Ak). (29)
3. If , then the closure of the set

 Ihi:={P(λ)∈Pnk,j∣∣P(λ)\rm is regular and has index greater than one} (30)

in Pnk,j is equal to

 (31)

or to

If or , then is empty.

4. Let P(λ) ∈ Pnk,j. Then the distance of P(λ) to the set of higher index polynomials in Pnk,j equals the distance to the respective common kernel

 dPhi(P(λ))={dSker(0,Ak,Ak−1)if j1. (32)

If