Computation of the nearest structured matrix triplet with common null space

We study computational methods for the distance to singularity, the distance to the nearest high-index problem, and the distance to instability for linear differential-algebraic systems (DAEs) with dissipative Hamiltonian structure. While for general unstructured DAEs the characterization of these distances is very difficult and partially open, it has recently been shown that for dissipative Hamiltonian systems and related matrix pencils explicit characterizations exist. We use these characterizations to develop computational methods that compute these distances by following the flow of a differential equation converging to the smallest perturbation that destroys the property of regularity, index one, or stability.


1 Introduction

We derive computational methods for determining the distance to singularity, the distance to the nearest high-index problem, and the distance to instability for linear, time-invariant differential-algebraic systems (DAEs) with dissipative Hamiltonian structure (dHDAEs). Such systems arise as linearizations of general dHDAEs along a stationary solution and have the form

$E\dot{x} = (J - R)\,x + f,$   (1)

with constant coefficient matrices $E, J, R \in \mathbb{R}^{n,n}$, where $J = -J^T$ is skew-symmetric and $E = E^T$, $R = R^T$ are symmetric positive semidefinite, a differentiable state function $x$, and a right-hand side $f$; see [5, 18, 24, 31, 32, 33, 37, 40, 38, 39] for slightly varying definitions and a detailed analysis of such systems, also in the context of the more general port-Hamiltonian systems. The matrix $E$ is associated with the Hessian of the associated Hamiltonian energy function, which in the quadratic case has the form $\mathcal{H}(x) = \tfrac12 x^T E x$. It is well known [5, 33, 40] that dHDAEs satisfy a dissipation inequality, $\tfrac{d}{dt}\mathcal{H}(x(t)) \le 0$ for $f = 0$.
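The following minimal sketch (ours, not from the paper) constructs a random triple $(E, J, R)$ with this structure and checks the dissipation inequality numerically; it assumes $E$ invertible so that (1) reduces to an ODE, and takes $f = 0$.

```python
# Hedged illustration: random dH triple and numerical check of the
# dissipation inequality dH/dt <= 0 (assuming E invertible and f = 0).
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)); J = A - A.T               # skew-symmetric
B = rng.standard_normal((n, n)); R = B @ B.T               # symmetric PSD
C = rng.standard_normal((n, n)); E = C @ C.T + np.eye(n)   # symmetric PD (ODE case)

Einv = np.linalg.inv(E)
sol = solve_ivp(lambda t, x: Einv @ ((J - R) @ x), (0.0, 5.0),
                rng.standard_normal(n), dense_output=True)

H = lambda x: 0.5 * x @ E @ x                              # quadratic Hamiltonian
vals = np.array([H(sol.sol(t)) for t in np.linspace(0.0, 5.0, 50)])
print("H(0) =", vals[0], " H(T) =", vals[-1])
print("max increase between samples (~solver error):", np.diff(vals).max())
```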

Such dHDAE systems arise in all areas of science and engineering [4, 5, 14, 33, 40] as linearizations, space discretizations, or approximations of physical systems, and they are usually model descriptions subject to uncertainties. It is therefore important to know whether the model is close to an ill-posed or badly formulated model; this has recently been an important research topic, see [1, 3, 6, 18, 19, 20, 28, 29, 32, 34]. Since the system properties of (1) are characterized by investigating the corresponding dissipative Hamiltonian (dH) matrix pencil

$L(\lambda) = \lambda E - (J - R),$   (2)

the discussed nearness problems can be characterized by determining the distance to the nearest singular pencil, i.e., a pencil with determinant identically zero; the distance to the nearest high-index problem, i.e., a problem with Jordan blocks of size bigger than one associated to the eigenvalue $\infty$; or the nearest problem on the boundary of the unstable region, i.e., a problem with purely imaginary eigenvalues. Computing these distances is very difficult for general linear systems [7, 8, 10, 18, 19, 20, 22, 30]. However, if one restricts the perturbations to be structured, i.e., one considers structured distances within the class of linear time-invariant dHDAEs, the situation changes completely, see [18, 19, 20, 31, 32], and one obtains very elegant characterizations that can be used in numerical methods to compute these distances.
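For a concrete check of such spectral properties, the finite eigenvalues of the pencil (2) can be computed with a generalized eigensolver. The following sketch is our illustration (variable names are ours); for a dH pencil all finite eigenvalues should lie in the closed left half-plane.

```python
# Hedged illustration: generalized eigenvalues of the dH pencil lambda*E - (J - R).
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)); J = A - A.T    # skew-symmetric
B = rng.standard_normal((n, n)); R = B @ B.T    # symmetric PSD
C = rng.standard_normal((n, n)); E = C @ C.T    # symmetric PSD

lam = eig(J - R, E, right=False)                # solves (J - R) v = lambda E v
finite = lam[np.isfinite(lam)]
print("max real part of finite eigenvalues:", finite.real.max())  # <= 0 up to roundoff
```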

These methods are usually based on non-convex optimization approaches. In contrast to such approaches, we derive computational methods to compute these structured distances by following the flow of a differential equation. This approach has been shown to be extremely effective for computing the distance to singularity for general matrix pencils [22] and we will show that this holds even more so in the structured case.

Neither the methods based on non-convex optimization nor the methods based on following a flow for general or structured pencils are really feasible for large-scale problems. To treat the large sparse case, they have to be combined with projections onto the sparsity structure and with model reduction methods, see [2, 3], which intertwine the optimization step with model reduction via interpolation. Here we discuss only the small-scale case, but the combination with interpolation methods can be carried out in an analogous way as in [3].

The paper is organized as follows. In Section 2 we recall a few basic results about linear time-invariant dHDAE systems. In Section 3 we discuss optimization methods that are based on gradient flow computations. Since the cases of even and odd dimension are substantially different, in Section 4 we specialize these methods to the optimization problems associated with the three discussed distance problems for the case that the state dimension is odd, while in Section 6 we discuss the case that the state dimension is even. Since it is known that the optimal perturbations are rank-two matrices, in Section 5 we discuss, for the odd-size case, the special situation where the perturbation is restricted to be of rank at most two. In Section 8 we briefly discuss the iterative procedure for computing the optimal perturbation size $\varepsilon$ in the upper level of the two-level procedure. In all cases, we present numerical examples.

2 Preliminaries

We use the following notation. The set of symmetric (symmetric positive semidefinite) matrices in $\mathbb{R}^{n,n}$ is denoted by $\mathrm{Sym}(n)$ ($\mathrm{Sym}^{\ge 0}(n)$), and the skew-symmetric matrices in $\mathbb{R}^{n,n}$ by $\mathrm{Skew}(n)$. By $\|A\|_F$ we denote the Frobenius norm of a (possibly rectangular) matrix $A$; we extend this norm to matrix tuples via $\|(A_1,\dots,A_k)\|_F = \big(\sum_{i} \|A_i\|_F^2\big)^{1/2}$. For $A, B \in \mathbb{C}^{m,n}$, we denote by $\langle A, B\rangle = \operatorname{trace}(A^H B)$ the Frobenius inner product on $\mathbb{C}^{m,n}$, where $A^H$ is the conjugate transpose of $A$. The Euclidean norm in $\mathbb{C}^n$ is denoted by $\|\cdot\|$. By $\lambda_{\min}(A)$ we denote the smallest eigenvalue of a symmetric matrix $A$. The real and imaginary parts of a complex matrix $A$ are denoted by $\operatorname{Re}(A)$ and $\operatorname{Im}(A)$, respectively.

To characterize the properties of dHDAEs of the form (1), we exploit the Kronecker canonical form of the associated matrix pencil (2), see [17]. If $\mathcal{J}_k(\lambda_0)$ denotes the standard upper triangular Jordan block of size $k \times k$ associated with an eigenvalue $\lambda_0$ and $\mathcal{L}_k$ denotes the standard right Kronecker block of size $k \times (k+1)$, then for a pencil $\lambda E - A$ with $E, A \in \mathbb{C}^{n,n}$ there exist nonsingular matrices $U, V \in \mathbb{C}^{n,n}$ that transform the pencil to Kronecker canonical form,

$U(\lambda E - A)V = \operatorname{diag}\big(\lambda I - \mathcal{J}_{k_1}(\lambda_1), \dots, \lambda \mathcal{N}_{m_1} - I, \dots, \mathcal{L}_{\varepsilon_1}, \dots, \mathcal{L}_{\eta_1}^T, \dots\big),$   (3)

where the blocks $\lambda I - \mathcal{J}_{k_j}(\lambda_j)$ correspond to the finite eigenvalues $\lambda_j \in \mathbb{C}$, the blocks $\lambda \mathcal{N}_{m_j} - I$ with nilpotent $\mathcal{N}_{m_j} = \mathcal{J}_{m_j}(0)$ correspond to the eigenvalue $\infty$, and $\mathcal{L}_{\varepsilon_j}$, $\mathcal{L}_{\eta_j}^T$ are right and left Kronecker blocks.

For real matrices $E, A$ and real transformation matrices $U, V$, the blocks associated with pairs of conjugate complex eigenvalues are in real Jordan canonical form; the other blocks are the same. A real or complex eigenvalue is called semisimple if the largest associated Jordan block in the complex Jordan form has size one, and the sizes $\eta_j$ and $\varepsilon_j$ are called the left and right minimal indices of $\lambda E - A$, respectively. A pencil $\lambda E - A$ with $E, A \in \mathbb{C}^{n,n}$ is called regular if $\det(\lambda_0 E - A) \neq 0$ for some $\lambda_0 \in \mathbb{C}$; otherwise it is called singular. The roots of the polynomial $\det(\lambda E - A)$ are called the finite eigenvalues of $\lambda E - A$, and $\infty$ is an eigenvalue of $\lambda E - A$ if zero is an eigenvalue of the reversed pencil $\lambda A - E$. The size of the largest block associated with the eigenvalue $\infty$ is called the index of the pencil $\lambda E - A$.

The definition of stability for differential-algebraic systems varies in the literature. We call a pencil Lyapunov stable (asymptotically stable) if it is regular, all finite eigenvalues are in the closed (open) left half-plane, and the eigenvalues lying on the imaginary axis (including $\infty$) are semisimple [13]. Note that pencils with eigenvalues on the imaginary axis or at $\infty$ are on the boundary of the set of asymptotically stable pencils, and those with multiple, but semisimple, purely imaginary eigenvalues (including $\infty$) lie on the boundary of the set of Lyapunov stable pencils.

The following theorem summarizes some results of [31, 32] for real dH pencils; note that some of the results also hold in the complex case. Let $E, R \in \mathbb{R}^{n,n}$ be symmetric positive semidefinite and let $J \in \mathbb{R}^{n,n}$ be skew-symmetric. Then the following statements hold for the pencil $L(\lambda) = \lambda E - (J - R)$.

  1. If $\lambda_0$ is an eigenvalue of $L(\lambda)$, then $\operatorname{Re}(\lambda_0) \le 0$.

  2. If $\omega \in \mathbb{R}$ and $i\omega$ is an eigenvalue of $L(\lambda)$, then $i\omega$ is semisimple. Moreover, if the columns of $V$ form a basis of a regular deflating subspace of $L(\lambda)$ associated with $i\omega$, then $RV = 0$.

  3. The index of $L(\lambda)$ is at most two.

  4. All right and left minimal indices of $L(\lambda)$ are zero (if there are any).

  5. The pencil $L(\lambda)$ is singular if and only if $E$, $J$, and $R$ have a common null space, i.e., $\ker E \cap \ker J \cap \ker R \neq \{0\}$ (illustrated in the sketch below).
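Statement 5 can be checked numerically: if a common null vector is planted in $E$, $J$, and $R$, the determinant of the pencil vanishes for every $\lambda$. The construction below is ours, for illustration only.

```python
# Hedged illustration of statement 5: engineer a common null vector x and
# verify that det(lambda0*E - (J - R)) ~ 0 for random lambda0.
import numpy as np

rng = np.random.default_rng(2)
n = 5
x = rng.standard_normal(n); x /= np.linalg.norm(x)
P = np.eye(n) - np.outer(x, x)        # projector onto the complement of span{x}

A = rng.standard_normal((n, n))
J = P @ (A - A.T) @ P                 # skew-symmetric with J x = 0
B = rng.standard_normal((n, n))
R = P @ (B @ B.T) @ P                 # symmetric PSD with R x = 0
C = rng.standard_normal((n, n))
E = P @ (C @ C.T) @ P                 # symmetric PSD with E x = 0

for _ in range(3):
    lam0 = complex(*rng.standard_normal(2))
    print(abs(np.linalg.det(lam0 * E - (J - R))))   # ~ 0: the pencil is singular
```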

Based on Theorem 2, in [32] the following distance problems were introduced for dH pencils. Let $\mathcal{D}_n$ denote the class of square real matrix pencils of the form (2). Then

  1. the structured distance to singularity is defined as

    $d_{\mathrm{sing}}(E,J,R) := \inf\big\{\|(\Delta_E,\Delta_J,\Delta_R)\|_F : \lambda(E+\Delta_E) - \big((J+\Delta_J) - (R+\Delta_R)\big) \in \mathcal{D}_n \text{ is singular}\big\};$   (4)
  2. the structured distance to the nearest high-index problem is defined as

    $d_{\mathrm{hi}}(E,J,R) := \inf\big\{\|(\Delta_E,\Delta_J,\Delta_R)\|_F : \lambda(E+\Delta_E) - \big((J+\Delta_J) - (R+\Delta_R)\big) \in \mathcal{D}_n \text{ has index greater than one}\big\};$   (5)
  3. the structured distance to instability is defined as

    $d_{\mathrm{inst}}(E,J,R) := \inf\big\{\|(\Delta_E,\Delta_J,\Delta_R)\|_F : \lambda(E+\Delta_E) - \big((J+\Delta_J) - (R+\Delta_R)\big) \in \mathcal{D}_n \text{ is not asymptotically stable}\big\}.$   (6)

    Here $\Delta_E, \Delta_R \in \mathbb{R}^{n,n}$ are symmetric, with $E + \Delta_E$ and $R + \Delta_R$ positive semidefinite, and $\Delta_J \in \mathbb{R}^{n,n}$ is skew-symmetric.

It has also been shown in [32] that these distances can be characterized as follows; a sketch of the resulting computable bounds is given after the list. Let $L(\lambda) = \lambda E - (J - R)$ be a dH pencil. Then the following statements hold.

  1. Define, for a matrix $K \in \mathbb{R}^{n,n}$ and a unit vector $x \in \mathbb{R}^n$, the matrix $K_x := Kxx^T + xx^TK - (x^TKx)\,xx^T$. The distance to singularity (4) is attained with a perturbation $\Delta_E = -E_x$, $\Delta_J = -J_x$, and $\Delta_R = -R_x$ for some $x \in \mathbb{R}^n$ with $\|x\| = 1$. It is given by

    $d_{\mathrm{sing}}(E,J,R) = \min_{\|x\|=1}\big(\|E_x\|_F^2 + \|J_x\|_F^2 + \|R_x\|_F^2\big)^{1/2}$

    and is bounded as

    $\sigma_{\min}\big([E^T\ J^T\ R^T]^T\big) \le d_{\mathrm{sing}}(E,J,R) \le \sqrt{2}\,\sigma_{\min}\big([E^T\ J^T\ R^T]^T\big).$   (7)
  2. The structured distance to higher index (5) and the structured distance to instability (6) coincide and satisfy

    $d_{\mathrm{hi}}(E,J,R) = d_{\mathrm{inst}}(E,J,R) = \min_{\|x\|=1}\big(\|E_x\|_F^2 + \|R_x\|_F^2\big)^{1/2},$

    and are bounded as

    $\sigma_{\min}\big([E^T\ R^T]^T\big) \le d_{\mathrm{hi}}(E,J,R) \le \sqrt{2}\,\sigma_{\min}\big([E^T\ R^T]^T\big).$   (8)
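The bounds (7) and (8) are cheap to evaluate: they only require the smallest singular value of the stacked matrices. A minimal sketch (function names are ours; the constants reflect our reading of [32]):

```python
# Hedged sketch of the computable bounds (7)-(8) via sigma_min of stacked matrices.
import numpy as np

def distance_bounds(E, J, R):
    s_sing = np.linalg.svd(np.vstack([E, J, R]), compute_uv=False)[-1]
    s_hi   = np.linalg.svd(np.vstack([E, R]),    compute_uv=False)[-1]
    return (s_sing, np.sqrt(2) * s_sing), (s_hi, np.sqrt(2) * s_hi)

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n)); J = A - A.T
B = rng.standard_normal((n, n)); R = B @ B.T
C = rng.standard_normal((n, n)); E = C @ C.T
(b_sing, b_hi) = distance_bounds(E, J, R)
print("d_sing in", b_sing, " d_hi = d_inst in", b_hi)
```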

With formulas and close upper and lower bounds available, these distances can be computed by global constrained optimization methods such as those in [36]. Based on our experience in computing the distance to instability for general matrix pencils, where different computational methods were studied and gradient flow methods proved extremely efficient, in the next section we introduce such gradient methods to compute the discussed structured distances.

3 ODE-based gradient flow approaches

In the previous section we have seen that for dH pencils the distance to singularity is characterized via the distance to the nearest common null space of three structured matrices, and the distance to high index and instability via the distance to the nearest common null space of two symmetric positive semidefinite matrices, with perturbations that preserve the structure.

The perturbation matrices that give the structured distance to singularity can be alternatively expressed as $\varepsilon(\Delta_E, \Delta_J, \Delta_R)$, where $\varepsilon > 0$ is the perturbation size and the normalized perturbation triple solves

$\min\ F_\varepsilon(\Delta_E, \Delta_J, \Delta_R)$   (9)
subj. to $\Delta_E = \Delta_E^T$, $\Delta_R = \Delta_R^T$, $\Delta_J = -\Delta_J^T$,
and $\|(\Delta_E, \Delta_J, \Delta_R)\|_F = 1.$

Then the distance is the smallest $\varepsilon$ for which the minimum of (9) is zero, and our algorithmic approach to minimize this functional is based on this reformulation; a sketch of the structure projection and normalization follows.
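A perturbation triple satisfying the constraints in (9) can be produced from arbitrary matrices by symmetrizing, skew-symmetrizing, and normalizing in the Frobenius norm of the tuple; the helper below is our sketch, not code from the paper.

```python
# Hedged sketch: project an arbitrary matrix triple onto the constraint set of (9).
import numpy as np

def project_structure(DE, DJ, DR):
    DE = 0.5 * (DE + DE.T)            # symmetric part
    DR = 0.5 * (DR + DR.T)
    DJ = 0.5 * (DJ - DJ.T)            # skew-symmetric part
    nrm = np.sqrt(sum(np.linalg.norm(D, 'fro')**2 for D in (DE, DJ, DR)))
    return DE / nrm, DJ / nrm, DR / nrm

rng = np.random.default_rng(4)
n = 5
DE, DJ, DR = project_structure(*(rng.standard_normal((n, n)) for _ in range(3)))
print(sum(np.linalg.norm(D, 'fro')**2 for D in (DE, DJ, DR)))   # = 1
```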

3.1 A two-level minimization

To determine the minimum in (9) we use a two-level minimization. As an inner iteration, for a fixed perturbation size $\varepsilon$, we consider perturbed matrices $E + \varepsilon\Delta_E$, $J + \varepsilon\Delta_J$, and $R + \varepsilon\Delta_R$ with $(\Delta_E, \Delta_J, \Delta_R)$ satisfying the constraints in (9). Let us denote (computations are sketched after the list)

  • by $(\lambda, x)$ an eigenvalue/eigenvector pair of $E + \varepsilon\Delta_E$ associated with the smallest eigenvalue, with $\|x\| = 1$;

  • by $(\mu, y)$ an eigenvalue/eigenvector pair of $R + \varepsilon\Delta_R$ associated with the smallest eigenvalue, with $\|y\| = 1$;

  • if $n$ is even, by $(i\eta, z)$ an eigenvalue/eigenvector pair of $J + \varepsilon\Delta_J$ such that $i\eta$ is the eigenvalue with smallest positive imaginary part, with $\|z\| = 1$;

  • if $n$ is odd, by $(0, z)$ an eigenvalue/eigenvector pair of $J + \varepsilon\Delta_J$ (a zero eigenvalue exists for every real skew-symmetric matrix of odd dimension).
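These quantities are directly computable with dense eigensolvers, as in the following sketch (ours): eigh returns eigenvalues in ascending order, and for odd $n$ the null vector of a real skew-symmetric matrix can be read off from its SVD.

```python
# Hedged sketch: the eigenpairs used in the inner iteration.
import numpy as np

def smallest_eigpair(S):                 # S symmetric
    w, V = np.linalg.eigh(S)             # ascending eigenvalues
    return w[0], V[:, 0]

def skew_null_vector(K):                 # K skew-symmetric, odd dimension
    U, s, Vt = np.linalg.svd(K)
    return Vt[-1]                        # right singular vector for sigma_min = 0

rng = np.random.default_rng(5)
n = 5                                    # odd
A = rng.standard_normal((n, n)); K = A - A.T
z = skew_null_vector(K)
print(np.linalg.norm(K @ z))             # ~ 0
```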

In the inner iteration, for any fixed $\varepsilon$ we compute a (local) minimizer of (9); the minimizer is, however, different for even and odd $n$.

The case that $n$ is odd

In this case the skew-symmetric matrix $J + \varepsilon\Delta_J$ always has a zero eigenvalue (with an associated real eigenvector $z$), so that the only contribution of $J$ to the optimization is through the alignment of $z$ with $x$ and $y$. Hence, the functional to be minimized in (9) can be expressed in the simplified form

(10)

with $\lambda$, $\mu$, $x$, $y$, and $z$ as defined above. It is, however, possible to include a further term in the functional, which does not change the solution but may have an impact on the conditioning of the problem and hence on the numerical performance.

The case that $n$ is even

In this case, when two eigenvalues $\pm i\eta$ ($\eta > 0$) coalesce at $0$, they form a semisimple double eigenvalue, and the associated eigenvectors $z$ and $\bar z$ form a two-dimensional null space spanned by the two real vectors $\operatorname{Re} z$ and $\operatorname{Im} z$. These can be assumed to be orthogonal to each other and to have the same norm, so that still $\|z\| = 1$. Using $\operatorname{Re} z$ and $\operatorname{Im} z$, we define a real orthogonal matrix $Z$, and to satisfy the constraint in (9), we require the minimizing eigenvectors to be compatible with this null space. This leads to an auxiliary minimization over the orthogonal factor; since $Z$ is orthogonal, the auxiliary solution is explicit, and the functional to be minimized takes a form which is positive if the relevant eigenvector does not lie in the range of $Z$ and zero otherwise.

In summary, the functional in the even case is given by

(11)

with $\lambda$, $\mu$, $x$, $y$, $\eta$, and $z$ as defined above.

Remark

In both the odd and the even case, the minimum of (9) over perturbation triples of Frobenius norm at most one coincides with the minimum over triples of unit Frobenius norm.

To see this, consider a triple of Frobenius norm less than or equal to one giving a minimizer of the left-hand side, and suppose that $(\lambda, x)$ is the minimizing eigenvalue/eigenvector pair of $E + \varepsilon\Delta_E$. Then, choosing a matrix $D$ such that $Dx = 0$ and $\langle D, \Delta_E\rangle = 0$, for a suitable scaling of $D$ the augmented triple is of unit Frobenius norm and attains the same value of the functional.

Using the functionals (10) and (11), respectively, in our approach the local minimizer of (9) is determined as an equilibrium point of the associated gradient system. Note, however, that in general this may not be a global minimizer.

For the outer iteration we consider a continuous branch, as a function of $\varepsilon$, of the minimizers $\Delta(\varepsilon)$ and vary $\varepsilon$ iteratively in order to find the smallest solution of the scalar equation

$f(\varepsilon) := F_\varepsilon(\Delta(\varepsilon)) = 0$

with respect to $\varepsilon$, as sketched below.

Figure 1: The function $f(\varepsilon)$ in a neighbourhood of the optimal perturbation size. For $\varepsilon$ beyond this value it becomes identically zero.
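Given the behaviour of $f$ sketched in Figure 1 (positive below the optimal $\varepsilon$, identically zero above it), the outer iteration can be realized, in its simplest form, by bisection on the transition point. The sketch below is ours; the procedure of Section 8 may combine this with faster (e.g. Newton-type) updates, and inner_minimizer is a hypothetical placeholder for the gradient flow of Section 3.1.

```python
# Hedged sketch of the outer iteration: locate the smallest eps with f(eps) = 0,
# assuming f > 0 below the distance and f = 0 (numerically) above it.
def smallest_root(f, eps_lo, eps_hi, tol=1e-8, f_tol=1e-10):
    while eps_hi - eps_lo > tol:
        mid = 0.5 * (eps_lo + eps_hi)
        if f(mid) > f_tol:
            eps_lo = mid       # f still positive: distance lies to the right
        else:
            eps_hi = mid       # f already ~0: move left
    return eps_hi

# hypothetical usage: d = smallest_root(lambda e: F(e, inner_minimizer(e)), 0.0, 1.0)
```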
Remark

Note that the techniques for the distance to higher index or instability follow directly by setting $\Delta_J = 0$, i.e., by not perturbing $J$.

3.2 Derivatives of eigenvalues and eigenvectors

The considered minimization is an eigenvalue optimization problem. We solve it by integrating a differential(-algebraic) equation whose trajectories follow the gradient descent and satisfy further constraints. To develop such a method, we first recall a classical result, see e.g. [25], for the derivative of a simple eigenvalue and an associated eigenvector of a matrix with respect to variations of the entries in a real parameter $t$. Here we use a dot to denote the derivative with respect to $t$.

[25, Section II.1.1] Consider a continuously differentiable matrix-valued function $C : \mathbb{R} \to \mathbb{C}^{n,n}$, with $C(t)$ normal (i.e., $C(t)C(t)^* = C(t)^*C(t)$ for all $t$). Let $\lambda(t)$ be a simple eigenvalue of $C(t)$ for all $t$ and let $x(t)$ with $\|x(t)\| = 1$ be the associated (right and left) eigenvector. Then $\lambda(t)$ is differentiable with

$\dot\lambda = x^* \dot C\, x.$   (12)

For $E = E^T \in \mathbb{R}^{n,n}$, consider a perturbation matrix $\varepsilon\Delta_E(t)$ that depends on a real parameter $t$. By Lemma 3.2, for a simple eigenvalue $\lambda(t)$ of $E + \varepsilon\Delta_E(t)$ with associated eigenvector $x(t)$, $\|x(t)\| = 1$, we have (omitting the dependence on $t$)

$\dot\lambda = \varepsilon\, x^T \dot\Delta_E\, x.$   (13)

Similarly, and needed in the case that $n$ is even: if, for all $t$, $i\eta(t)$ (with $\eta(t) > 0$) is a simple eigenvalue of the matrix-valued function $J + \varepsilon\Delta_J(t)$, with associated eigenvector $z(t)$, $\|z(t)\| = 1$, then we have

$\dot\eta = \varepsilon\, \operatorname{Im}\big(z^* \dot\Delta_J\, z\big).$   (14)
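Formula (13), as reconstructed here, is easy to validate by finite differences; the following sketch (with an arbitrary symmetric path $\Delta_E(t)$ of our choosing) compares the analytic and numerical derivatives of the smallest eigenvalue.

```python
# Hedged sketch: finite-difference check of (13), lambda' = eps * x^T dDelta_E x.
import numpy as np

rng = np.random.default_rng(6)
n, eps, h = 5, 0.3, 1e-6
C = rng.standard_normal((n, n)); E = C @ C.T
A, B = (rng.standard_normal((n, n)) for _ in range(2))
D  = lambda t: 0.5 * ((A + t * B) + (A + t * B).T)     # symmetric path Delta_E(t)
Dd = 0.5 * (B + B.T)                                   # its derivative

lam = lambda t: np.linalg.eigh(E + eps * D(t))[0][0]   # smallest eigenvalue
w, V = np.linalg.eigh(E + eps * D(0.0)); x = V[:, 0]
print((lam(h) - lam(-h)) / (2 * h), eps * x @ Dd @ x)  # the two values should agree
```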

To derive the gradient system associated with our optimization problem, we make use of the following definition. Let $M \in \mathbb{C}^{n,n}$ be a singular matrix with a simple zero eigenvalue. The group inverse (reduced resolvent) of $M$ is the unique matrix $M^\#$ satisfying

$M^\# M M^\# = M^\#, \qquad M M^\# M = M, \qquad M M^\# = M^\# M.$

It is well known, see [35], that for a singular and normal matrix $M$ with simple eigenvalue zero, its group inverse is equal to the Moore-Penrose pseudoinverse $M^\dagger$. We have the following lemma.

[35, Theorem 2] Consider a sufficiently often differentiable matrix function $C : \mathbb{R} \to \mathbb{C}^{n,n}$. Let $\lambda(t)$ be a simple eigenvalue of $C(t)$ for all $t$ and let $x(t)$, with $\|x(t)\| = 1$, be the associated right eigenvector function. Moreover, let $M(t) = C(t) - \lambda(t) I$ and let $M(t)^\#$ be the group inverse of $M(t)$. Then $x(t)$ satisfies the system of differential equations

$\dot x = -M^\# \dot C\, x.$   (15)

Moreover, if $C(t)$ is pointwise normal, then

$\dot x = -M^\dagger \dot C\, x.$   (16)
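Since the matrices arising here (symmetric and skew-symmetric) are normal, the group inverse needed in (15)-(16) can in practice be computed as a Moore-Penrose pseudoinverse. The sketch below (ours) verifies the three defining identities for an odd-dimensional skew-symmetric matrix, which is singular by construction:

```python
# Hedged sketch: group inverse of a normal singular matrix via numpy.linalg.pinv.
import numpy as np

rng = np.random.default_rng(7)
n = 5
A = rng.standard_normal((n, n)); K = A - A.T   # skew-symmetric, odd n -> singular, normal
Ksharp = np.linalg.pinv(K)

print(np.linalg.norm(K @ Ksharp @ K - K),            # M M^# M = M
      np.linalg.norm(Ksharp @ K @ Ksharp - Ksharp),  # M^# M M^# = M^#
      np.linalg.norm(K @ Ksharp - Ksharp @ K))       # M M^# = M^# M
```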

After these preparations, in the following sections we determine the associated gradient systems for the functionals (10) and (11), respectively.

4 Gradient flow, odd state dimension

In this section we consider the case that the state dimension $n$ is odd and construct the gradient system optimization algorithm for the functional (10).

4.1 Computation of the gradient

The functional in (10) has several parts. Applying Lemma 3.2 to the perturbations $\varepsilon\Delta_E$ of $E$ and $\varepsilon\Delta_R$ of $R$, the gradient of the part involving the smallest eigenvalues $\lambda$ and $\mu$ is obtained from the derivative expressions (13) for $\dot\lambda$ and $\dot\mu$. Considering orthogonal projections with respect to the Frobenius inner product onto the matrix manifold of symmetric matrices, we identify the constrained gradient directions of these terms as proportional to $xx^T$ and $yy^T$, respectively. (Here and below, $\propto$ denotes proportionality.) In order to treat the other terms, we observe that they depend on the eigenvectors $x$ and $y$, whose derivatives are given by Lemma 3.2; this yields expressions involving $(E + \varepsilon\Delta_E - \lambda I)^\dagger$, the pseudoinverse of $E + \varepsilon\Delta_E - \lambda I$, and $(R + \varepsilon\Delta_R - \mu I)^\dagger$, the pseudoinverse of $R + \varepsilon\Delta_R - \mu I$.

Since $n$ is odd, which means that (generically) $0$ is a simple eigenvalue of $J + \varepsilon\Delta_J$, for the last term of (10) the derivative of the null vector $z$ is obtained analogously, involving $(J + \varepsilon\Delta_J)^\dagger$, the pseudoinverse of $J + \varepsilon\Delta_J$.

4.2 The gradient system of ODEs for the flow in the odd case

In order to compute the steepest descent direction, we differentiate the functional $F_\varepsilon$ along the flow and collect the summands involving $\dot\Delta_E$, $\dot\Delta_J$, and those involving $\dot\Delta_R$. Letting $G_E$, $G_J$, and $G_R$ denote the corresponding coefficient matrices,

(17)

we have

$\frac{d}{dt} F_\varepsilon(\Delta(t)) = \varepsilon\big(\langle G_E, \dot\Delta_E\rangle + \langle G_J, \dot\Delta_J\rangle + \langle G_R, \dot\Delta_R\rangle\big),$   (18)

where we have used the structural properties of $\Delta_E$, $\Delta_R$ (symmetric) and $\Delta_J$ (skew-symmetric) and the property that for real matrices $A$ and $B$, $\langle A, B\rangle = \langle A^T, B^T\rangle$. Equation (18) identifies the gradient of the functional,

$\nabla F_\varepsilon(\Delta) \propto (G_E, G_J, G_R).$   (19)

Since we want to impose a norm constraint on the perturbation, we need the following result.

[Direction of steepest admissible descent] Let $G, \Delta \in \mathbb{R}^{n,n}$ with $\|\Delta\|_F = 1$. A solution of the optimization problem

$\min_{D} \ \langle G, D\rangle$
subj. to $\|D\|_F = 1$ and $\langle \Delta, D\rangle = 0$

is given by

$D^\star = \nu\big({-G} + \langle G, \Delta\rangle\,\Delta\big),$   (22)

where $\nu$ is the reciprocal of the Frobenius norm of the matrix in parentheses on the right-hand side. The solution is unique if $G$ is not a multiple of $\Delta$. The result follows on noting that the function to minimize is a real inner product on $\mathbb{R}^{n,n}$, and the real inner product with a given vector (which here is a matrix) is minimized over a subspace by orthogonally projecting the vector onto that subspace. The expression in (22) is, up to scaling, the orthogonal projection of $-G$ onto the tangent space at $\Delta$ of the manifold of matrices of unit Frobenius norm.
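In code, (22) is a one-line projection; the sketch below (our helper names) normalizes the projection of $-G$ onto the tangent space of the unit Frobenius sphere at $\Delta$ and checks the two constraints.

```python
# Hedged sketch of (22): constrained steepest descent direction.
import numpy as np

def steepest_direction(G, Delta):
    inner = np.trace(G.T @ Delta)              # Frobenius inner product <G, Delta>
    D = -(G - inner * Delta)                   # project -G onto the tangent space
    return D / np.linalg.norm(D, 'fro')

rng = np.random.default_rng(8)
G = rng.standard_normal((5, 5))
Delta = rng.standard_normal((5, 5)); Delta /= np.linalg.norm(Delta, 'fro')
D = steepest_direction(G, Delta)
print(np.trace(D.T @ Delta), np.linalg.norm(D, 'fro'))   # ~ 0 and 1
```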

Taking into consideration the projection, with respect to the Frobenius inner product, of the vector field onto the manifolds of symmetric and skew-symmetric matrices, this leads to the system of differential equations for the perturbation matrices

$\dot\Delta_E = -\operatorname{Sym}(G_E) + \kappa\,\Delta_E, \qquad \dot\Delta_J = -\operatorname{Skew}(G_J) + \kappa\,\Delta_J, \qquad \dot\Delta_R = -\operatorname{Sym}(G_R) + \kappa\,\Delta_R,$   (23)

where, for $A \in \mathbb{R}^{n,n}$, $\operatorname{Sym}(A) = \tfrac12(A + A^T)$, $\operatorname{Skew}(A) = \tfrac12(A - A^T)$, and

$\kappa = \langle \operatorname{Sym}(G_E), \Delta_E\rangle + \langle \operatorname{Skew}(G_J), \Delta_J\rangle + \langle \operatorname{Sym}(G_R), \Delta_R\rangle$

is used to ensure norm conservation, i.e. $\|(\Delta_E(t), \Delta_J(t), \Delta_R(t))\|_F = 1$ for all $t$; a one-step integration sketch is given below.
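A minimal explicit Euler step for (23) might look as follows; the gradient blocks GE, GJ, GR stand in for (17), which is not reproduced in this excerpt, and the final renormalization compensates for the discretization error in the norm constraint. This is a sketch under our reading of (23), not the authors' implementation.

```python
# Hedged sketch: one explicit Euler step of the structured gradient flow (23).
import numpy as np

sym  = lambda A: 0.5 * (A + A.T)
skew = lambda A: 0.5 * (A - A.T)
fro  = lambda A, B: np.trace(A.T @ B)          # Frobenius inner product

def euler_step(D, G, h):
    DE, DJ, DR = D
    PE, PJ, PR = sym(G[0]), skew(G[1]), sym(G[2])     # structure-projected gradients
    kappa = fro(PE, DE) + fro(PJ, DJ) + fro(PR, DR)   # keeps ||Delta||_F = 1 in exact arithmetic
    DE = DE + h * (-PE + kappa * DE)
    DJ = DJ + h * (-PJ + kappa * DJ)
    DR = DR + h * (-PR + kappa * DR)
    nrm = np.sqrt(sum(np.linalg.norm(X, 'fro')**2 for X in (DE, DJ, DR)))
    return DE / nrm, DJ / nrm, DR / nrm               # renormalize after the step
```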

Let $\Delta(t) = (\Delta_E(t), \Delta_J(t), \Delta_R(t))$ of unit Frobenius norm satisfy the differential equation (23). If $\lambda(t)$ is a simple eigenvalue of $E + \varepsilon\Delta_E(t)$ (and analogously for the other terms of the functional), then

$\frac{d}{dt} F_\varepsilon(\Delta(t)) \le 0.$   (24)

The result follows directly from the fact that (23) is a constrained gradient system.

In this way we have preserved the symmetry of $\Delta_E$, $\Delta_R$ and the skew-symmetry of $\Delta_J$. It may happen, however, that along the solution trajectory of (23), due to the projection onto the matrix manifolds, the smallest eigenvalue of $E + \varepsilon\Delta_E$ and/or the smallest eigenvalue of $R + \varepsilon\Delta_R$ becomes negative. In this case the perturbed system is no longer a dissipative Hamiltonian system. This, however, is in general not an issue for the optimization algorithm, since the dynamical gradient system makes the eigenvalues $\lambda$, $\mu$ as small as possible for a given $\varepsilon$ and thus drives them to zero, so that in the limiting situation the positive semidefiniteness of $E + \varepsilon\Delta_E$ and $R + \varepsilon\Delta_R$ also holds.

4.3 Stationary points of (23) and low-rank property

In this subsection we discuss the existence of stationary points of the solution trajectory of (23). Let $\varepsilon > 0$ be fixed and let $\Delta = (\Delta_E, \Delta_J, \Delta_R)$ have unit Frobenius norm. Let $\lambda$ be a simple eigenvalue of $E + \varepsilon\Delta_E$ with associated normalized eigenvector $x$, let $\mu$ be a simple eigenvalue of $R + \varepsilon\Delta_R$ with associated normalized eigenvector $y$, and let $0$ be a simple eigenvalue of $J + \varepsilon\Delta_J$ with associated normalized eigenvector $z$. Then, in the generic situation, i.e., if $\lambda \neq 0$, $\mu \neq 0$, and the vectors $x$, $y$, $z$ are pairwise not aligned, we have

$G_E \neq 0, \qquad G_J \neq 0, \qquad G_R \neq 0.$   (25)

The proofs for the three cases are similar.

  • Exploiting the property that the group inverse annihilates the associated eigenvector, see [35], we obtain that the eigenvector-derivative contribution to $G_E$ is orthogonal to $xx^T$. If we had $G_E = 0$, this would imply $\lambda = 0$, and thus, since $\lambda \neq 0$ by assumption, we get a contradiction.

  • Exploiting the analogous property for $R + \varepsilon\Delta_R$, we obtain that if we had $G_R = 0$, then $\mu = 0$, and again we have a contradiction.

  • Having assumed that the vectors are pairwise not aligned, the rank-two terms forming $G_J$ cannot cancel; as a consequence $G_J \neq 0$.

Using Lemma 4.3, we have the following characterization of stationary points. Let $\Delta(t) = (\Delta_E(t), \Delta_J(t), \Delta_R(t))$ of unit Frobenius norm satisfy the differential equation (23). Moreover, suppose that $F_\varepsilon(\Delta(t)) > 0$ for all $t$, and that $\lambda(t)$ is a simple eigenvalue of $E + \varepsilon\Delta_E(t)$ with normalized eigenvector $x(t)$, that $\mu(t)$ is a simple eigenvalue of $R + \varepsilon\Delta_R(t)$ with associated eigenvector $y(t)$, and that $J + \varepsilon\Delta_J(t)$ has a null vector $z(t)$.

Then the following are equivalent (here we omit the argument $t$):

  • $\frac{d}{dt} F_\varepsilon(\Delta(t)) = 0$;

  • $\dot\Delta_E = 0$, $\dot\Delta_J = 0$, $\dot\Delta_R = 0$;

  • $\Delta_E$ is a multiple of the rank-$2$ matrix $\operatorname{Sym}(G_E)$; $\Delta_J$ is a multiple of the rank-$2$ matrix $\operatorname{Skew}(G_J)$; $\Delta_R$ is a multiple of the rank-$2$ matrix $\operatorname{Sym}(G_R)$, with $G_E$, $G_J$, $G_R$ given by (17).

The proof follows directly by equating to zero the right-hand side of (23) and by Lemma 4.3, which prevents the matrices $G_E$, $G_J$, $G_R$ from being zero.

We also have the following extremality property. Consider the functional (10) and suppose that $F_\varepsilon(\Delta) > 0$. Let $\Delta = (\Delta_E, \Delta_J, \Delta_R)$ with $\|\Delta\|_F = 1$. Let $\lambda$ be a simple eigenvalue of $E + \varepsilon\Delta_E$ with associated eigenvector $x$, let $\mu$ be a simple eigenvalue of $R + \varepsilon\Delta_R$ with associated eigenvector $y$, and let $J + \varepsilon\Delta_J$ have a null vector $z$. Then the following are equivalent:

  • (i) Every differentiable path $\Delta(\vartheta)$ (for small $\vartheta \ge 0$) with the properties that $\|\Delta(\vartheta)\|_F \le 1$, that both $\lambda(\vartheta)$ and $\mu(\vartheta)$ are simple eigenvalues of $E + \varepsilon\Delta_E(\vartheta)$ and $R + \varepsilon\Delta_R(\vartheta)$, with associated eigenvectors $x(\vartheta)$ and $y(\vartheta)$, respectively, for which $z(\vartheta)$ is the null vector of $J + \varepsilon\Delta_J(\vartheta)$, and such that $\Delta(0) = \Delta$, satisfies

$\frac{d}{d\vartheta} F_\varepsilon(\Delta(\vartheta))\Big|_{\vartheta=0} \ge 0.$

  • (ii) The matrix $\Delta_E$ is a multiple of the rank-two matrix $\operatorname{Sym}(G_E)$, $\Delta_J$ is a multiple of the rank-two matrix $\operatorname{Skew}(G_J)$, and $\Delta_R$ is a multiple of the rank-two matrix $\operatorname{Sym}(G_R)$, with $G_E$, $G_J$, $G_R$ given by (17).

Lemma 4.3 ensures that the gradient does not vanish.

Assume that (i) does not hold. Then there exists a path through $\Delta$ along which the derivative of $F_\varepsilon$ is negative. The steepest descent gradient property shows that the solution path of (23) passing through $\Delta$ is also such a path.

Hence $\Delta$ is not a stationary point of (23), and Theorem 4.3 then yields that (ii) does not hold.

Conversely, if (ii) does not hold, then $\Delta$ is not a stationary point of (23), and Theorems 4.3 and 4.2 yield that $\frac{d}{dt}F_\varepsilon(\Delta(t)) < 0$ along the solution path of (23).

4.4 Sparsity preservation

If the matrices $E$, $J$, and $R$ have a given sparsity pattern, then we may include as a constraint that the perturbations do not alter this sparsity structure. In terms of the Frobenius norm, it is immediate to obtain the constrained gradient system: denoting by $\Pi_E$, $\Pi_J$, and $\Pi_R$, respectively, the orthogonal projections onto the manifolds of sparse matrices with the given sparsity pattern and structure of $E$, $J$, and $R$, we get

(26)

where the projected gradient blocks replace $\operatorname{Sym}(G_E)$, $\operatorname{Skew}(G_J)$, $\operatorname{Sym}(G_R)$ in (23) and the normalization term $\kappa$ is adapted accordingly; a small sketch of such a projection follows.
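The projections in (26) amount to entrywise masking composed with the structure projections; for a symmetric sparsity pattern the two operations commute. A small sketch (ours):

```python
# Hedged sketch: orthogonal projection onto symmetric matrices with a fixed
# (symmetric) sparsity pattern, as used in our reading of (26).
import numpy as np

def sparse_sym_projection(A, pattern):
    # pattern: boolean mask of allowed nonzeros (assumed symmetric)
    return 0.5 * (A + A.T) * pattern

rng = np.random.default_rng(9)
n = 5
E = np.diag(rng.standard_normal(n))**2     # e.g. a diagonal sparsity pattern
pattern = (E != 0)
G = rng.standard_normal((n, n))
print(sparse_sym_projection(G, pattern))   # symmetric and pattern-preserving
```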

Having derived these formulas, in the next subsection we illustrate the properties of the optimization procedure with a numerical example.

4.5 A numerical example

Let $n$ be odd and consider the randomly generated matrices

Running the two-level iteration with an initial value of the functional, we find a perturbation at a distance (rounded to four digits) with a common null space given by the vector

and the computed perturbations are given by