We derive computational methods for determining the distance to singularity, the distance to the nearest high-index problem, and the distance to instability for linear, time-invariant differential-algebraic systems (DAEs) with dissipative Hamiltonian structure (dHDAEs). Such systems arise as linearizations of general dHDAEs along a stationary solution and have the form

$E\dot{x} = (J-R)x + f$,   (1)

with constant coefficient matrices $E, R \in \mathbb{R}^{n,n}$ symmetric positive semidefinite and $J \in \mathbb{R}^{n,n}$ skew-symmetric, a differentiable state function $x$ and a right-hand side $f$, see [5, 18, 24, 31, 32, 33, 37, 38, 39, 40] for slightly varying definitions and a detailed analysis of such systems, also in the context of the more general port-Hamiltonian systems. The matrix $E$ is associated with the Hessian of the associated Hamiltonian energy function, which in the quadratic case has the form $\mathcal{H}(x) = \frac{1}{2} x^{\mathsf T} E x$. It is well known [5, 33, 40] that pHDAEs satisfy a dissipation inequality; in particular, $\frac{d}{dt}\mathcal{H}(x(t)) \leq 0$ for $f \equiv 0$.
Such pHDAE systems arise in all areas of science and engineering [4, 5, 14, 33, 40] as linearizations, space discretizations, or approximations of physical systems, and they are usually model descriptions with uncertainties. It is therefore important to know whether the model is close to an ill-posed or badly formulated model, and this has been an important research topic recently, see [1, 3, 6, 18, 19, 20, 28, 29, 32, 34]. Since the system properties of (1) are characterized by investigating the corresponding dissipative Hamiltonian (dH) matrix pencil

$P(\lambda) = \lambda E - (J - R)$,   (2)

the discussed nearness problems can be characterized by determining the distance to the nearest singular pencil, i.e., a pencil with determinant identically zero, the distance to the nearest high-index problem, i.e., a problem with Jordan blocks of size bigger than one associated with the eigenvalue $\infty$, or the nearest problem on the boundary of the unstable region, i.e., a problem with purely imaginary eigenvalues. To compute these distances is very difficult for general linear systems [7, 8, 10, 18, 19, 20, 22, 30]. However, if one restricts the perturbations to be structured, i.e., one considers structured distances within the class of linear time-invariant dHDAEs, the situation changes completely, see [18, 19, 20, 31, 32], and one obtains very elegant characterizations that can be used in numerical methods to compute these distances.
These methods are usually based on non-convex optimization approaches. In contrast to such approaches, we derive computational methods to compute these structured distances by following the flow of a differential equation. This approach has been shown to be extremely effective for computing the distance to singularity for general matrix pencils, and we will show that this holds even more so in the structured case.
Neither the methods based on non-convex optimization nor the methods based on following a flow for general or structured pencils are really feasible for large-scale problems. To treat the large sparse case, they have to be combined with projections onto the sparsity structure and model reduction methods, see [2, 3], which intertwine the optimization step with model reduction via interpolation. Here we discuss only the small-scale case, but the combination with interpolation methods can be carried out in an analogous way.
In Section 3 we discuss optimization methods that are based on gradient flow computations. Since the cases of even and odd dimension are substantially different, in Section 4 we specialize these methods for the optimization problems associated with the three discussed distance problems for the case that the state dimension is odd, while in Section 6 we discuss the case that the state dimension is even. Since it is known that the optimal perturbations are rank-two matrices, in Section 5, for the odd-size case, we discuss the special situation that the perturbation is restricted to be of rank at most two. In Section 8 we briefly discuss the iterative procedure for computing the optimal perturbation size in the upper level of the two-level procedure. In all cases, we present numerical examples.
We use the following notation. The set of symmetric (symmetric positive semidefinite) matrices in $\mathbb{R}^{n,n}$ is denoted by $\mathrm{Sym}(n)$ ($\mathrm{Sym}^+(n)$), and the set of skew-symmetric matrices in $\mathbb{R}^{n,n}$ by $\mathrm{Skew}(n)$. By $\|A\|_F$ we denote the Frobenius norm of a (possibly rectangular) matrix $A$; we extend this norm to matrix tuples via $\|(A,B,C)\|_F = \big(\|A\|_F^2 + \|B\|_F^2 + \|C\|_F^2\big)^{1/2}$. For $A, B \in \mathbb{C}^{m,n}$, we denote by
$\langle A, B \rangle = \mathrm{Re}\,\mathrm{trace}(A^* B)$
the Frobenius inner product on $\mathbb{C}^{m,n}$, where $A^*$ is the conjugate transpose of $A$. The Euclidean norm in $\mathbb{C}^n$ is denoted by $\|\cdot\|$. By $\lambda_{\min}(A)$ we denote the smallest eigenvalue of a symmetric matrix $A$. The real and imaginary parts of a complex matrix $A$ are denoted by $\mathrm{Re}(A)$ and $\mathrm{Im}(A)$, respectively.
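As a small numerical illustration of this notation (with arbitrary example matrices, not data from the paper), the Frobenius inner product and the tuple norm can be evaluated as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((3, 4))

# Frobenius inner product <A, B> = Re(trace(A^* B)); the matrices here are real.
inner = np.trace(A.T @ B)

# Frobenius norm of A, and its extension to the matrix tuple (A, B, C).
norm_A = np.linalg.norm(A, 'fro')
tuple_norm = np.sqrt(norm_A**2
                     + np.linalg.norm(B, 'fro')**2
                     + np.linalg.norm(C, 'fro')**2)

# Consistency check: <A, A> equals ||A||_F^2.
assert np.isclose(np.trace(A.T @ A), norm_A**2)
```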
To characterize the properties of dHDAEs of the form (1), we exploit the Kronecker canonical form of the associated matrix pencil (2). If $\mathcal{J}_k(\lambda_0)$ denotes the standard upper triangular Jordan block of size $k \times k$ associated with an eigenvalue $\lambda_0$ and $\mathcal{L}_k$ denotes the standard right Kronecker block of size $k \times (k+1)$, i.e.,

$\mathcal{L}_k = \lambda \begin{bmatrix} 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{bmatrix} - \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \end{bmatrix},$

then for every pencil $\lambda E - A$ with $E, A \in \mathbb{C}^{m,n}$ there exist nonsingular matrices $U \in \mathbb{C}^{m,m}$ and $V \in \mathbb{C}^{n,n}$ that transform the pencil to Kronecker canonical form,

$U(\lambda E - A)V = \mathrm{diag}\big(\mathcal{L}_{\varepsilon_1}, \ldots, \mathcal{L}_{\varepsilon_p},\ \mathcal{L}_{\eta_1}^{\mathsf T}, \ldots, \mathcal{L}_{\eta_q}^{\mathsf T},\ \lambda I - \mathcal{J}_{\rho_1}(\lambda_1), \ldots, \lambda I - \mathcal{J}_{\rho_r}(\lambda_r),\ \lambda \mathcal{J}_{\sigma_1}(0) - I, \ldots, \lambda \mathcal{J}_{\sigma_s}(0) - I\big),$

where $\lambda_1, \ldots, \lambda_r \in \mathbb{C}$ and $\rho_j, \sigma_j \geq 1$, as well as $\varepsilon_i \geq 0$ for $i = 1, \ldots, p$ and $\eta_i \geq 0$ for $i = 1, \ldots, q$.
For real matrices and real transformation matrices $U, V$, the blocks $\mathcal{J}_{\rho_j}(\lambda_j)$ with $\lambda_j \in \mathbb{C} \setminus \mathbb{R}$ are in real Jordan canonical form associated with the corresponding pair of complex conjugate eigenvalues; the other blocks are the same. A real or complex eigenvalue is called semisimple if the largest associated Jordan block in the complex Jordan form has size one, and the sizes $\eta_i$ and $\varepsilon_i$ are called the left and right minimal indices of $\lambda E - A$, respectively. A pencil $\lambda E - A$, $E, A \in \mathbb{C}^{n,n}$, is called regular if $\det(\lambda_0 E - A) \neq 0$ for some $\lambda_0 \in \mathbb{C}$, otherwise it is called singular; $\lambda_1, \ldots, \lambda_r$ are called the finite eigenvalues of $\lambda E - A$, and $\infty$ is an eigenvalue of $\lambda E - A$ if zero is an eigenvalue of the reversed pencil $\lambda A - E$. The size of the largest block $\lambda \mathcal{J}_{\sigma_j}(0) - I$ is called the index of the pencil $\lambda E - A$.
The definition of stability for differential-algebraic systems varies in the literature. We call a pencil Lyapunov stable (asymptotically stable) if it is regular, all finite eigenvalues are in the closed (open) left half-plane, and the eigenvalues lying on the imaginary axis (including $\infty$) are semisimple. Note that pencils with eigenvalues on the imaginary axis or at $\infty$ are on the boundary of the set of asymptotically stable systems, and those with multiple, but semisimple, purely imaginary eigenvalues (including $\infty$) lie on the boundary of the set of Lyapunov stable pencils.
The following theorem summarizes some results of [31, 32] for real dH pencils; note that some of the results also hold in the complex case. Let $E, R \in \mathbb{R}^{n,n}$ be symmetric and positive semidefinite, and let $J \in \mathbb{R}^{n,n}$ be skew-symmetric. Then the following statements hold for the pencil $P(\lambda) = \lambda E - (J - R)$.
If $\lambda_0 \in \mathbb{C}$ is an eigenvalue of $P(\lambda)$, then $\mathrm{Re}(\lambda_0) \leq 0$.
If $\omega \in \mathbb{R}$ and $\lambda_0 = i\omega$ is an eigenvalue of $P(\lambda)$, then $\lambda_0$ is semisimple. Moreover, if the columns of $V$ form a basis of a regular deflating subspace of $P(\lambda)$ associated with $\lambda_0$, then $RV = 0$.
The index of $P(\lambda)$ is at most two.
All right and left minimal indices of $P(\lambda)$ are zero (if there are any).
The pencil $P(\lambda)$ is singular if and only if $\ker E \cap \ker J \cap \ker R \neq \{0\}$.
the structured distance to singularity is defined as

$d_{\mathrm{sing}}(E,J,R) = \inf\{\, \|\Delta\|_F \;:\; \lambda(E+\Delta_E) - \big((J+\Delta_J) - (R+\Delta_R)\big) \text{ is singular} \,\},$   (4)

the structured distance to the nearest high-index problem is defined as

$d_{\mathrm{hi}}(E,J,R) = \inf\{\, \|\Delta\|_F \;:\; \lambda(E+\Delta_E) - \big((J+\Delta_J) - (R+\Delta_R)\big) \text{ has index greater than one} \,\},$   (5)

the structured distance to instability is defined as

$d_{\mathrm{inst}}(E,J,R) = \inf\{\, \|\Delta\|_F \;:\; \lambda(E+\Delta_E) - \big((J+\Delta_J) - (R+\Delta_R)\big) \text{ has a purely imaginary eigenvalue} \,\}.$   (6)

Here $\Delta = (\Delta_E, \Delta_J, \Delta_R)$, with $\Delta_E, \Delta_R \in \mathrm{Sym}(n)$, $\Delta_J \in \mathrm{Skew}(n)$, and $E + \Delta_E$, $R + \Delta_R$ symmetric positive semidefinite.
It has also been shown that these distances can be characterized as follows. Let $E, R \in \mathrm{Sym}^+(n)$ and $J \in \mathrm{Skew}(n)$. Then the following statements hold.
Define, for a unit vector $v \in \mathbb{R}^n$, the projection matrix $\Pi(v) = I - vv^{\mathsf T}$. The distance to singularity (4) is attained with a structured perturbation $(\Delta_E, \Delta_J, \Delta_R)$ satisfying $(E+\Delta_E)v = 0$, $(J+\Delta_J)v = 0$, and $(R+\Delta_R)v = 0$ for some $v$ with $\|v\| = 1$. It is given by an explicit minimization over all unit vectors $v$,
for which close upper and lower bounds are available.
With formulas and close upper and lower bounds available, these distances can be computed by global constrained optimization methods. Based on our experience in computing the distance to instability for general matrix pencils, where different computational methods were studied and it was shown that gradient flow methods are extremely efficient, in the next section we introduce such gradient methods to compute the discussed structured distances.
3 ODE-based gradient flow approaches
In the previous section we have seen that for dH pencils the distance to singularity is characterized by the distance to the nearest common nullspace of three structured matrices, and the distance to high index and instability by the distance to the nearest common nullspace of two symmetric positive semidefinite matrices, with perturbations that preserve the structure.
The perturbation matrices that give the structured distance to singularity can be alternatively expressed in terms of a perturbation size $\varepsilon > 0$ and a normalized perturbation $\Delta = (\Delta_E, \Delta_J, \Delta_R)$ of unit Frobenius norm. Our algorithmic approach to minimize the resulting functional is based on this reformulation.
3.1 A two-level minimization
To determine the minimum in (9) we use a two-level minimization. As an inner iteration, for a fixed perturbation size $\varepsilon$, we consider perturbed matrices $E + \varepsilon\Delta_E$, $J + \varepsilon\Delta_J$, and $R + \varepsilon\Delta_R$ with $\Delta = (\Delta_E, \Delta_J, \Delta_R)$ of unit Frobenius norm satisfying the constraints in (9). Let us denote
by $(\lambda_E, x_E)$ an eigenvalue/eigenvector pair of $E + \varepsilon\Delta_E$ associated with the smallest eigenvalue and $\|x_E\| = 1$;
by $(\lambda_R, x_R)$ an eigenvalue/eigenvector pair of $R + \varepsilon\Delta_R$ associated with the smallest eigenvalue, and $\|x_R\| = 1$;
if $n$ is even, by $(i\mu, x_J)$ an eigenvalue/eigenvector pair of $J + \varepsilon\Delta_J$, with $\mu > 0$ such that $i\mu$ is the eigenvalue with smallest positive imaginary part and $\|x_J\| = 1$,
if $n$ is odd, by $(0, x_J)$ an eigenvalue/eigenvector pair of $J + \varepsilon\Delta_J$ (this zero eigenvalue exists for all $\varepsilon$).
In the inner iteration, for any fixed $\varepsilon$ we compute a (local) minimizer of (9); the procedure is, however, different for even and odd $n$.
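To make the two-level structure concrete, the following self-contained sketch applies it to a deliberately simplified model problem (our own illustration under simplifying assumptions, not the algorithm of the paper): a single symmetric positive definite matrix $E$ stands in for the structured triple, the inner level minimizes $\frac{1}{2}\lambda_{\min}(E+\varepsilon\Delta)^2$ over symmetric $\Delta$ of unit Frobenius norm by an explicit-Euler projected gradient iteration, and the outer level bisects on $\varepsilon$. For this toy problem the smallest $\varepsilon$ driving $\lambda_{\min}$ to zero is known to equal $\lambda_{\min}(E)$, which serves as a check.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
B = rng.standard_normal((n, n))
E = B @ B.T + 0.5 * np.eye(n)            # symmetric positive definite test matrix

def inner(eps, steps=1500, h=0.5):
    """Inner level: projected gradient iteration minimizing
    0.5 * lambda_min(E + eps*D)^2 over symmetric D with ||D||_F = 1.
    Returns the smallest eigenvalue reached."""
    D = -np.eye(n) / np.sqrt(n)          # symmetric, unit Frobenius norm
    for _ in range(steps):
        w, V = np.linalg.eigh(E + eps * D)
        lam, x = w[0], V[:, 0]
        G = lam * np.outer(x, x)         # free gradient with respect to D
        W = G - np.sum(G * D) * D        # project onto tangent space at D
        D = D - h * W
        D = D / np.linalg.norm(D, 'fro') # stay on the unit Frobenius sphere
    return np.linalg.eigvalsh(E + eps * D)[0]

def outer(tol=1e-4):
    """Outer level: bisection for the smallest eps at which the inner
    minimization drives lambda_min to zero."""
    lo, hi = 0.0, float(np.linalg.eigvalsh(E)[-1])
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if inner(mid) > 1e-8:
            lo = mid
        else:
            hi = mid
    return hi

eps_star = outer()
lam_min = float(np.linalg.eigvalsh(E)[0])
print(eps_star, lam_min)   # for this toy problem the two values should nearly coincide
```

The inner iteration converges only to a local minimizer in general; for this toy problem the global minimizer is the rank-one matrix $-xx^{\mathsf T}$, so the bisection recovers $\lambda_{\min}(E)$ up to the tolerances used.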
The case that $n$ is odd
In this case the skew-symmetric matrix $J + \varepsilon\Delta_J$ always has a zero eigenvalue (with an associated real eigenvector $x_J$), so that the only contribution of $J + \varepsilon\Delta_J$ to the optimization is through the alignment of $x_J$ with $x_E$ and $x_R$. Hence, the functional to be minimized in (9) can be expressed in the simplified form (10).
It is, however, possible to include a further term in the functional, which does not change the solution but may have an impact on the conditioning of the problem and hence on the numerical performance.
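The key fact used here, that a real skew-symmetric matrix of odd dimension always has the eigenvalue zero with a real null vector, follows from $\det(J) = \det(-J^{\mathsf T}) = (-1)^n \det(J)$ and is easy to confirm numerically (arbitrary random example):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5                                  # odd state dimension
A = rng.standard_normal((n, n))
J = A - A.T                            # real skew-symmetric matrix

# det(J) = (-1)^n det(J) = -det(J) for odd n, so det(J) = 0:
# the matrix necessarily has the eigenvalue zero.
U, s, Vt = np.linalg.svd(J)
x = Vt[-1]                             # real null vector (smallest singular value ~ 0)

assert s[-1] < 1e-12                   # zero singular value up to rounding
assert np.linalg.norm(J @ x) < 1e-12   # x spans the kernel
```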
The case that $n$ is even
In this case, when the two eigenvalues $\pm i\mu$ ($\mu > 0$) coalesce at $0$, they form a semisimple double eigenvalue, and the associated eigenvectors $x_J$ and $\overline{x_J}$
form a two-dimensional nullspace spanned by the two real vectors $\mathrm{Re}(x_J)$ and $\mathrm{Im}(x_J)$. These can be assumed to be orthogonal to each other, i.e., $\mathrm{Re}(x_J)^{\mathsf T}\,\mathrm{Im}(x_J) = 0$, and to have the same norm, so that still $\|x_J\| = 1$. Using $\mathrm{Re}(x_J)$ and $\mathrm{Im}(x_J)$,
we define the real orthogonal matrix
and to satisfy the constraint in (9), we require that
This leads to the minimization of
Since the matrix is orthogonal, the solution can be given in closed form, and the functional to be minimized takes the form
which is positive if the vector does not lie in the range of the orthogonal matrix and zero otherwise.
In summary, the functional in the even case is given by (11).
In both the odd and the even case, the minimum over perturbations of Frobenius norm at most one coincides with the minimum over perturbations of unit Frobenius norm.
To see this, consider $\Delta$ of Frobenius norm less than or equal to one giving a minimizer of the left-hand side, and suppose that $(\lambda, x)$ is the minimizing eigenvalue/eigenvector pair. Then, choosing a matrix $Z$ with $Zx = 0$, for a suitable $\tau$ the matrix $\Delta + \tau Z$ is of unit Frobenius norm and leaves the minimal value unchanged.
Using the functionals (10), respectively (11), in our approach the local minimizer of the functional is determined as an equilibrium point of the associated gradient system. Note, however, that in general this may not be a global minimizer.
For the outer iteration we consider a continuous branch, as a function of $\varepsilon$, of the minimizers $\Delta(\varepsilon)$ and vary $\varepsilon$ iteratively in order to find the smallest solution of the scalar equation
obtained by setting the minimal value of the functional to zero.
Note that the techniques for the distance to higher index or to instability follow directly by setting $\Delta_J = 0$, i.e., not perturbing $J$.
3.2 Derivatives of eigenvalues and eigenvectors
The considered minimization is an eigenvalue optimization problem. We will solve this problem by integrating a differential(-algebraic) equation with trajectories that follow the gradient descent and satisfy further constraints. To develop such a method, we first recall a classical result for the derivative of a simple eigenvalue and an associated eigenvector of a matrix with respect to variations of the entries in a real parameter $t$. Here we use the notation $\dot{\lambda}$ to denote the derivative with respect to $t$. [25, Section II.1.1] Consider a continuously differentiable matrix-valued function $C : \mathbb{R} \to \mathbb{C}^{n,n}$, with $C(t)$ normal (i.e., $C(t)C(t)^* = C(t)^* C(t)$ for all $t$). Let $\lambda(t)$ be a simple eigenvalue of $C(t)$ for all $t$ and let $x(t)$ with $\|x(t)\| = 1$ be the associated (right and left) eigenvector. Then $\lambda(t)$ is differentiable with

$\dot{\lambda} = x^* \dot{C}\, x.$
For $\varepsilon > 0$ consider a perturbation matrix $\Delta_E(t)$ that depends on a real parameter $t$. By Lemma 3.2, for a simple eigenvalue $\lambda(t)$ of $E + \varepsilon\Delta_E(t)$ with associated eigenvector $x(t)$, $\|x(t)\| = 1$, we have (omitting the dependence on $t$)

$\dot{\lambda} = \varepsilon\, x^{\mathsf T} \dot{\Delta}_E\, x.$

Similarly, for the case that $n$ is even, if $i\mu(t)$ ($\mu(t) > 0$) is a simple eigenvalue of the matrix-valued function $J + \varepsilon\Delta_J(t)$, with associated eigenvector $x(t)$, $\|x(t)\| = 1$, then we have

$i\dot{\mu} = \varepsilon\, x^* \dot{\Delta}_J\, x.$
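The derivative formula for normal matrix paths can be checked by finite differences along a symmetric (hence normal) path; the data below are arbitrary and serve only as a sanity check:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A0 = rng.standard_normal((n, n)); A0 = A0 + A0.T   # symmetric (hence normal)
A1 = rng.standard_normal((n, n)); A1 = A1 + A1.T   # symmetric direction

def lam_min(t):
    """Smallest eigenvalue of the matrix path A(t) = A0 + t*A1."""
    return np.linalg.eigvalsh(A0 + t * A1)[0]

# Analytic derivative at t = 0: lambda' = x^T A'(t) x with A' = A1,
# where x is the unit eigenvector of A0 for the smallest eigenvalue.
w, V = np.linalg.eigh(A0)
x = V[:, 0]
deriv_analytic = x @ A1 @ x

# Central finite-difference approximation of the same derivative.
h = 1e-6
deriv_fd = (lam_min(h) - lam_min(-h)) / (2 * h)

assert abs(deriv_analytic - deriv_fd) < 1e-5
```

For generic random data the smallest eigenvalue is simple, as required by the lemma.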
To derive the gradient system associated with our optimization problem, we make use of the following definition. Let $M \in \mathbb{C}^{n,n}$ be a singular matrix with a simple zero eigenvalue. The group inverse (reduced resolvent) of $M$ is the unique matrix $M^{\#}$ satisfying

$M M^{\#} M = M, \qquad M^{\#} M M^{\#} = M^{\#}, \qquad M M^{\#} = M^{\#} M.$

It is well known that for a singular and normal matrix $M$ with simple eigenvalue zero, its group inverse is equal to the Moore–Penrose pseudoinverse $M^{\dagger}$. We have the following lemma. [35, Theorem 2] Consider a sufficiently often differentiable matrix function $C : \mathbb{R} \to \mathbb{C}^{n,n}$.
Let $\lambda(t)$ be a simple eigenvalue of $C(t)$ for all $t$ and let $x(t)$, with $\|x(t)\| = 1$, be the associated right eigenvector function. Moreover, let $M(t) = C(t) - \lambda(t) I$ and let $M(t)^{\#}$ be the group inverse of $M(t)$. Then $x(t)$ satisfies the system of differential equations

$\dot{x} = -M^{\#} \dot{C}\, x + \kappa\, x,$

with a scalar function $\kappa(t)$ determined by the normalization. Moreover, if $C(t)$ is pointwise normal, then one can take $\kappa = 0$ and $x^* \dot{x} = 0$.
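For a normal singular matrix with a simple zero eigenvalue, the coincidence of the group inverse with the Moore–Penrose pseudoinverse can be verified through the three defining relations (arbitrary example data):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
# Symmetric (hence normal) matrix with a simple zero eigenvalue.
M = Q @ np.diag([0.0, 1.0, 2.0, 3.0, 4.0]) @ Q.T

Mg = np.linalg.pinv(M)   # for normal M, the group inverse equals the pseudoinverse

# The three defining properties of the group inverse M^#:
assert np.allclose(M @ Mg @ M, M)        # M M# M = M
assert np.allclose(Mg @ M @ Mg, Mg)      # M# M M# = M#
assert np.allclose(M @ Mg, Mg @ M)       # M M# = M# M
```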
4 Gradient flow, odd state dimension
In this section we consider the case that the state dimension $n$ is odd and construct the gradient system optimization algorithm for the functional (10).
4.1 Computation of the gradient
Considering orthogonal projections with respect to the Frobenius inner product onto the matrix manifolds of symmetric and skew-symmetric matrices, we identify the constrained gradient directions of these terms,
respectively. (Here $\propto$ denotes proportionality.) In order to treat the other terms, we observe that
where $\Pi(x) = I - xx^{\mathsf T}$, $G_E$ is the pseudoinverse of $E + \varepsilon\Delta_E - \lambda_E I$, and $G_R$ is the pseudoinverse of $R + \varepsilon\Delta_R - \lambda_R I$.
Since $n$ is odd, which means that (generically) $0$ is a simple eigenvalue of $J + \varepsilon\Delta_J$, for the last term of (10) we have
where $G_J$ is the pseudoinverse of $J + \varepsilon\Delta_J$.
4.2 The gradient system of ODEs for the flow in the odd case
In order to compute the steepest descent direction, we compute the gradient of the functional and collect the summands involving $\Delta_E$ and $\Delta_J$, and those involving $\Delta_R$. Letting
where we have used the structural properties of $\Delta_E, \Delta_R$ (symmetric) and $\Delta_J$ (skew-symmetric), and the property that the Frobenius inner product of a real symmetric matrix and a real skew-symmetric matrix vanishes. Equation (4.2) identifies the gradient of the functional,
Since we want to impose a norm constraint on the perturbation, we need the following result. [Direction of steepest admissible ascent] Let $Z, \Delta \in \mathbb{R}^{n,n}$ with $\|\Delta\|_F = 1$. A solution of the optimization problem

$W_\star = \arg\min \{\, \langle Z, W \rangle \;:\; \|W\|_F = 1,\ \langle \Delta, W \rangle = 0 \,\}$

is given by

$\mu W_\star = -Z + \langle Z, \Delta \rangle\, \Delta,$   (22)

where $\mu$ is the Frobenius norm of the matrix on the right-hand side. The solution is unique if $Z$ is not a multiple of $\Delta$. The result follows on noting that the function to minimize is a real inner product on $\mathbb{R}^{n,n}$, and the real inner product with a given vector (which here is a matrix) is minimized over a subspace by orthogonally projecting the vector onto that subspace. The expression in (22) is the orthogonal projection of $-Z$ to the tangent space at $\Delta$ of the manifold of matrices of unit Frobenius norm.
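A quick numerical check of this lemma with arbitrary random data: the normalized tangent-space projection of $-Z$ is admissible and minimizes the Frobenius inner product with $Z$ among all admissible directions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
Z = rng.standard_normal((n, n))
D = rng.standard_normal((n, n))
D /= np.linalg.norm(D, 'fro')            # unit Frobenius norm (plays the role of Delta)

def frob(A, B):
    """Frobenius inner product of real matrices."""
    return np.trace(A.T @ B)

# Orthogonal projection of -Z onto the tangent space {W : <D, W> = 0},
# followed by normalization: the direction of steepest admissible descent.
W = -(Z - frob(Z, D) * D)
W /= np.linalg.norm(W, 'fro')

assert abs(frob(D, W)) < 1e-12           # tangent to the unit sphere at D
assert np.isclose(np.linalg.norm(W, 'fro'), 1.0)

# No random admissible direction descends faster than W.
for _ in range(100):
    V = rng.standard_normal((n, n))
    V -= frob(V, D) * D                  # enforce <D, V> = 0
    V /= np.linalg.norm(V, 'fro')
    assert frob(Z, W) <= frob(Z, V) + 1e-12
```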
Taking into consideration the projection, with respect to the Frobenius inner product, of the vector field onto the manifolds of symmetric and skew-symmetric matrices, this leads to the system of differential equations for the perturbation matrices
where, for brevity, the dependence on $t$ is omitted, and the scalar $\eta$
is used to ensure the norm conservation, i.e., $\|\Delta(t)\|_F = 1$.
Let $\Delta(t)$ of unit Frobenius norm satisfy the differential equation (23). If $\lambda(t)$ is a simple eigenvalue along the trajectory, then the value of the functional is monotonically non-increasing.
The result follows directly from the fact that (23) is a constrained gradient system.
In this way we have preserved the symmetry of $\Delta_E$ and $\Delta_R$ and the skew-symmetry of $\Delta_J$. It may happen, however, that along the solution trajectory of (23), due to the projection onto the matrix manifolds, the smallest eigenvalue of $E + \varepsilon\Delta_E$ and/or the smallest eigenvalue of $R + \varepsilon\Delta_R$ becomes negative. In this case the perturbed system is no longer a dissipative Hamiltonian system. This, however, is in general not an issue for the optimization algorithm, since the dynamical gradient system leads to eigenvalues $\lambda_E$, $\lambda_R$ of modulus as small as possible for a given $\varepsilon$, and thus drives them to zero as $\varepsilon$ approaches the sought distance, so that in the limiting situation the positive semidefiniteness of $E + \varepsilon\Delta_E$ and $R + \varepsilon\Delta_R$ also holds.
4.3 Stationary points of (23) and low-rank property
In this subsection we discuss the existence of stationary points of the solution trajectory of (23). Let $\varepsilon > 0$ be fixed and let $\Delta = (\Delta_E, \Delta_J, \Delta_R)$ be of unit Frobenius norm. Let $\lambda_E$ be a simple eigenvalue of $E + \varepsilon\Delta_E$ with associated normalized eigenvector $x_E$, let $\lambda_R$ be a simple eigenvalue of $R + \varepsilon\Delta_R$ with associated normalized eigenvector $x_R$, and let $0$ be a simple eigenvalue of $J + \varepsilon\Delta_J$ with associated normalized eigenvector $x_J$. Then, in the generic situation, i.e., if $\lambda_E \neq 0$, $\lambda_R \neq 0$, and the eigenvectors are pairwise not aligned, the right-hand sides of (23) for $\Delta_E$, $\Delta_J$, and $\Delta_R$ do not vanish.
The proofs for the three cases are similar.
Exploiting the property that $x_E^{\mathsf T} \dot{x}_E = 0$, we obtain the statement for $\Delta_E$: if the corresponding right-hand side of (23) vanished, then, since $\lambda_E \neq 0$, we would get a contradiction.
Exploiting the property that $x_R^{\mathsf T} \dot{x}_R = 0$, we obtain analogously the statement for $\Delta_R$: if the corresponding right-hand side vanished, we would again get a contradiction.
Having assumed the generic situation, $x_E$ and $x_R$ are not aligned with $x_J$. As a consequence the statement for $\Delta_J$ follows.
and that $\lambda_E$ is a simple eigenvalue of $E + \varepsilon\Delta_E$ with normalized eigenvector $x_E$, that $\lambda_R$ is a simple eigenvalue of $R + \varepsilon\Delta_R$ with associated eigenvector $x_R$, and that $J + \varepsilon\Delta_J$ has a null vector $x_J$.
Then the following are equivalent (here we omit the argument $t$):
$\dot{\Delta}_E = 0$, $\dot{\Delta}_J = 0$, $\dot{\Delta}_R = 0$;
$\Delta_E$ is a multiple of a rank-two matrix determined by $x_E$; $\Delta_J$ is a multiple of a rank-two matrix determined by $x_J$; $\Delta_R$ is a multiple of a rank-two matrix with the precise form given by (17).
We also have the following extremality property. Consider the functional (10) and suppose that $\varepsilon > 0$. Let $\Delta = (\Delta_E, \Delta_J, \Delta_R)$ with $\|\Delta\|_F = 1$. Let $\lambda_E$ be a simple eigenvalue of $E + \varepsilon\Delta_E$ with associated eigenvector $x_E$, let $\lambda_R$ be a simple eigenvalue of $R + \varepsilon\Delta_R$ with associated eigenvector $x_R$, and let $J + \varepsilon\Delta_J$ have a null vector $x_J$. Then the following are equivalent:
Every differentiable path $\Delta(\tau)$ (for small $\tau \geq 0$) with the properties that $\|\Delta(\tau)\|_F \leq 1$, that both $\lambda_E(\tau)$ and $\lambda_R(\tau)$ are simple eigenvalues of $E + \varepsilon\Delta_E(\tau)$ and $R + \varepsilon\Delta_R(\tau)$, with associated eigenvectors $x_E(\tau)$ and $x_R(\tau)$, respectively, and for which $x_J(\tau)$ is the null vector of $J + \varepsilon\Delta_J(\tau)$, satisfies that the functional does not decrease to first order at $\tau = 0$;
The matrix $\Delta_E$ is a multiple of a rank-two matrix determined by $x_E$, $\Delta_J$ is a multiple of a rank-two matrix determined by $x_J$, and $\Delta_R$ is a multiple of a rank-two matrix with the precise form given by (17).
Lemma 4.3 ensures that the genericity assumptions are satisfied.
Assume that (i) does not hold. Then there exists a path through $\Delta$ along which the functional decreases. The steepest descent gradient property shows that the solution path of (23) passing through $\Delta$ is such a path.
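The rank-two structure of optimal perturbations can be illustrated with the classical minimal-norm construction for a single symmetric matrix: the smallest symmetric perturbation mapping a prescribed unit vector into the kernel has rank at most two. This is our own illustration of the mechanism, not the paper's formula (17):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
S = rng.standard_normal((n, n)); S = S + S.T    # arbitrary symmetric matrix
x = rng.standard_normal(n); x /= np.linalg.norm(x)

# Minimal Frobenius-norm symmetric perturbation with (S + Delta) x = 0:
# Delta = -(S x x^T + x x^T S) + (x^T S x) x x^T, of rank at most two.
Sx = S @ x
Delta = -(np.outer(Sx, x) + np.outer(x, Sx)) + (x @ Sx) * np.outer(x, x)

assert np.allclose(Delta, Delta.T)              # symmetry is preserved
assert np.linalg.norm((S + Delta) @ x) < 1e-12  # x is a null vector of S + Delta
assert np.linalg.matrix_rank(Delta) <= 2        # rank-two perturbation
```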
4.4 Sparsity preservation
If the matrices $E$, $J$, and $R$ have a given sparsity pattern, then we may include as a constraint that the perturbations do not alter the sparsity structure. In terms of the Frobenius norm, it is immediate to obtain the constrained gradient system. Denoting by $\Pi^E$, $\Pi^J$, and $\Pi^R$, respectively, the orthogonal projections onto the manifolds of sparse matrices with the given sparsity pattern and the structure of $E$, $J$, and $R$, we get the corresponding projected system of differential equations.
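With respect to the Frobenius inner product, projecting onto a prescribed symmetric sparsity pattern together with the symmetry structure amounts to entrywise masking combined with (skew-)symmetrization. A sketch under these assumptions (the projector names are ours):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5
# Symmetric sparsity pattern (mask), here tridiagonal as an example.
P = (np.abs(np.subtract.outer(np.arange(n), np.arange(n))) <= 1).astype(float)

def proj_sym_sparse(A):
    """Orthogonal (Frobenius) projection onto symmetric matrices with
    sparsity pattern P; P itself is assumed symmetric."""
    return P * (A + A.T) / 2

def proj_skew_sparse(A):
    """Orthogonal projection onto skew-symmetric matrices with pattern P."""
    return P * (A - A.T) / 2

A = rng.standard_normal((n, n))
X = proj_sym_sparse(A)

# Projection properties: idempotence, and Frobenius-orthogonality of the
# residual A - X to every matrix in the constraint subspace.
assert np.allclose(proj_sym_sparse(X), X)
Y = proj_sym_sparse(rng.standard_normal((n, n)))
assert abs(np.sum((A - X) * Y)) < 1e-12
```

Since masking by a symmetric pattern and symmetrization commute, their composition is the orthogonal projection onto the intersection of the two subspaces.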
Having derived these formulas, in the next subsection we illustrate the properties of the optimization procedure with a numerical example.
4.5 A numerical example
Let $n$ be odd and consider the randomly generated matrices
Running the two-level iteration with an initial value of the functional, we find a perturbation at a distance (rounded to four digits) with a common null space given by the vector
and the computed perturbations are given by