Exponential Condition Number of Solutions of the Discrete Lyapunov Equation

03/11/2018
by Andrew Mullhaupt, et al.

The condition number of the $n \times n$ matrix $P$ is examined, where $P$ solves the discrete Lyapunov equation $P = APA^* + BB^*$ and $B$ is an $n \times d$ matrix. Lower bounds on the condition number, $\kappa$, of $P$ are given when $A$ is normal, a single Jordan block, or in Frobenius form. The bounds show that the ill-conditioning of $P$ grows exponentially in $n/d \gg 1$. These bounds are related to the condition number of the transformation that takes $A$ to input normal form. A simulation shows that $P$ is typically ill-conditioned in the case of $n \gg 1$ and $d = 1$. When $A_{ij}$ has an independent Gaussian distribution (subject to restrictions), we observe that $\kappa(P)^{1/n} \approx 3.3$. The effect of auto-correlated forcing on the conditioning of state space systems is examined.



I Introduction

In system identification, one needs to solve linear algebraic systems $Px = b$, where $x$ and $b$ are $n$-vectors and $P$ is the controllability Grammian, i.e., $P$ is the positive definite matrix that solves

(I.1)   $P = APA^* + BB^*$.

Equation (I.1) is known as the discrete Lyapunov equation and is more properly called Stein's equation. In (I.1), the $n \times n$ matrix $A$ and the $n \times d$ matrix $B$ are given. The matrix $A$ is known as the state advance matrix and the matrix $B$ is known as the input matrix. Together, $(A,B)$ is known as an input pair. We assume that $A$ is stable and that $(A,B)$ is controllable. In this case, there is a unique selfadjoint solution of (I.1) and it is positive definite [18]. We denote the solution of (I.1) as a function of $A$ and $B$ by $P(A,B)$.
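As a concrete illustration (not from the original paper), here is a minimal Python sketch that builds a small random input pair, solves (I.1) with SciPy's generic dense solver, and reports $\kappa(P)$; the $1/\sqrt{n+1}$ normalization anticipates the ensemble of Section II.

```python
# Minimal sketch: solve the Stein equation (I.1) and compute kappa(P).
# Uses SciPy's dense solver; the paper's preconditioned square-root
# doubling method [25] is needed for severely ill-conditioned cases.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(0)
n, d = 8, 1
while True:
    A = rng.standard_normal((n, n)) / np.sqrt(n + 1)   # heuristic stabilizing scale
    if np.max(np.abs(np.linalg.eigvals(A))) < 1.0:     # keep only stable A
        break
B = rng.standard_normal((n, d))

P = solve_discrete_lyapunov(A, B @ B.T)                # P = A P A^* + B B^*
print("residual:", np.linalg.norm(P - A @ P @ A.T - B @ B.T))
print("kappa(P):", np.linalg.cond(P))                  # sigma_1(P)/sigma_n(P)
```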

We study the condition number of $P$, $\kappa(P) = \sigma_1(P)/\sigma_n(P)$, where $\sigma_1(P)$ is the largest singular value of $P$ and $\sigma_n(P)$ is the smallest. We consider cases where the system input dimension, $d$, is smaller than the state space dimension, $n$. In this case, we claim that the condition number of $P$ can be exponentially large in $n/d$. Since the case $d \ll n$ is common in signal processing and system identification, our results put strong limitations on the applicability of high order arbitrary state space realizations.

A number of bounds on either $\kappa(P)$ or the eigenvalues of $P$ exist in the literature [20, 15, 16, 10, 19, 33]. Many of these bounds require additional restrictions on $A$ to be nontrivial. Theorem 2.2 of [33] can be used to bound the ratio of the singular values of $P$. (See also [32].) The existing bounds on $\kappa(P)$ generally make no assumptions on $A$ and therefore tend to be weak or hard to evaluate. If $A$ is real, symmetric, and stable, Penzl [30] gives a bound which we describe in Section V. For the continuous time case, interesting results on the condition number may be found in [2].

Our lower bounds on $\kappa(P)$ are for specific, commonly considered classes of input pairs, $(A,B)$, such as companion matrices and normal matrices, and the case when $A$ is a single Jordan block.

Our results are based on transforming the input pair, $(A,B)$, into input normal (IN) form. Input normal form implies that the identity matrix solves the discrete Lyapunov equation. Input normal pairs have special representations that allow for fast matrix-vector operations when $A$ is a Hessenberg matrix [22, 23, 31].

In [27], a numerical simulation shows that input normal filters perform well in the presence of autocorrelated noise. We examine the condition number of the controllability Grammian when the forcing term is autocorrelated. We derive a bound that explains the good performance of IN filters [27]. Other advantages of IN filters are described in [31, 28, 26].

The condition number of $P$ is related to two other well-known problems: a) the distance of an input pair to the set of uncontrollable pairs [8] and b) the sensitivity of the solution $P$ to perturbations in $(A,B)$ [9, 10]. It is well known that these quantities are related [11]. Thus the distance to uncontrollability can be bounded below in terms of the sensitivity of the discrete Lyapunov equation. Our results indicate that this distance is typically exponentially small in $n/d$.

We present numerical simulations which compute the distribution of $\kappa(P)$ for several classes of input pairs, $(A,B)$. When the elements of $A$ and $B$ are independently distributed as Gaussians (subject to the normalization below), our simulation shows that the ensemble average of $\ln \kappa(P)/n$ tends to a constant for $n \gg 1$. We observe that $\ln \kappa(P)$ is approximately Gaussian. These numerical results indicate that the ill-conditioning problems of $P$ are probably generic when $n/d \gg 1$. To accurately solve (I.1), we use a novel QR iteration to precondition (I.1) and then apply a square root version of the doubling method [3].

Section II presents our computation of $\kappa(P)$ for an ensemble of stable controllable input pairs. Section III defines IN form and presents new results on the properties of IN pairs. Section IV gives lower bounds on the condition number based on the transformation to an IN matrix and applies the bound to the case when $A$ is normal. Section V gives abstract bounds based on the ADI iteration. Section VI gives lower bounds when $A$ is in companion form. Sections VII and IX give additional bounds for normal $A$. Section VIII gives bounds when $A$ is a single Jordan block. Section X examines condition numbers of the state covariance when the system is forced with colored noise.

Notation: Here $A$ is an $n \times n$ matrix with eigenvalues $\lambda_i$ ordered as $|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|$ and singular values $\sigma_1 \ge \cdots \ge \sigma_n$. Depending on context, $\lambda$ is the $n$-vector of eigenvalues or the corresponding diagonal matrix. The matrix $B$ has dimension $n \times d$. When $A$ is stable and $(A,B)$ is controllable, we say that the input pair $(A,B)$ is CS. If $A$ is also invertible, we say $(A,B)$ is CIS. For us, "stable" means $|\lambda_i| < 1$ for all $i$, sometimes known as strict exponential stability. We let $\mathcal{S}_{n,d}$ denote the set of stable, controllable input pairs of dimension $n$ and $d$. The identity matrix is denoted $I$. The transformation matrices, denoted by $T$, have dimension $m \times n$ and rank $m$, where $m \le n$. The Moore-Penrose inverse of $T$ is denoted by $T^+$. Here $\|\cdot\|_F$ is the Frobenius norm while $\|\cdot\|$ is any unitarily invariant matrix norm.

II Generic Condition Number

We begin by examining the probability distribution of the condition number of $P(A,B)$ as $A$ and $B$ are varied over a probability distribution on stable, controllable input pairs. We limit ourselves to single input pairs ($d = 1$).

A common class of random matrices is $A = G/\sqrt{n}$, where the $G_{ij}$ are independent unit Gaussians. The Girko law [6] states that the eigenvalues of such random matrices are uniformly distributed on the complex unit disk as $n \to \infty$. For finite $n$, the distribution of eigenvalues is given in Theorem 6.2 of [6]. We exclude unstable and uncontrollable pairs in our studies. We normalize by $\sqrt{n+1}$ instead of $\sqrt{n}$ to improve the odds of obtaining stable $A$. Specifically, we define the distribution:

Definition II.1

Let $\mu_{CS}$ be the probability measure induced on $\mathcal{S}_{n,1}$ by letting the matrix elements of $\sqrt{n+1}\,A$ and of $B$ have independent unit Gaussian distributions, subject to the CS restriction.

Each probability distribution on input pairs induces a distribution of $\kappa(P)$ and $\ln \kappa(P)$. We simulate the induced distribution by solving the discrete Lyapunov equation for 2,500 pairs chosen from the distribution $\mu_{CS}$.

Inaccurate numerics will tend to underestimate $\kappa(P)$. Even for moderate $n$, these systems can be so ill-conditioned that existing numerical methods inaccurately determine the condition number. Therefore, we developed new numerical algorithms for the solution of (I.1) [25]. To solve the discrete Lyapunov equation, we use a novel square root version of the doubling method. For ill-conditioned problems, we find that preconditioning the discrete Lyapunov equation is important to accurately evaluate the condition number of $P$ [25].
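A rough Monte Carlo version of this experiment can be sketched as follows. Note that it substitutes SciPy's generic dense solver for the paper's preconditioned square-root doubling method [25], so it is only trustworthy for small $n$, and the sample count is reduced here.

```python
# Sketch of the Section II experiment: quantiles of ln kappa(P) under mu_CS.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def sample_cs_pair(n, rng):
    """Draw (A, B) from mu_CS: Gaussian entries scaled by 1/sqrt(n+1),
    rejecting unstable A. Controllability holds with probability one."""
    while True:
        A = rng.standard_normal((n, n)) / np.sqrt(n + 1)
        if np.max(np.abs(np.linalg.eigvals(A))) < 1.0:
            return A, rng.standard_normal((n, 1))

rng = np.random.default_rng(1)
for n in (8, 16, 24):
    logk = []
    for _ in range(500):                       # the paper uses 2,500 pairs
        A, B = sample_cs_pair(n, rng)
        P = solve_discrete_lyapunov(A, B @ B.T)
        logk.append(np.log(np.linalg.cond(P)))
    print(n, np.percentile(logk, [25, 50, 75]))
```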

Table 1 gives the quantile distribution of $\ln \kappa(P)$ as a function of $n$ for our numerical simulation with $(A,B)$ distributed as $\mu_{CS}$. The median condition number scales exponentially, with $\kappa(P)^{1/n} \approx 3.3$. The interquartile distance of $\ln \kappa(P)$ is approximately independent of $n$ with a value of roughly 4. (The interquartile distance is the distance between the 75th percentile and the 25th percentile and is a measure of the width of the distribution.) If the distribution were normal, the interquartile distance would be roughly 1.35 standard deviations. We plotted the quantiles of $\ln \kappa(P)$ and of $\kappa(P)$ versus the quantiles of the Gaussian distribution. These quantile-quantile plots show that $\ln \kappa(P)$ has an approximately Gaussian distribution and that $\kappa(P)$ has wide tails. Naturally, the tails of the empirical distribution are more poorly determined than the median and the quartiles.

n     1%     10%    25%    50%    75%    90%    99%
8     6.23   8.22   9.62   11.3   13.4   15.8   20.2
16    14.2   16.5   18.2   20.3   22.5   24.9   30.4
24    22.3   25.0   26.7   28.7   31.1   33.6   38.0

Table 1: Quantile distribution of $\ln \kappa(P)$ for $(A,B)$ distributed as $\mu_{CS}$, as a function of $n$.

In [7], it is shown that a random Gaussian matrix, $A$, has $E[\ln \kappa(A)] \approx \ln n$, where $E$ denotes the expected value. Thus $P$ is typically much more poorly conditioned than $A$ is. In [34], it is shown that a random lower triangular matrix, $L_n$, has $\kappa(L_n)^{1/n} \to 2$ with probability tending to 1 as $n \to \infty$. For the median value of $\kappa(P)$ in our computation, the Cholesky factor, $L$, of $P$ scales as $\kappa(L)^{1/n} \approx 1.8$, which is nearly as badly conditioned as the random triangular matrices of [34].

Table 1 displays results for $n \le 24$. Empirically, we observe that the condition number grows at least exponentially in $n$. In Section IV, we derive a lower bound for the condition number when $A$ is normal. We apply this bound to each matrix in our simulation. Table 3 shows that the actual condition number is worse than the normal bound by a factor of roughly 100 on average.

III Input Normal Pairs

In examining the condition number of solutions of the discrete Lyapunov equation, it is natural to begin with input pairs that admit solutions with condition number one.

Definition III.1

An input pair, $(A,B)$, is input normal (IN) of grade $d$ if and only if $A$ is stable, $\operatorname{rank}(B) = d$, and

(III.1)   $AA^* + BB^* = I$.

A matrix, $A$, is an IN matrix of grade $d$ if and only if there exists an $n \times d$ matrix $B$ such that $(A,B)$ is an IN pair. If $A$ is lower (upper) triangular as well, $(A,B)$ is a triangular input normal pair. If $A$ is Hessenberg as well, $(A,B)$ is a Hessenberg input normal pair.

In [31], 'input normal pairs' are called orthogonal filters. In [21], 'input normal' has a more restrictive definition than (III.1), with the additional requirement that the observability Grammian is diagonal. In our definition of 'input normal', we do not impose any such condition on the observability Grammian. We choose this language so that 'normal' denotes restrictions on only one Grammian while 'balanced' denotes simultaneous restrictions on both Grammians [17, 21]. This usage is consistent with the definitions of [5]. Input normal matrices are generally not normal matrices.

By Theorem 2.1 of [1], if the controllability Grammian is positive definite, then the input pair is stable. In [29], Ober shows that stability plus a positive definite solution to the discrete Lyapunov equation, (I.1), implies that the input pair is controllable. Thus for IN pairs, stability is equivalent to controllability. We now show that any CS input pair may be transformed to an IN pair.

Theorem III.2

[31] Every stable, controllable input pair $(A,B)$ is similar to an input normal pair $(L^{-1}AL,\, L^{-1}B)$ with $L$ lower triangular.

Proof:  The unique solution of (I.1), $P$, is strictly positive definite [18]. Let $L$ be the unique Cholesky lower triangular factor of $P$ with positive diagonal entries, $P = LL^*$. We set $T = L^{-1}$, $\tilde A = TAT^{-1}$, and $\tilde B = TB$; then $\tilde A \tilde A^* + \tilde B \tilde B^* = T(APA^* + BB^*)T^* = TPT^* = I$.
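The construction in this proof is directly implementable. The sketch below (with a hypothetical helper name) transforms a CS pair to IN form and verifies (III.1); for badly conditioned $P$, the Cholesky factor $L$ is itself ill-conditioned, which is exactly the phenomenon quantified in Section IV.

```python
# Sketch of Theorem III.2: Cholesky-based transformation to input normal form.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_triangular

def to_input_normal(A, B):
    P = solve_discrete_lyapunov(A, B @ B.T)
    L = np.linalg.cholesky(P)                       # P = L L^*
    A_t = solve_triangular(L, A @ L, lower=True)    # A_tilde = L^{-1} A L
    B_t = solve_triangular(L, B, lower=True)        # B_tilde = L^{-1} B
    return A_t, B_t

rng = np.random.default_rng(2)
n = 8
while True:
    A = rng.standard_normal((n, n)) / np.sqrt(n + 1)
    if np.max(np.abs(np.linalg.eigvals(A))) < 1.0:
        break
B = rng.standard_normal((n, 1))
A_t, B_t = to_input_normal(A, B)
# Check input normality (III.1): A_tilde A_tilde^* + B_tilde B_tilde^* = I
print(np.linalg.norm(A_t @ A_t.T + B_t @ B_t.T - np.eye(n)))
```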

Using the singular value decomposition, we have the following characterization of IN matrices:

Theorem III.3

Let $A$ be a stable matrix with $\|A\|_2 \le 1$, and let $d$ equal the number of singular values of $A$ less than 1 ($\sigma_{n-d+1} < 1 = \sigma_{n-d}$). There is an $n \times d$ matrix $B$ with $BB^* = I - AA^*$ such that $(A,B)$ is an IN pair, and therefore $A$ is an IN matrix. The $d$ smallest singular values of $A$ satisfy $\sigma_{n-d+i}(A)^2 = 1 - \sigma_i(B)^2$, where the $\sigma_i(B)^2$ are the eigenvalues of $BB^*$.

Proof:  Let $u_i$ be the $i$th left singular vector of $A$ and define $B$ so that $BB^* = \sum_{i=n-d+1}^{n} (1 - \sigma_i(A)^2)\, u_i u_i^*$. This constructs $AA^* + BB^* = I$. The singular value identity follows from $BB^* = I - AA^*$.

For input normal pairs, this yields the bound $\sigma_n(A)^2 = 1 - \sigma_1(B)^2 \ge 1 - \|B\|_F^2$.

There are many similar input normal pairs since if $(A,B)$ is IN, then so is $(UAU^*, UB)$ for any orthogonal $U$. This additional freedom may be used to simplify the input pair representation [31, 23, 24].

IV Condition Number Bounds and the Transformation to Input Normal Pairs

In this section, we derive lower bounds on the condition number of $P$. Our bounds are based on transforming $(A,B)$ to an IN pair. The following lemma describes the transformation of solutions under a linear change of coordinates.

Lemma IV.1

Let $T$ be an $m \times n$ matrix of rank $m$ with $m \le n$ and let the rows of $T$ be a basis for a left-invariant subspace of $A$. Define $\hat A$ by $\hat A T = TA$. Let $\|\cdot\|$ be a unitarily invariant matrix norm and let $f$ be an analytic function on the spectrum of $A$. Then $\|f(\hat A)\| \le \kappa(T)\,\|f(A)\|$. When $T$ is invertible, $\|f(A)\| \le \kappa(T)\,\|f(\hat A)\|$ as well. Also $\|f(\hat A)\,e_1\| \le \kappa(T)\,\|f(A)\|$, where $e_1$ is the unit vector in the first coordinate.

Proof:  Note $f(\hat A)\,T = T\,f(A)$ since $\hat A T = TA$. We apply the bound $\|XYZ\| \le \sigma_1(X)\,\|Y\|\,\sigma_1(Z)$ [13, p. 211] to $f(\hat A) = T f(A) T^+$. When $T$ is invertible, so is the change of coordinates, and $f(A) = T^{-1} f(\hat A) T$ gives the reverse bound. To bound $\|f(\hat A)\,e_1\|$, we use the bound $\|XY\| \ge \sigma_n(X)\,\|Y\|$ [13, p. 206].

A related result in [13, p. 162] is

(IV.1)   $\kappa(f(A)) \;\le\; \kappa(V)^2 \, \frac{\max_i |f(\lambda_i)|}{\min_i |f(\lambda_i)|}$

for invertible $V$ with $A = V \Lambda V^{-1}$ and nonvanishing $f(\lambda_i)$.

When $m = n$, $T$ is invertible and $\hat A$ is similar to $A$: $\hat A = TAT^{-1}$. In this case ($m = n$), we can reverse the roles of $A$ and $\hat A$ in the bounds as well. The case $m < n$ is of interest in model reduction problems, where one projects a system onto a left invariant subspace of $A$.

In the remainder of this section, we use $m = n$ and $T = L^{-1}$, where $L$ is the Cholesky factor of $P$. When $\hat A$ is input normal, we have the following bound for the condition number of the transformation of a stable matrix to input normal form.

Theorem IV.2

Let $A$ be stable and invertible and let $\tilde A = TAT^{-1}$ be an input normal matrix of grade $d$, where $T$ is an invertible matrix, then

(IV.2)   $\kappa(T) \;\ge\; \sigma_n \Bigl(\prod_{i=1}^n |\lambda_i|\Bigr)^{-1/d},$

where the $\lambda_i$ are the eigenvalues of $A$ and $\sigma_1$ and $\sigma_n$ are the largest and smallest singular values of $A$. For $d = 1$, $\kappa(T) \ge \sigma_n \prod_{i=1}^n |\lambda_i|^{-1}$.

Proof: By Theorem III.3, $\sigma_n(\tilde A)^d \le \prod_{i=1}^n |\lambda_i|$ and $\sigma_n(\tilde A) \ge \sigma_n/\kappa(T)$.

Note that Theorem IV.2 does not depend on any specific input matrix $B$.

Corollary IV.3

Let $(A,B)$ be a CIS input pair, then the condition number of $P(A,B)$ satisfies the equality $\kappa(P) = \kappa(T)^2$, where $T$ and $\tilde A$ are defined as in Theorem IV.2.

Proof:  The unique solution of (I.1), $P$, is strictly positive definite [18]. Let $L$ be the Cholesky factor of $P$: $P = LL^*$, and set $T = L^{-1}$. Note $\kappa(P) = \kappa(L)^2 = \kappa(T)^2$.

For normal advance matrices, $\sigma_n(A) = |\lambda_n|$, the smallest eigenvalue of $A$ in magnitude. This simplifies Corollary IV.3.

Theorem IV.4

Let $A$ be a normal matrix and $(A,B)$ be a CIS input pair, then the condition number of $P$ satisfies the bound

(IV.3)   $\kappa(P) \;\ge\; \frac{|\lambda_n|^2}{\sigma_n(\tilde A)^2} \;\ge\; |\lambda_n|^2 \Bigl(\prod_{i=1}^n |\lambda_i|\Bigr)^{-2/d},$

where $\lambda_n$ is the eigenvalue of $A$ with the smallest magnitude and $\tilde A$ is the IN matrix generated in the map defined in the proof of Corollary IV.3. For $d = 1$, the lower bound simplifies to $\kappa(P) \ge \bigl(|\lambda_n| \prod_{i=1}^n |\lambda_i|^{-1}\bigr)^2$.
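The $d = 1$ form of (IV.3) (as reconstructed above) is easy to evaluate numerically against the ensemble of Definition IV.5; the sketch below draws one normal (diagonal) pair and compares $\ln \kappa(P)$ with the bound.

```python
# Compare kappa(P) with the normal-matrix bound (IV.3), d = 1 case.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(3)
n = 8
while True:   # eigenvalue n-tuple of a Gaussian matrix, conditioned on stability
    lam = np.linalg.eigvals(rng.standard_normal((n, n)) / np.sqrt(n + 1))
    if np.max(np.abs(lam)) < 1.0:
        break
A = np.diag(lam)                                  # normal (diagonal) advance matrix
b = rng.standard_normal((n, 1)).astype(complex)

P = solve_discrete_lyapunov(A, b @ b.conj().T)
bound = (np.min(np.abs(lam)) / np.prod(np.abs(lam))) ** 2
print("ln kappa(P):", np.log(np.linalg.cond(P)))
print("ln bound   :", np.log(bound))
```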

We compare this bound to the condition number of $P$ for an ensemble of input pairs where $A$ is a normal matrix, i.e. $A$ has orthogonal eigenvectors. We need to select a distribution on the set of eigenvalue $n$-tuples. A natural choice is the distribution induced by the random distribution of $A$ given in Definition II.1.

Definition IV.5

$\mathcal{N}_{n,1}$ is the set of CS input pairs, $(A,B)$, where $A$ is diagonal. Let $\mu_N$ be the probability measure induced by giving the eigenvalue $n$-tuple of $A$ the distribution it has under Definition II.1 and letting $B$ have the unit Gaussian distribution, subject to the controllability restriction.

n     1%     10%    25%    50%    75%    90%    99%
8     7.27   9.25   10.8   12.4   14.6   16.7   21.7
16    15.0   17.7   19.4   21.3   23.6   26.1   31.6
24    23.2   25.9   27.7   29.8   32.1   34.5   39.3

Table 2: Quantiles of $\ln \kappa(P)$ as a function of $n$ for $\mu_N$. Note that $\ln \kappa(P)$ has an approximately Gaussian distribution.

As seen in Table 2, our numerical computations show that the distribution of $\ln \kappa(P)$ for the normal matrices is virtually identical to that of our general random matrices. Again, $\ln \kappa(P)/n$ is approximately constant, with the median condition number scaling as $\kappa(P)^{1/n} \approx 3$. The interquartile distance is again nearly independent of $n$ with a value of roughly 4.

n     1%     10%    25%    50%    75%    90%    99%
8     1.52   2.68   3.54   4.83   6.55   8.60   13.4
16    2.19   3.45   4.51   5.86   7.57   9.72   14.8
24    2.83   3.97   4.90   6.27   8.10   10.1   15.3

Table 3: Quantiles of $\ln \kappa(P) - \ln(\mathrm{bound})$ as a function of $n$, where the bound is that of Theorem IV.4, evaluated for each input pair.

Table 3 compares $\ln \kappa(P)$ versus our theoretical bound. The discrepancy is growing only slightly in $n$, in contrast to $\ln \kappa(P)$, which is growing linearly in $n$. A regression indicates that the median value of the discrepancy grows sublinearly in $n$. Plotting the quantiles of the discrepancy as a function of $n$ shows that the residual error is a weakly decreasing function of $n$. We also observe that the spread of $\ln \kappa(P) - \ln(\mathrm{bound})$ is almost independent of $n$, perhaps indicating a heuristic model $\ln \kappa(P) \approx \ln(\mathrm{bound}) + X$, where the random variable $X$ barely depends on $n$. To model the long tails of $\kappa(P)$, an analogous model for $\kappa(P)/\mathrm{bound}$ is probably called for. We have also compared the normal bound with the log-condition number for the ensemble of random matrices in Section II. Surprisingly, the agreement with the bound is even better in this case. However, there are many cases where the condition number of a random input pair is smaller than the bound for normal matrices predicts.

The bound (IV.3) indicates that $P$ can be quite ill-conditioned. Theorems IV.2-IV.4 do not use any property of $B$ (except controllability) nor of the complex phases of the eigenvalues, $\lambda_i$. Including this information in the bounds can only sharpen the lower bound. We believe that a significant fraction of the ill-conditioning that is not explained by our bound arises from using a random $B$. We could replace the quantile tables with analogous ones for the $B$-averaged condition number. If we did, we would see that our bounds better describe this quantity than the average value of $\kappa(P)$.

V Alternating Direction Iteration Bounds

In this section we present condition number bounds based on the alternating direction implicit (ADI) iteration for the solution of the continuous time Lyapunov equation. These results were formulated by T. Penzl in [30]. We restate his results in a more general context.

The results for the discrete Lyapunov equation follow by applying the bilinear transform. We define $F = (A - I)(A + I)^{-1}$ and $G = \sqrt{2}\,(A + I)^{-1}B$. The Cayley transform $z \mapsto (z-1)/(z+1)$ maps the unit disk to the left half plane, so $F$ is stable in the continuous time sense. The solution $P$ of the discrete Lyapunov equation (I.1) for $(A,B)$ satisfies the Lyapunov equation

(V.1)   $FP + PF^* + GG^* = 0.$

Following [30], we define the shifted ADI iteration on (V.1). To approximately solve (V.1), we let $P_0 = 0$ and define $P_j$ by

(V.2)   $(F + p_j I)\,P_{j-1/2} = -GG^* - P_{j-1}(F^* - p_j I), \qquad (F + p_j I)\,P_j^* = -GG^* - P_{j-1/2}^*(F^* - p_j I),$

where the $p_j$ are the shift parameters. Using the methodology of [30], we have the following bound:
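The following sketch implements the two half-steps of (V.2) densely, with the simple geometric shift rule for symmetric stable $F$ mentioned after Theorem V.2; it illustrates the iteration and is not Penzl's full low-rank algorithm.

```python
# Dense ADI sketch for (V.1): F P + P F^* + G G^* = 0, real shifts p_j < 0.
import numpy as np

def adi(F, G, shifts):
    n = F.shape[0]
    P = np.zeros((n, n))
    I = np.eye(n)
    for p in shifts:
        R = -G @ G.T - P @ (F.T - p * I)
        P_half = np.linalg.solve(F + p * I, R)       # first half-step
        R = -G @ G.T - P_half.T @ (F.T - p * I)
        P = np.linalg.solve(F + p * I, R).T          # second half-step
    return P

rng = np.random.default_rng(4)
n = 20
M = rng.standard_normal((n, n))
F = -(M @ M.T + n * np.eye(n)) / n                   # symmetric, stable
G = rng.standard_normal((n, 1))
ev = -np.linalg.eigvalsh(F)                          # spectrum of -F, in [a, b]
a, b = ev.min(), ev.max()
J = 8
shifts = -a * (b / a) ** ((2 * np.arange(J) + 1) / (2 * J))  # geometric shifts
P = adi(F, G, shifts)
print("Lyapunov residual:", np.linalg.norm(F @ P + P @ F.T + G @ G.T))
```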

Theorem V.1

In the ADI iteration of (V.2), let $P_0 = 0$. Then $P_J$ has rank at most $Jd$ and satisfies the approximation bound:

(V.3)   $P - P_J = r_J(F)\,P\,r_J(F)^*, \qquad r_J(z) = \prod_{j=1}^{J} \frac{z - p_j}{z + p_j},$

where $P$ solves (V.1). Let $F$ have a complete set of eigenvectors and the eigenvalue decomposition $F = V \Lambda V^{-1}$, then

(V.4)   $\frac{\|P - P_J\|}{\|P\|} \;\le\; \kappa(V)^2 \max_i |r_J(\lambda_i(F))|^2 .$

We define $\rho_J = \min_{p_1,\ldots,p_J} \max_i |r_J(\lambda_i(F))|$. Thus $\rho_J$ gives the best bound of the type in (V.3). The difficulty in using Theorem V.1 is finding good shifts that come close to achieving $\rho_J$. There are algorithms for selecting shifts, but only rarely have explicit upper bounds on $\rho_J$ been given. Penzl simplified this bound for the case of real, symmetric, stable $F$:

Theorem V.2 ([30])

Let $F$ be real symmetric and stable, and define $\kappa = \kappa(F)$. Then

(V.5)   $\frac{\sigma_{Jd+1}(P)}{\sigma_1(P)} \;\le\; \prod_{k=0}^{J-1} \left( \frac{\kappa^{(2k+1)/(2J)} - 1}{\kappa^{(2k+1)/(2J)} + 1} \right)^2 .$

Penzl's proof is based on using a geometric sequence of shifts on the interval containing the eigenvalues of $F$. It is difficult to determine when the bound (V.5) is stronger or weaker than the bounds in Sections IV and IX since (V.5) is independent of the precise distribution of eigenvalues while (IV.3) uses the exact eigenvalues.
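Penzl's bound is straightforward to evaluate. The sketch below compares the actual singular-value decay $\sigma_{Jd+1}(P)/\sigma_1(P)$ with the right-hand side of (V.5) for a random symmetric stable $F$.

```python
# Evaluate Penzl's decay bound (V.5) against the actual decay of P.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(5)
n, d = 30, 1
M = rng.standard_normal((n, n))
F = -(M @ M.T + n * np.eye(n)) / n                 # symmetric, stable
G = rng.standard_normal((n, d))
P = solve_continuous_lyapunov(F, -G @ G.T)         # F P + P F^* = -G G^*

s = np.linalg.svd(P, compute_uv=False)
ev = np.abs(np.linalg.eigvalsh(F))
kappa_F = ev.max() / ev.min()
for J in (1, 2, 4, 8):
    k = np.arange(J)
    t = kappa_F ** ((2 * k + 1) / (2 * J))
    bound = np.prod(((t - 1) / (t + 1)) ** 2)
    print(J, "decay:", s[J * d] / s[0], "bound (V.5):", bound)
```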

The bound (V.4) shows that well-conditioned input pairs (such as input normal pairs) have advance matrices that are far from normal, in the sense that the eigenvector condition number $\kappa(V)$ must be large when $\rho_J$ is small.

VI Condition Bounds for Companion Matrices

We now specialize Corollary IV.3 to the case where the advance matrix, $A$, is a companion matrix. Other names for this case are Frobenius normal form and Luenberger controller canonical form. The second direct form [31] and autoregressive (AR) models are special cases of this type and correspond to $B = e_1$, with $e_1$ being the unit vector in the first coordinate direction. For autoregressive models, the vector $a$ below contains the autoregression coefficients, while the second direct form uses the same $a$ to specify the transfer function. Let $A$ be of the form

(VI.1)   $A = e_1 a^* + \Sigma \Pi,$

where $\Sigma$ is the lower shift matrix, $\Sigma_{ij} = \delta_{i,j+1}$, and $\Pi$ is a projection matrix of the form $\Pi = I$ or $\Pi = I - e_k e_k^*$. Note that $\Pi = I$ corresponds to companion normal form. Here $a$ is an $n$-vector.

Autoregressive moving average (ARMA) models of degree $(p, q)$ satisfy the advance equation $y_t = \sum_{k=1}^{p} a_k y_{t-k} + \sum_{k=0}^{q} b_k \epsilon_{t-k}$, where $\epsilon_t$ is a sequence of independent random variables with $E[\epsilon_t] = 0$ and $E[\epsilon_t^2] = 1$. The ARMA model has a state space representation with the state vector built from lagged values of $y_t$ and $\epsilon_t$, and $A$ given in (VI.1) with $n = p + q$ and $\Pi \ne I$. When $q > 0$, this is a matrix representation of the first direct form.

Lemma VI.1

Let $A$ be an $n \times n$ matrix of the form given in (VI.1) with $\Pi = I$, then $A$ has singular values, $\sigma_1$ and $\sigma_n$, that are the square roots of $\mu_\pm$, where $\mu_\pm$ are the two roots of the equation

(VI.2)   $\mu^2 - (1 + \|a\|^2)\,\mu + a_n^2 = 0,$

where $a$ is given in (VI.1) and $a_n$ is the $n$th component of the vector $a$. If $a_n = 0$, then $\mu_- = 0$ and $A$ has a zero singular value. Otherwise, $\mu_+ \mu_- = a_n^2$. The remaining singular values of $A$ are one with multiplicity $n - 2$, and zero if $\Pi \ne I$.

For $a_n \ne 0$, this result is in [14]. For $a_n = 0$, $e_n$ is a null vector of $A$.

Proof:  Note $AA^* = \begin{pmatrix} \|a\|^2 & a'^* \\ a' & I_{n-1} \end{pmatrix}$, where $a' = (a_1, \ldots, a_{n-1})^T$. To compute the eigenvalues of $AA^*$ we define an orthogonal transformation to reduce $AA^*$ to the direct sum of a $2 \times 2$ matrix with roots given by (VI.2) and a projection matrix. Let $H$ be the Householder transformation of the last $n-1$ coordinates such that $Ha' = \|a'\|\,e_2$, where $e_2$ is the unit vector in the second coordinate. We define $\tilde H = 1 \oplus H$ and $M = \tilde H AA^* \tilde H^*$. Since $\tilde H e_1 = e_1$,

(VI.3)   $M = \begin{pmatrix} \|a\|^2 & \|a'\| \\ \|a'\| & 1 \end{pmatrix} \oplus I_{n-2}.$

If $a' = 0$, then $M$ is already diagonal. For both the $a' = 0$ and the $a' \ne 0$ cases, the eigenvalue equation decouples into the eigenvalues of the matrix of the first and second rows and columns and the eigenvalues of the identity projected onto the space orthogonal to $e_1$ and $e_2$. The eigenvalue equation for the $2 \times 2$ matrix is given by $(\|a\|^2 - \mu)(1 - \mu) - \|a'\|^2 = 0$, which is (VI.2).
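A quick numerical check of Lemma VI.1 (with the quadratic (VI.2) in the reconstructed form above): the two extreme singular values of the companion matrix come from the roots of (VI.2) and the remaining $n - 2$ equal one.

```python
# Check Lemma VI.1: singular values of a companion matrix via (VI.2).
import numpy as np

rng = np.random.default_rng(6)
n = 6
a = rng.standard_normal(n)
A = np.zeros((n, n))
A[0, :] = a                          # first row: the coefficient vector a^*
A[1:, :-1] = np.eye(n - 1)           # lower shift: ones on the subdiagonal

mu = np.roots([1.0, -(1 + a @ a), a[-1] ** 2])       # roots of (VI.2)
print("from (VI.2):", np.sort(np.sqrt(mu)))
print("via SVD    :", np.sort(np.linalg.svd(A, compute_uv=False)))
```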

We denote the largest root of (VI.2) by $\mu_+$ and the smallest by $\mu_-$, and define $\sigma_1^2 = \mu_+$ and $\sigma_n^2 = \mu_-$. Note $\mu_+ \mu_- = a_n^2$ and $\mu_+ + \mu_- = 1 + \|a\|^2$. The bound of Corollary IV.3 reduces to

Theorem VI.2

Let $A$ be a companion matrix as specified by (VI.1) with $\Pi = I$ and let $(A,B)$ be a stable, controllable input pair, then the condition number of $P$ satisfies the bound $\kappa(P) \ge \mu_+$. If $A$ is invertible,

(VI.4)   $\kappa(P) \;\ge\; \max\bigl\{\mu_+,\; \mu_-\,\sigma_n(\tilde A)^{-2}\bigr\},$

where $\tilde A$ is the IN matrix generated in the map defined in the proof of Corollary IV.3. Here $\mu_+$ satisfies

(VI.5)   $1 + \|a\|^2 - a_n^2 \;\le\; \mu_+ \;\le\; 1 + \|a\|^2 .$

Proof:  The first bound follows from Corollary IV.3 and $\sigma_1(\tilde A) \le 1$: $\kappa(T) \ge \sigma_1(A)/\sigma_1(\tilde A) \ge \sqrt{\mu_+}$. The bound (VI.5) is proven by rewriting (VI.2) as a continued fraction

(VI.6)   $\mu_+ = 1 + \|a\|^2 - \cfrac{a_n^2}{1 + \|a\|^2 - \cfrac{a_n^2}{1 + \|a\|^2 - \cdots}}\;.$

Applying simple bounds to the continued fraction yields (VI.5).

The bound in Theorem VI.2 also applies when $A^*$ is in companion form, corresponding to Luenberger observer canonical form. If the eigenvalues of $A$ are prescribed, the coefficients in $a$ are the coefficients of the characteristic polynomial of $A$: $\chi(z) = z^n - a_1 z^{n-1} - \cdots - a_n$. When the eigenvalues of $A$ are positive real, a weaker but explicit bound is

Theorem VI.3

Let $A$ be invertible with positive real eigenvalues $\lambda_1, \ldots, \lambda_n$, then $\kappa(P) \ge \bigl(\prod_{i=1}^n (1 + \lambda_i) - 1\bigr)^2 / n$.

Proof:  We evaluate the characteristic polynomial at $z = -1$: $|\chi(-1)| = \prod_{i=1}^n (1 + \lambda_i)$, where $\chi(z) = z^n - a_1 z^{n-1} - \cdots - a_n$. Note $|\chi(-1)| \le 1 + \sum_k |a_k|$ and $\sum_k |a_k| \le \sqrt{n}\,\|a\|$. The bound (VI.5) implies $\kappa(P) \ge \mu_+ \ge \|a\|^2 \ge \bigl(\prod_i (1 + \lambda_i) - 1\bigr)^2 / n$.

For $\lambda_i \equiv \lambda$, the bound grows as $((1 + \lambda)^n - 1)^2 / n$. When the eigenvalues of $A$ have random or near random complex phases, the value of $\|a\|$ is typically much smaller, since generally $|a_k| \ll \binom{n}{k}$ in that case. We now compare the bound in Theorem VI.2 with a random distribution of eigenvalues.
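To see the positive-eigenvalue effect concretely, the sketch below builds a companion pair with prescribed positive eigenvalues and compares $\ln \kappa(P)$ with the Theorem VI.3 bound (in the reconstructed form above).

```python
# Companion matrix with prescribed positive eigenvalues vs. Theorem VI.3.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(7)
n = 12
lam = rng.uniform(0.2, 0.95, size=n)       # positive real eigenvalues
c = np.poly(lam)                           # chi(z) = z^n + c_1 z^{n-1} + ...
a = -c[1:]                                 # (VI.1) convention: chi = z^n - a_1 z^{n-1} - ... - a_n
A = np.zeros((n, n))
A[0, :] = a
A[1:, :-1] = np.eye(n - 1)

e1 = np.zeros((n, 1)); e1[0] = 1.0         # B = e_1 (AR / second direct form)
P = solve_discrete_lyapunov(A, e1 @ e1.T)
bound = (np.prod(1 + lam) - 1) ** 2 / n
print("ln kappa(P):", np.log(np.linalg.cond(P)))
print("ln bound   :", np.log(bound))
```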

Definition VI.4

$\mathcal{F}_n$ is the set of stable, controllable input pairs $(A,B)$, where $A$ is given by (VI.1) with $\Pi = I$. The distribution $\mu_F$ is defined by the eigenvalues of $A$ having the distribution of the eigenvalues of $A$ in Definition II.1, subject to the CS restriction.

Table 4a gives the quantiles of $\ln \kappa(P)$ for $\mu_F$ with random Gaussian $B$ and Table 4b gives the corresponding quantiles for $B = e_1$. The random $B$ case has a very broad distribution, with an interquartile distance two to three times larger than that of the generic random case. The top quartile of the random Frobenius case is as badly conditioned as the random $\mu_{CS}$ case although the bottom quartile is much better conditioned.

n     1%     10%    25%    50%    75%    90%    99%
8     2.30   4.08   5.66   8.68   12.2   15.1   21.1
16    4.10   6.71   9.59   14.6   20.1   24.2   29.6
24    5.61   8.57   12.4   19.3   27.5   32.5   38.7

Table 4a: Quantiles of the distribution of $\ln \kappa(P)$ for $\mu_F$ with randomly distributed $B$. The bound (VI.4) significantly underestimates $\ln \kappa(P)$ in many cases.

We have also computed the condition numbers when the eigenvalues are all positive: $\lambda_i \to |\lambda_i|$. These models are appreciably more ill-conditioned than the cases where the eigenvalues are distributed randomly in the complex plane.

This ill-conditioning corresponds to the difficulty in estimating the coefficients of a sum of decaying exponentials.

For positive eigenvalues, the observed ill-conditioning is usually much larger than the lower bound of Theorem VI.3.

The $B = e_1$ case may be interpreted as a random autoregressive model. Our distribution, $\mu_F$, corresponds to a random distribution of poles of the autoregressive transfer function. For autoregressive models, the observability Grammian corresponds to solving (I.1) with the pair $(A^*, a)$, i.e. using the characteristic polynomial coefficients as the input vector. Thus we may examine the condition numbers of both the controllability and observability Grammians.
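Both Grammians of a random AR model are easy to examine numerically; a sketch (with $\mu_F$-style pole sampling, and the observability pair taken as $(A^*, a)$ per the text) follows.

```python
# Controllability vs. observability Grammians of a random AR(n) model.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

rng = np.random.default_rng(8)
n = 8
while True:   # sample stable poles as the eigenvalues of a Gaussian matrix
    lam = np.linalg.eigvals(rng.standard_normal((n, n)) / np.sqrt(n + 1))
    if np.max(np.abs(lam)) < 1.0:
        break
a = -np.poly(lam)[1:].real                 # AR coefficients (conjugate poles -> real)
A = np.zeros((n, n))
A[0, :] = a
A[1:, :-1] = np.eye(n - 1)

e1 = np.zeros((n, 1)); e1[0] = 1.0
Pc = solve_discrete_lyapunov(A, e1 @ e1.T)          # controllability Grammian
Po = solve_discrete_lyapunov(A.T, np.outer(a, a))   # observability Grammian
print("ln kappa (controllability):", np.log(np.linalg.cond(Pc)))
print("ln kappa (observability)  :", np.log(np.linalg.cond(Po)))
```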

n     1%     10%    25%    50%    75%    90%    99%
8     0.825  1.69   2.45   3.59   5.05   6.62   9.89
16    1.66   2.88   3.79   5.08   6.74   8.48   11.8
24    2.36   3.66   4.64   5.99   7.66   9.36   13.1

Table 4b: Quantiles of the distribution of $\ln \kappa(P)$ for $\mu_F$ with $B = e_1$.

The autoregressive models ($B = e_1$) are much better conditioned than those with a random rank-one right-hand side. The scaling of the median of $\ln \kappa(P)$ versus $n$ is ambiguous. The interquartile distance of $\ln \kappa(P)$ is a weak function of $n$. Table 4c examines the observability Grammian of the autoregressive model. We find that the corresponding observability Grammians for our autoregressive models are very poorly conditioned. Thus these autoregressive models are nearly unobservable.

n     1%     10%    25%    50%    75%    90%    99%
8     14.8   22.1   27.2   35.1   46.8   61.5   89.3
16    42.2   53.9   64.4   79.6   95.5   101    109
24    72.5   92.1   101    104    108    111    117

Table 4c: Quantiles of $\ln \kappa$ for the observability Grammian of the autoregressive model.

The controllability Grammian is much better conditioned for the random autoregressive model than for the normal advance matrices with the same spectrum, while the observability Grammian is grossly ill-conditioned in the random autoregressive model. These results may be influenced by the choice of $B = e_1$.

VII Power Estimate

We now show that for certain classes of input pairs, the condition number, $\kappa(P)$, grows exponentially in $n/d$. Specifically, let $(A,B)$ be CS with the uniform bound $\sigma_1(A^k) \le M\rho^k$ for some $\rho < 1$. The proof applies Lemma IV.1 with $f(z) = z^k$. The theorem below then shows that $\kappa(P) \ge (M\rho^k)^{-2}$, where $k$ is the integral part of $(n-1)/d$.

Theorem VII.1

Let $(A,B)$ be a stable, controllable input pair, then $\kappa(P) \ge \sigma_1(A^k)^{-2}$, where $k = \lfloor (n-1)/d \rfloor$.

Proof:  Let $T$ be an INizing transformation, $\tilde A = TAT^{-1}$, with $(\tilde A, TB)$ being IN. Let $f(z) = z^k$ for $k$ such that $kd < n$. By Lemma IV.1, $\|\tilde A^k\|_2 \le \kappa(T)\,\|A^k\|_2$. Note $\kappa(P) = \kappa(T)^2$. The proof is completed when we prove the lemma below that $\|\tilde A^k\|_2 = 1$ for $kd < n$.

Lemma VII.2

Let $(\tilde A, \tilde B)$ be an IN pair, then $\|\tilde A^k\|_2 = 1$ for $kd < n$.

Proof:  Let $v \ne 0$ be such that $\tilde B^* \tilde A^{*j} v = 0$ for $0 \le j < k$; such a $v$ exists when $kd < n$ since these are at most $kd$ linear constraints. Then $\tilde A \tilde A^* = I - \tilde B \tilde B^*$ and

(VII.1)   $\|\tilde A^{*k} v\|^2 = \|v\|^2 - \sum_{j=0}^{k-1} \|\tilde B^* \tilde A^{*j} v\|^2 = \|v\|^2 .$

Thus $\sigma_1(\tilde A^k) = 1$ for $kd < n$.

In particular, when $A$ is normal and $k = \lfloor (n-1)/d \rfloor$, $\sigma_1(A^k) = |\lambda_1|^k$, and the bound becomes $\kappa(P) \ge |\lambda_1|^{-2k}$.
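The mechanism behind Lemma VII.2 (in the reconstructed form above) is easy to observe numerically: for a grade-one IN pair, $\|\tilde A^k\|_2$ equals one until $k = n$ and only then begins to decay. A sketch:

```python
# Check Lemma VII.2: ||A_tilde^k||_2 = 1 for k d < n (here d = 1).
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_triangular

rng = np.random.default_rng(9)
n = 6
while True:
    A = rng.standard_normal((n, n)) / np.sqrt(n + 1)
    if np.max(np.abs(np.linalg.eigvals(A))) < 1.0:
        break
B = rng.standard_normal((n, 1))
L = np.linalg.cholesky(solve_discrete_lyapunov(A, B @ B.T))
At = solve_triangular(L, A @ L, lower=True)     # IN advance matrix (Thm III.2)
for k in range(1, n + 3):
    print(k, np.linalg.norm(np.linalg.matrix_power(At, k), 2))
```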

VIII Jordan Block Bounds

We now bound $\kappa(P)$ when $A$ is a single Jordan block: $A = \lambda I + \Sigma$, where $\Sigma$ is the lower shift matrix: $\Sigma_{ij} = \delta_{i,j+1}$. Our bound shows that for fixed $0 < |\lambda| < 1$, the condition number grows as $(1 - |\lambda|^2)^{-2(n-1)}$ when $n \gg 1$. The proof takes the maximum over $k$ of the bounds for $\|g_\lambda(A)^k\|$ based on Lemma IV.1.

Theorem VIII.1

Let $(A,B)$ be a CS input pair with $A = \lambda I + \Sigma$, where $0 < |\lambda| < 1$ and $n \ge 2$. The condition number of $P$ satisfies the bound

(VIII.1)   $\kappa(P) \;\ge\; \max_{0 \le k < n} \|g_\lambda(A)^k\|_2^{\,2} \;\ge\; (1 - |\lambda|^2)^{-2(n-1)} .$

Prior to proving Theorem VIII.1, we state the bounds for $\|g_\lambda(\tilde A)^k\|$. We define the Blaschke factor $g_\lambda(z) = (z - \lambda)/(1 - \bar\lambda z)$.

Lemma VIII.2

Let $\tilde A$ be a contraction, $\|\tilde A\|_2 \le 1$, with $|\lambda| < 1$, then $\|g_\lambda(\tilde A)^k\|_2 \le 1$ for all $k \ge 0$.

The bound is well-known and follows from von Neumann's inequality, $\|f(\tilde A)\|_2 \le \max_{|z| \le 1} |f(z)|$, where $f$ is analytic on the closed unit disk and $\tilde A$ is a contraction.
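Under the reconstruction above, the Jordan-block mechanism can be checked directly: the Blaschke powers $g_\lambda(A)^k$ grow like $(1 - |\lambda|^2)^{-k}$, while $\|g_\lambda(\tilde A)^k\| \le 1$ for the IN matrix, forcing $\kappa(T)$, and hence $\kappa(P)$, to be exponentially large.

```python
# Growth of Blaschke powers for a single Jordan block A = lambda I + Sigma.
import numpy as np

n, lam = 8, 0.6
A = lam * np.eye(n) + np.diag(np.ones(n - 1), -1)     # single Jordan block
I = np.eye(n)
g = (A - lam * I) @ np.linalg.inv(I - lam * A)        # Blaschke factor g_lam(A)
for k in (1, n // 2, n - 1):
    norm_k = np.linalg.norm(np.linalg.matrix_power(g, k), 2)
    print(k, norm_k, (1 - lam ** 2) ** (-k))          # norm vs. lower bound
```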

Proof of Theorem VIII.1: Let $T$ be the transformation from $(A,B)$ to an IN pair,