In system identification, one needs to solve linear algebraic systems: , where and are
-vectors andis the controllability Grammian, i.e. is the positive definite matrix that solves
Equation (I.1) is known as the discrete Lyapunov equation and is more properly called Stein’s equation. In (I.1), the matrix and the matrix are given. The matrix is known as the state advance matrix and the matrix is known as the input matrix. Together, is known as an input pair. We assume that is stable and that is controllable. In this case, there is an unique selfadjoint solution of (I.1) and it is positive definite . We denote the solution of (I.1) as a function of and by
We study the condition number of , , where
is the largest singular value ofand is the smallest. We consider cases where the system input dimension, , is smaller than the state space dimension, . In this case, we claim that the condition number of , can be exponentially large in . Since the case is common in signal processing and system identification, our results put strong limitations on the applicability of high order arbitrary state space realizations.
A number of bounds on either or exist in the literature [20, 15, 16, 10, 19, 33]. Many of these bounds require that to be nontrivial. Theorem 2.2 of  can be used to bound the ratio of . (See also .) The existing bounds on generally make no assumptions on and therefore tend to be weak or hard to evaluate. If is real, symmetric, and stable, Penzl  gives a bound which we describe in Section V. For the continuous time case, interesting results on the condition number may be found in 
Our lower bounds on are for specific, commonly considered classes of input pairs, , such as companion matrices and normal matrices and when is a single Jordan block.
Our results are based on transforming the input pair,
, into input normal (IN) form. Input normal form implies that the identity matrix solves the discrete Lyapunov equation. Input normal pairs have special representations that allow for fast matrix-vector operations whenis a Hessenberg matrix [22, 23, 31].
In , a numerical simulation shows that input normal filters perform well in the presence of autocorrelated noise. We examine the condition number of the controllability Grammian when forcing term is autocorrelated. We derive a bound that explains the good performance of IN filters . Other advantages of IN filters are described in [31, 28, 26]
The condition number of is related to two other well-known problems: a) the distance of an input pair to the set of uncontrollable pairs  and b) the sensitivity of the to perturbations in [9, 10]. It is well known that . Thus we can lower bound the distance to uncontrollability by the times the sensitivity of the discrete Lyapunov equation. Our results indicate that is typically exponentially small in .
We present numerical simulations which compute the distribution of for several classes of input pairs, . When the elements of
are independently distributed as Gaussians with unit variance, our simulation shows that the ensemble average oftends to a constant for . We observe that is approximately Gaussian. These numerical results indicate that the ill-conditioning problems of
are probably generic when. To accurately solve (I.1), we use a novel QR iteration to precondition (I.1) and then apply a square root version of the doubling method .
Section II presents our computation of for an ensemble of stable controllable input pairs. Section III defines IN form and present new results on the properties of IN pairs. Section IV gives lower bounds on the condition number based on the transformation to an IN matrix and applies the bound to the case when is normal. Section V gives abstract bounds based on the ADI iteration. Section VI gives lower bounds when is in companion form. Sections VII and IX give additional bounds for normal . Section VIII gives bounds when is a single Jordan block. Section X examines condition numbers of the state covariance when the system is forced with colored noise.
Notation: Here is a
matrix with eigenvaluesordered as and singular values, . Depending on context, is the n-vector of or the corresponding diagonal matrix. The matrix has dimension . When is stable and is controllable, we say that the input pair is CS. If is also invertible, we say is CIS. For us, ‘stable” means , sometimes known as strict exponential stability. We let denote the set of stable, controllable input pairs of dimension and . The identity matrix is denoted . The transformation matrices, denoted by and , have dimension and rank , where . The Moore-Penrose inverse of is denoted by . Here is the Frobenius norm while is any unitarily invariant matrix norm
Ii Generic Condition Number
We begin by examining the probability distribution of condition number ofas and are varied over a probability distribution, , on stable, controllable input pairs. We limit ourselves to single input pairs ().
A common class of random matrices is , with the probability measure . The Girko law  states that the eigenvalues of such random
are uniformly distributed on the complex diskas . For finite , the distribution of eigenvalues is given in Theorem 6.2 of . We exclude unstable and uncontrollable in our studies. We normalize by instead of
to improve the odds of obtaining stable. Specifically, we define the distribution:
Let be the probability measure induced on by letting the matrix elements and have independent Gaussian distributions, , subject to the CS restriction.
Each probability distribution on induces a distribution of and . We simulate the induced distribution by solving the discrete Lyapunov equation for 2,500 pairs chosen from the distribution .
Inaccurate numerics will tend to underestimate . Even for , these systems can be so ill-conditioned that existing numerical methods inaccurately determine the condition number. Therefore, we developed new numerical algorithms for the solution of (I.1) . To solve the discrete Lyapunov equation, we use a novel square root version of the doubling method. For ill-conditioned problems, we find that preconditioning the discrete Lyapunov equation is important to accurately evaluate the condition number of .
Table 1 gives the quantile distribution ofas a function of for our numerical simulation with distributed in . The median condition number scales as . The interquartile distance is approximately independent of with a value of . (The interquartile distance is the distance between the th percentile and the
th percentile and is a measure of the width of the distribution.) If the distribution were normal, the interquartile distance would be roughly 1.35 standard deviations. We plotted the quantiles ofand of versus the quantiles of the Gaussian distribution. These quantile-quantile plots show that has an approximately Gaussian distribution and that
has wide tails. Naturally, the tails of the empirical distribution are more poorly determined than the median and the quartiles.
|n||1%||10%||25 %||50 %||75%||90%||99 %|
Table 1: Quantile distribution of for distributed in as a function of .
, it is shown that a random matrix,, has , where denotes the expected value. Thus is typically much more poorly conditioned than is. In , it is shown that a random lower triangular matrix, , has with probability tending to 1 as . For the median value of in our computation, the Cholesky factor of , scales as , which is nearly as badly conditioned as those in .
Table 1 displays results for . Empirically, we observe that the condition number grows at least as fast as . In Section IV, we derive lower bound for the condition number when is normal. We apply this bound to each matrix in our simulation. Table 3 shows that the actual condition number is worse than the normal bound by a factor of roughly 100 on average.
Iii Input Normal Pairs
In examining the condition number of solutions of the discrete Lyapunov equation, it is natural to begin with input pairs that admit solutions with condition number one.
A input pair, , is input normal (IN) of grade if and only if is stable, , and
A matrix, , is a IN matrix of grade if and only if there exists a -matrix such that , is an IN pair. If is lower (upper) triangular as well, is a triangular input normal pair. If is Hessenberg as well, is a Hessenberg input normal pair.
In , ‘input normal pairs” are called orthogonal filters. In , ‘input normal” has a more restrictive definition of (III.1) and the additional requirement that the observability Grammian is diagonal. In our definition of ‘input normal”, we do not impose any such condition on the observability Grammian. We choose this language so that ‘normal” denotes restrictions on only one Grammian while ‘balanced” denotes simultaneous restrictions on both Grammians [17, 21]. This usage is consistent with the definitions of . Input normal are generally not normal matrices.
By Theorem 2.1 of , if the controllability Grammian is positive definite, then the input pair is stable. In , Ober shows that stability plus a positive definite solution to the discrete Lyapunov equation, (I.1), implies that the input pair is controllable. Thus for IN pairs, stability is equivalent to controllability. We now show that any CS input pair may be transformed to an IN pair.
 Every stable, controllable input pair , is similar to a input normal pair with .
Using the singular value decomposition, we have the following characterization of IN matrices:
Let be a stable matrix with , and let equal the number of singular values of less than 1, (). There is an matrix with such that and therefore is an IN matrix. The smallest singular values of satisfy , where are the eigenvalues of .
Proof: Let be the th singular vector of and define . This constructs . The singular value identity follows from .
For input normal pairs, this yields the bound: .
Iv Condition Number Bounds and the Transformation to Input Normal Pairs
In this section, we derive lower bounds on the condition number of . Our bounds are based on transforming to an IN pair . The following lemma describes the transformation of solutions under a linear change of coordinates.
Let be an matrix of rank- with and let the rows of be a basis for a left-invariant subspace of . Define by . Let be an unitarily invariant matrix norm and let be an analytic function on the spectrum of with . Then . When is invertible . Also , where is the unit vector in the first coordinate.
A related result in [13, p. 162] is
for invertible and nonvanishing and .
When , is invertible and is similar to : . In this case (), we can reverse the roles of and in the bounds as well. The case is of interest in model reduction problems, where one projects a system onto a left invariant subspace of .
In the remainder of this section, we use and . When is input normal, we have the following bound for the condition number of the transformation of a stable matrix to input normal form.
Let be stable and invertible and be an input normal
matrix of grade , where
is an invertible matrix and
is an invertible matrix and, then
where are the eigenvalues of and and are the largest and smallest singular values of . For , .
Proof: By Theorem III.3, and .
Note that Theorem IV.2 does not depend any specific input matrix .
Let be a CIS input pair, then the condition number of satisfies the equality , where and are defined in Theorem IV.2.
For normal advance matrices, , the smallest eigenvalue of . This simplifies Corollary IV.3.
Let be a normal matrix and be a CIS input pair, then the condition number of satisfies the bound
where is the eigenvalue of with the smallest magnitude and is the IN matrix generated in the map defined in the proof of Corollary IV.3. For , the lower bound simplifies to .
We compare this bound to the condition number of for an ensemble of input pairs where is a normal matrix; i.e.
has orthogonal eigenvectors. We need to select a distribution on the set of eigenvalue-tuples. A natural choice is the distribution induced by the random distribution of given in Definition II.1.
is the set of CS input pairs, , where is diagonal. Let be the probability measure induced from eigenvalue n-tuple distribution, of of Definition II.1 and let have the Gaussian distribution subject to the controllability restriction.
|n||1%||10%||25 %||50 %||75%||90%||99 %|
Table 2: Quantiles of as function of for . Note that has an approximately Gaussian distribution
As seen in Table 2, our numerical computations show that the distribution of for the normal matrices is virtually identical to that of our general random matrices . Again, is approximately constant with median condition number scaling as . The interquartile distance is again nearly independent of with a value of .
|n||1%||10%||25 %||50 %||75%||90%||99 %|
Table 3: Quantiles of as function of . Here is the bound given in Theorem IV.3, evaluated for each input pair.
Table 3 compares versus our theoretical bound. The discrepancy is growing only slightly in , in contrast to which is growing linearly in . A regression indicates that the median value of is growing as with . Plotting the quantiles of as a function of shows that the residual error is a weakly decreasing function of . We also observe that the spread of is almost independent of of
, perhaps indicating a heuristic model:
, where the random variablebarely depends on . To model the long tails of , an analogous model for is probably called for. We have also compared the normal bound with the log-condition number for the ensemble of random matrices in Section II. Surprisingly, the agreement with the bound is even better in this case. However, there are many cases where the condition number of a random input pair is smaller than the bound for normal matrices predicts.
The bound (IV.3) indicates that can be quite ill-conditioned. Theorems IV.2 -IV.4 do not use any property of (except controllability) nor of the complex phases of the eigenvalues, . Including this information in the bounds can only sharpen the lower bound. We believe that a significant fraction of the ill-conditioning that is not explained by our bound arises from using a random . We could replace the quantile tables with analogous ones for . If we did, we would see that our bounds better describe this quantity that the average value of .
V Alternating Direction Iteration Bounds
In this section we present condition number bounds based on the alternating direction implicit (ADI) iteration for the solution of the continuous time Lyapunov solution. These results were formulated by T. Penzl in . We restate his results in a more general context.
The results for the discrete Lyapunov equation follow by applying the bilinear transform. We define . The Cayley transform corresponds to : and . The solution of the discrete Lyapunov equation (I.1) for satisfies the Lyapunov equation
where the are the shift parameters. Using the methodology of , we have the following bound:
In the ADI iteration of (V.2), let . The has rank and satisfies the approximation bound:
where . Let have a complete set of eigenvectors and the eigenvalue decomposition , then
We define . Thus is the best bound of the type in (V.3) The difficulty in using Theorem V.1 is finding good shifts that come close to approximating . There are algorithms for selecting shifts, but only rarely have explicit upper bounds on been given. Penzl simplified this bound for the case of real, symmetric, stable :
Theorem V.2 ()
Let be real symmetric and stable, and define . Then
Penzl’s proof is based on using a geometric sequence of shifts on the interval containing the eigenvalues of . It is difficult to determine when the bound (V.5) is stronger or weaker than the bounds in Sections IV and IX since (V.5) is independent of the precise distribution of eigenvalues while (IV.3) uses the exact eigenvalues.
The bound (V.4) shows that well-conditioned input pairs (such as input normal pairs) have and that are far from normal in the sense that is large when is small.
Vi Condition Bounds for Companion Matrices
We now specialize Corollary IV.3 to the case where the advance matrix, , is a companion matrix. Other names for this case are Frobenius normal form and Luenberger controller canonical form. The second direct form  and autoregressive (AR) models are special case of this type and correspond to , with being the unit vector in the first coordinate direction:
. For autoregressive models,, while the second direct form uses to specify transfer function. Let be of the form
where is a projection matrix of the form where and or . Note that corresponds to companion normal form. Here is an vector.
Autoregressive moving average (ARMA) models of degree satisfy the advance equation , where is a sequence of independent random variables with and . The ARMA model has a state space representation with the state vector , and given in (VI.1) with and . When , this is a matrix representation of the first direct form.
Let be an matrix of the form given in (VI.1) with , then has singular values, and , that are the square roots of , where are the two roots of the equation
where and are given in (VI.1) and is the th component of the vector . If , then and has a zero singular value. Otherwise, . The remaining singular values of are one with multiplicity and zero if .
For , this result is in . For , is a left null vector of .
Proof: Note , where , . To compute the eigenvalues of we define an orthogonal transformation to reduce to the direct sum of a matrix with roots given by (VI.2) and a projection matrix. Let be the Householder transformation such that , where , is the unit vector in th coordinate. We define and . Since ,
If , then has a zero th coordinate. For both the and the cases, the eigenvalue equation decouples into the eigenvalues of the matrix of the first and st rows and columns and the eigenvalues of projected onto the space orthogonal to . The eigenvalue equation for the matrix is given by .
Applying simple bounds to the continued fractions yields (VI.5).
The bound in Theorem VI.2 also applies when is in companion form, corresponding to Luenberger observer canonical form. If the eigenvalues of are prescribed, the coefficients in , , are the coefficients of the characteristic polynomial of : for . When the eigenvalues of are positive real, a weaker but explicit bound is
Let be invertible with positive real eigenvalues , then .
Proof: We evaluate the characteristic polynomial at : , where . Note and . The bound (VI.5) implies .
For , . When the eigenvalues of have random or near random complex phases, the value of is typically much less than , since generally , , . We now compare the bound in Theorem VI.2 with a random distribution of .
Table 4a gives the quantiles of for with random and Table 4b gives the corresponding quantiles for . The random case has a very broad distribution with the interquartile distance two to three times larger than that of generic random case. The top quartile of the random Frobenius case is as badly conditioned as the random case although the bottom quartile is much better conditioned.
|n||1%||10%||25 %||50 %||75%||90%||99 %|
Table 4a: Quantiles of distribution for for randomly distributed . The bound VI.4 significantly underestimates in many cases.
We have also computed the condition numbers when the eigenvalues are
all positive: .
These models are appreciably more ill-conditioned than
in cases where the eigenvalues are distributed randomly in the complex plane.
This ill-conditioning corresponds to the difficulty in estimating
the coefficients of a sum of decaying exponentials.
This ill-conditioning corresponds to the difficulty in estimating the coefficients of a sum of decaying exponentials.For positive , the observed ill-conditioning is usually much larger than the formula, .
The case may be interpreted as a random autoregressive model. Our distribution, , corresponds to a random distribution of poles of the autoregressive transfer function: For autoregressive models, the observability Grammian corresponds to solving (I.1) with the pair , i.e. using the characteristic polynomial coefficients as . Thus we may examine the condition numbers of both the controllability and observability Grammians.
|n||1%||10%||25 %||50 %||75%||90%||99 %|
Table 4b: Quantiles of distribution for for .
The autoregressive models () are much better conditioned than those with a random righthand rank-one side. The scaling of the median of versus is ambiguous. The interquartile distance of is a weak function of . Table 4c examines the observability Grammian of the autoregressive model. We find that the corresponding observability Grammians for our autoregressive models are very poorly conditioned. Thus these autoregressive models are nearly unobservable.
|n||1%||10%||25 %||50 %||75%||90%||99 %|
Table 4c: Quantiles of for the observability Grammian of the autoregressive model.
The controllability is much better conditioned for the random autoregressive model than for the normal advance matrices with the same spectrum while the observability Grammian is grossly ill-conditioned in the random autoregressive model. These results may be influenced by the choice of .
Vii Power Estimate
We now show that for certain classes of input pairs, the condition number, , grows exponentially in . Specifically, let be CS with the uniform bound . The proof applies Lemma IV.1 with . The theorem below shows that , where is the integral part of .
Let be a stable, controllable input pair, then , where .
Proof: Let be a INizing transformation, , with being IN. Let for such that . By Lemma IV.1, . Note . The proof is completed when we prove the lemma below that .
Let be a IN pair, then for .
Proof: Let be such that , then and
Thus for .
In particular, when is normal and , , the bound becomes .
Viii Jordan Block Bounds
We now bound when is a single Jordan block: , where is the lower shift matrix: . Our bound shows that for , the condition number grows as when . The proof takes the maximum of the bounds for based on Lemma IV.1
Let be a CS input pair with with and . The condition number of satisfies the bound
Prior to proving Theorem VIII.1, we state the bounds for . We define .
Let with , then .
The bound is well-known and follows from where .
Proof of Theorem VIII.1: Let be the transformation from to an IN pair,