# Banded Matrix Fraction Representation of Triangular Input Normal Pairs

An input pair (A,B) is triangular input normal if and only if A is triangular and AA^* + BB^* = I_n, where I_n is theidentity matrix. Input normal pairs generate an orthonormal basis for the impulse response. Every input pair may be transformed to a triangular input normal pair. A new system representation is given: (A,B) is triangular normal and A is a matrix fraction, A=M^-1N, where M and N are triangular matrices of low bandwidth. For single input pairs, M and N are bidiagonal and an explicit parameterization is given in terms of the eigenvalues of A. This band fraction structure allows for fast updates of state space systems and fast system identification. When A has only real eigenvalues, one state advance requires 3n multiplications for the single input case.

## Authors

• 2 publications
• 25 publications
03/17/2018

### Orthogonal Representations for Output System Pairs

A new class of canonical forms is given proposed in which (A, C) is in H...
12/17/2019

### Computing eigenvalues of matrices in a quantum computer

Eigenproblem arises in a large number of disciplines of sciences and eng...
11/18/2021

### A fast algorithm for computing the Smith normal form with multipliers for a nonsingular integer matrix

A Las Vegas randomized algorithm is given to compute the Smith multiplie...
02/12/2020

### Eigenvalues of non-hermitian matrices: a dynamical and an iterative approach. Application to a truncated Swanson model

We propose two different strategies to find eigenvalues and eigenvectors...
02/19/2021

### Equivalences for Linearizations of Matrix Polynomials

One useful standard method to compute eigenvalues of matrix polynomials ...
03/13/2020

### Finding the closest normal structured matrix

Given a structured matrix A we study the problem of finding the closest ...
05/28/2019

### Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

A recent strategy to circumvent the exploding and vanishing gradient pro...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 2 Triangular Input Normal Pairs

In this section, we present the fundamental representation results for the reduction to triangular normal form.

###### Definition 2.1

An input pair, , is input normal (IN) if and only if

 In−AA∗=BB∗   . (2.1)

The input pair is a TIN pair if it is input normal and is lower triangular.

IN pairs are not required to be stable or controllable. (From (2.1), must be at least marginally stable.) In [8], ‘input balanced” has a more restrictive definition of (2.1) and the additional requirement that the observability Grammian be diagonal. We do not impose any such condition on the observability Grammian. We choose this language so that ‘normal” denotes restrictions on only one Grammian while ‘balanced” denotes simultaneous restrictions on both Grammians.

Thus is an IN pair if and only if the concatenated matrix is row orthogonal. When is input normal, then the identity matrix solves Stein’s equation (also known as the discrete Lyapunov equation):

 P−APA∗=BB∗   . (2.2)

For stable , , So is input normal when or equivalently when the basis, , constitutes an orthonormal set.

###### Lemma 2.2

Let and be equivalent IN pairs, with stable and controllable. Then is unitary.

Proof:  Both and solve the Stein equation. Since is stable and controllable, the solution of Stein’s equation is unique.

###### Theorem 2.3

Every stable, controllable input pair , is similar to a lower triangular input normal pair with . The order of the eigenvalues of may be specified arbitrarily. If is real and has real eigenvalues, then and may be chosen to be real.

Proof:  Let be the unique Cholesky lower triangular factor of with positive diagonal entries: . Here is invertible. We set and . Note satisfies (2.1). By Schur’s unitary triangularization theorem (Horn and Johnson 2.3.1), there exists a unitary matrix, such that is lower triangular. The proof of the real Schur form is in Sec. 7.4.1 of [1] and Theorem 2.3.4 of [4]. The eigenvalues of may be placed in any order [4]. Clearly, and satisfy (2.1) and therefore .

We now study the number of equivalent TIN pairs:

###### Theorem 2.4

Let and be LT matrices with and let and be unitarily similar: with unitary. Let be the number of distinct eigenvalues. Partition , and into blocks corresponding to the repeated eigenvalue blocks. Let be the multiplicity of the th eigenvalue, . Then has block diagonal form: , where is a unitary matrix of size .

Proof:  From . If , then and have no common eigenvalues. By Lemma 7.1.5 of [1], . Repeating this argument shows for . By orthogonality, for . We continue this chain showing that for , etc. Proof by finite induction.

###### Corollary 2.5

Let be an matrix with distinct eigenvalues. Then is unitarily similar to triangular matrix with ordered eigenvalues and is unique up to diagonal unitary similarities: , where .

If is invertible and is TIN, then is the Cholesky factor of times a diagonal unitary matrix:

###### Theorem 2.6

Let be a TIN pair with rank , then

 A=Cholesky(In−BB∗)En , (2.3)

where is a diagonal unitary matrix. In the real case, is a signature matrix.

Proof:  Let be the Cholesky factor of . Clearly some unitary with . By the uniqueness of the LQ factorization, and are unique. Note is lower triangular and therefore diagonal.

## 3 Band Fraction Representations of Single Input Tin Pairs.

For the single input case (), TIN pairs have an explicit band fraction representation , , where and are bidiagonal. The representation is parameterized by the eigenvalues of . For each eigenvalue, , of , we define , and . We define the matrices:

 M≡bidiag(111⋯γ1γ2γ3⋯) , (3.1)
 N≡bidiag(λ1λ2λ3⋯μ1μ2μ3⋯)  , (3.2)

where the top row contains the diagonal elements of or and the bottom row contains the elements of first subdiagonal of the bidiagonal matrices, and .

###### Theorem 3.1

Let and , where and are given by (3.1- 3.2) with . Then is TIN with eigenvalues .

Proof:  Explicit evaluation shows .

The Hautus criterion states that if a stable matrix, , is nonderogatory (There is only one Jordan block for each eigenvalue of in the eigendecomposition of .), then there is a vector such that is controllable. Thus the Hautus criterion together with Theorem 2.3 imply that

###### Corollary 3.2

Let be a stable, nonderogatory matrix. Then there exists a similarity transformation, , such that has the bidiagonal fraction representation of Theorem 3.1.

The bidiagonal fraction can be casted in a more aestetic form , where and are defined as

 M0≡bidiag(c1c2c3⋯s∗1s∗2s∗3⋯) , (3.3)
 N0≡bidiag(s1s2s3⋯c1c2c3⋯)  , (3.4)

with and .

In a series of excellent articles [19, 2, 13, 17, 15, 14], a number of researchers have constructed IN filters using cascade realizations. We now consider the case single input case where has real eigenvalues. In this case, the cascade realization of the IN filter is triangular. By Corollary 2.5, there is a unique triangular state advance matrix with the given eigenvalue ordering (up to diagonal similarity transformations with elements). Thus all state space constructions of IN filters are simply different realizations of the same pair (up to sign flips of the coordinates). In Section 5, we discuss the numerics of the various realizations of IN filters.

We now give a more constructive proof of equivalence of the band fraction representation with the orthogonal basis functions of [13]. Given a set of decay rates/poles of the frequency responses, , Ninness and Gustafsson derive a set of orthonormal basis functions in the frequency domain:

 ^zn(w) = ∞∑t=0zn(t)w−t = wα√1−|λn|2w−λnn−1∏k=1(1−λ∗kww−λk), (3.5)

where is or . From this, we derive the relation:

 (w−λk)^zk(w)/√1−|λk|2=(1−λ∗k−1w)^zk−1(w)/√1−|λk−1|2  . (3.6)

for with and . Equating like powers of yields

 zk(t+1)+λ∗k−1μk−1zk−1(t+1)=λkzk(t)+μk−1zk−1(t) , (3.7)

with . This shows that the bidiagonal matrix fraction representation generates the same basis functions as in [13]. The -th component of is to be identified with the -th lead of the -th orthonormal transfer function, .

We show that if the eigenvalues are sorted in order of increasing magnitude, then does not have large elements. We now give two results concerning the bidiagonal matrix fraction representation . The first result shows that if the eigenvalues are sorted in order of increasing magnitude, then the bidiagonal factorization is well-conditioned. We then give the eigenvectors of .

A. Explicit inversion.

We can interpret the bidiagonal matrix fraction representation as an LR factorization of the augmented system: . We now explicitly invert in (3.1) and show that the numerical conditioning of the matrix inversion is good without pivoting. We define the condition number of an invertible matrix, , as , where or [1, 3]. Here is the maximum column sum norm and is the maximum row sum norm.

###### Theorem 3.3

Let be a stable TIN pair (). If the eigenvalue magnitudes are in ascending order, , then for and .

Proof:  The LR factorization is . Now is unit lower triangular with . So

 (M−1)i,j=(−1)i−j∏j≤k

for . Note and for and . Thus .   This implies that the LR factorization of is numerically stable without pivoting.

B. Eigenvectors

We now evaluate the eigenvectors of the single input TIN system using the parameterization (3.1)-(3.2) for the case of distinct eigenvalues. It is well known that the eigenvectors of triangular matrices satisfy a recursion formula. Let be the -th component of the -th eigenvector with for . Write and diag . The eigenvector equation becomes or element by element:

 λk(λ∗j−1μj−1Vj−1,k+Vjk)=(μj−1Vj−1,k+λjVjk), (3.9)

We set , for and solve the recursion for when :

 Vjk=(1−∣∣λj∣∣2)1/2(1−|λk|2)1/2λk−λj ⎡⎣∏k

where the bracketed term is equal to for .

In the next section, we show that real single input IN pairs with complex conjugate eigenvalues have a bidiagonal fraction representation as well. However, we do not give an explicit parameterization of and in terms of the eigenvalues.

## 4 Mimo Band Fraction Representations of Tin Pairs.

We now prove that generic TIN pairs have a band fraction representation: , , where and are triangular matrices of lower bandwidth and is upper triangular. The banded matrix fraction structure allows state space updates, , in multiplications using (1.3). Theorem 4.3 and Theorem 4.4 show how to parameterize TIN input pairs using parameters.

###### Theorem 4.1

Let satisfy , where is a diagonal, positive definite matrix and is LT and is an matrix. Let have nonvanishing principal subminors, for , then has an unique band fraction representation: , where and are LT matrices of bandwidth and and for .

Proof: By Theorem 3.2.1 of [1], has a unique has a unique representation where R is UT and is LT with . Let be the submatrix of containing columns through . Since A is LT, is LT of bandwidth . Note . By Theorem 4.3.1 of [1], has bandwidth . We set and

For , every controllable TIN pair has nonvanishing principal minors. The condition that have nonvanishing principal minors for is generically true. If has an LR decomposition and a vanishing principal minor for , then the induced TIN pair, , does not have a unique LR decomposition of with . From

, the singular values of

are singular values of . When the condition that is relaxed, other band fraction representations may be generated by , , where is a nonsingular diagonal matrix.

The band fraction structure implies that the matrix is upper triangular with upper bandwidth . We now examine parameterizations of the band fraction representation. We define

###### Definition 4.2

Let denote the set of all upper triangular matrices with upper bandwidth . Let be the set of such that for and with a positive first nonzero element in the th row.

Note has nonzero coordinates while is an dimensional manifold. We now show that we can construct a TIN pair from any in using the LQ decomposition.

###### Theorem 4.3

Let have rank and define and as the LQ decomposition of . Define and . Then has lower bandwidth and is invertible and is a TIN pair.

Proof:  Note rankrank. Let . Note . By Theorem 4.3.1 of [1], has bandwidth . From , is triangular and implies is a TIN pair.

We now examine the map , where . From Theorem 9.1 of [3], has nonzero principal minors for if and only if does. We can rescale and by a nonsingular diagonal matrix : and , and preserve the induced TIN pair, . This freedom and complex phase equivalence allows us to restrict our consideration to :

###### Theorem 4.4

For each TIN pair, , with nonvanishing principal minors in for , there exists a unique and a unique diagonal unitary matrix such that is generated by : .

Proof:  Let be the unique band fraction decomposition of with . For an arbitrary diagonal unitary matrix , the set of band fraction representations of is , where is nonsingular and diagonal. Suppose both and generate an equivalent version of , i.e. . Thus for some and where is nonsingular and diagonal and is diagonal and unitary. Since , the condition shows that and are positive matrices. The requirement that forces and .

We can parameterize input normal systems using for . There are two difficulties with this parameterization. First, the representations for in contain many equivalent TIN representations corresponding to different orderings of the eigenvalues of . Second, in the parameterization, the eigenvalues of are only known after one solves by computing the LQ decomposition of .

## 5 Numerics

We now discuss the computational speed of various system representations.

The bidiagonal band fraction representation, , requires only multiplications for a state vector update in a single input system. First, we compute in multiplications. We then solve the bidiagonal matrix equation in multiplications since . multiplications. The matrix operations in the band fraction advance may be implemented using the dtbmv and dtbtrs routines from the optimized LAPACK software [5]. We caution that if has nonreal eigenvalues, then and are complex. Thus the computational advantage of the bidiagonal matrix fraction representation is primarily limited to the case when the eigenvalues of are real.

In [2], Heuberger et al. propose a cascade realization of IN filters. This cascade representation requires multiplications for a state vector update in a single input system. A different cascade realization of SISO IN pairs is given in Figure 1 of [13]. The Ninness and Gustafson representation uses a lossless cascade followed by a set of AR(1) models . This realization requires at least multiplications for a state update.

The direct form representations [12] have even faster state advances, but these representations are not IN and can have very ill-conditioned Grammians [11, 14]. We advise against using high order models that are not in input normal form. The tridiagonal form [7] is another fast representation that is not input normal.

In [12, 18], embedded lossless representations are constructed. For , these embedded lossless filters multiplications per advance [12]. These embedded filters include the cost of evaluating while the IN filter representation has an additional cost of multiplications to compute , so we should credit the embedded filters with multiplications, yielding multiplications. Unfortunately, embedded lossless systems are not input normal.

In [10], we give a different representation of Hessenberg and triangular input pairs by treating as a projection of a product of Givens rotations. The multiplication count for these Givens product representations is , which for large is twice as large as the band fraction representations. If Householder transformations are used, the multiplication count is again asymptotically. The Givens product representations are a MIMO generalization of the cascade architecture. This approach may be more natural for multivariate case. In the case, the band fraction representation has the advantage that the eigenvalues of are parameters of the system. This eigenvalue parameterization is convenient when the eigenvalues are prespecified or adaptively estimated.

The Givens product representations in [10] may be less sensitive to roundoff error since they are orthogonal matrices [12]. The computational advantage of the band fraction representation is that fast numerical algorithms for band matrix inversion and multiplication are readily available.

## 6 Fast System Identification

The band fraction representations may be used for the rapid identification of impulse responses. We remain in the framework where a fixed basis of orthogonal basis functions are given as described in [12, 19, 13, 17, 15, 14] and summarized in the introduction. We then use the band fraction representation of Sections 3-4, so is given. The state vector evolves according to (1.3

). We effectively compute the second moment matrices,

and , where is the forgetting factor. The unknown coefficients, , using recursive least squares (RLS). At each time step, we reestimate by solving . This is the normal equations for the least squares estimate of . The th component of represents the th orthonormal transfer function, , applied to the sequence . The resulting estimate of the the th lead of the impulse response is . To solve for ., we recommend using a QR update of the square root of normal equations [16]. Even faster methods are available that use the displacement rank structure of [9, 16]. The interested reader can consult [6, 14] for a comprehensive description of adaptive estimation.

The input normal filter representations are advantageous for many reasons: First, constant , when the ‘true” state space model is used and the forcing noise is white. Thus the regression for is well-conditioned. Similarly, IN filter structures are resistant to roundoff error [12]. Thus IN representations will time asymptotically satisfy the ansatz need by least mean squares (LMS) identification algorithms . This leads to a second advantage of IN filters: Gradient algorithms such as the least mean squares (LMS) algorithm often perform well enough in certain applications to obviate the need for more complicated and computationally intensive RLS algorithms. Third, an update of is possible [9]. Finally, when the advance matrix is triangular, the IN pairs form nested families: is TIN for when is TIN. This nesting may be used in adaptive order selection.

These are advantages of all TIN filter representations. For single input pairs, The advantages of the band fraction representation over the cascade representation are primarily numerical, the faster computational speed ( multiplications versus or more) and the availability of fast linear algebra software for banded matrices.

## 7 Summary

We have shown that TIN pairs generically have a matrix fraction representation, , where and have lower bandwidth . This structure allows fast state space updates, , and fast solution of matrix equations: . The total operations count is multiplications. We believe that our work is the first study of system representations, where is a matrix fraction of banded matrices, .

For single input filter design, our representations are parameterized explicitly in terms of the eigenvalues, . When is bidiagonal, solving requires only multiplications and may be implemented as a systolic array with order independent latency. As showed in Theorem 3.3, if the eigenvalues are ordered in ascending magnitude, then the matrix inversion is well-conditioned and suitable for fixed point operations. For system identification, the covariance of tends to the identity matrix when the forcing noise is white. Thus least squares regression is both well-conditioned and computationally fast. Since is triangular, one can compute the evolution of all embedded models of dimension simply by projecting the model of dimension onto the first coordinates. Thus the triangular structure simplifies the evaluation of this nested family of models for model order selection.

## References

• [1] G. H. Golub and C. F. Van Loan, Matrix Computations, third edition, Baltimore: John Hopkins University Press, 1996.
• [2] P.S.C. Heuberger, P.M.J. Van den Hof, and O.H. Bosgra, ‘A generalized orthonormal basis for linear dynamical systems,” IEEE Trans. Aut. Cont., vol. 40, pp. 451-465, 1995.
• [3] N. J. Higham, Accuracy and Stability of Numerical Algorithms, Philadelphia: SIAM Press, 1996.
• [4] R. A. Horn and C. R. Johnson, Matrix analysis, Cambridge: Cambridge University Press, 1985.
• [5] E.  Anderson, et al. LAPACK User’s Guide Philadelphia: SIAM Press, 1999.
• [6] L. Ljung and T. Söderström, Theory and practice of recursive identification, Cambridge, MA: The MIT Press, 1983.
• [7] T. McKelvey and A. Helmersson, ‘State-space parameterizations of multivariate linear systems using tridiagonal matrix forms,” Proc. 35th IEEE Conf. on Decision and Control, pp. 1666-1671, IEEE Press, 1996.
• [8]

B. Moore, ‘Principal components analysis in linear systems: controllability, observability, model reduction,”

IEEE Trans. Aut. Cont., vol. 26, pp. 17-32.
• [9] A. Mullhaupt and K. S. Riedel, ‘Fast identification of innovations filters,” IEEE Trans. Signal Processing, vol. 45, pp. 2616-2619, 1997.
• [10] A. Mullhaupt and K. S. Riedel, ‘Hessenberg and Schur output normal pair representations,” submitted.
• [11] A. Mullhaupt and K. S. Riedel, ‘Bounds on the condition number of solutions of the Stein equation,” submitted.
• [12] R. A. Roberts and C. T. Mullis, Digital Signal Processing, Addison Wesley, Reading, MA, 1987.
• [13] B. Ninness and F. Gustafsson, ‘A unifying construction of orthonormal bases for system identification,” IEEE Trans. Aut. Cont. vol. 42, pp. 515-521, 1997.
• [14] B. Ninness, ‘The utility of orthonormal bases in system identification,” Technical Report 9802, Dept. of EECE, University of Newcastle, Australia, 1998.
• [15] B. Ninness, H. Hjalmarsson and F. Gustafsson, ‘The fundamental role of orthonormal bases in system identification,” IEEE Trans. Aut. Cont. vol. 44, pp. 1384-1407, 1999.
• [16] A. H. Sayed and T. Kailath, ‘A state-space approach to adaptive RLS filtering,” IEEE Signal Processing Magazine, vol. 11, pp. 18-60, 1994.
• [17] P.M.J. Van den Hof, P.S.C. Heuberger, and J. Bokor, ‘System identification with generalized orthonormal basis functions,” Automatica, vol. 31, pp. 1821-1831, 1995.
• [18] A.J. van der Veen and M. Viberg, ‘Minimal continuous state-space parametrizations,” In Proc. Eusipco, Trieste (Italy), pp.523-526, 1996.
• [19] B. Wahlberg, ‘System identification using Laguerre models,” IEEE Trans. Aut. Cont., vol. 36, pp. 551-562, 1991.