# Orthogonal Representations for Output System Pairs

A new class of canonical forms is proposed in which (A, C) is in Hessenberg observer or Schur form and output normal: I - A^*A = C^*C. Here, C is the d × n measurement matrix and A is the advance matrix. The (C, A) stack is expressed as the product of n orthogonal matrices, each of which depends on d parameters. State updates require only O(nd) operations, and derivatives of the system with respect to the parameters are fast and convenient to compute. Restrictions are given such that these models are generically identifiable. Since the observability Grammian is the identity matrix, system identification is better conditioned than for other classes of models with fast updates.

## 1 Introduction

Canonical forms are important in system identification, where a unique representation is desired to avoid identifiability problems [11, 19]. We consider the system , where is a real matrix, is a real matrix, and is a real matrix with . Our systems are output normalized:

$$A^* A = I_n - C^* C . \tag{1.1}$$

In the next section, we describe the advantages of using an input normal or an output normal (ON) representation.

We consider real stable output pairs and show that a real output pair representation always exists where is simultaneously output normal and in Hessenberg observer form or real Schur form. We give explicit parameterizations of the stack as a product of orthogonal matrices of the form:

$$\begin{pmatrix} C \\ A \end{pmatrix} = \left[ \prod_{i=1}^{nd} G_{j(i),k(i)}(\theta_i) \right]_{1:(n+d),\,1:n} \tag{1.2}$$

and related variants. Here is a Givens rotation in as defined at the end of this section. Our representations include the banded orthogonal filters of [18] as a special case (under the duality map , ).

These orthogonal product representations are parameterized by the minimal number of free parameters and have no coordinate singularities. Our representations allow fast state updates in $O(nd)$ operations, and derivatives of the system with respect to the parameters are fast and convenient to compute.

Our results consider only the output pair, , and are independent of . Thus and may be treated as linear parameters in system identification or system synthesis (in contrast to (2.5)) and chosen separately from the parameters of and . In particular, the elements of

may be estimated with pseudo-linear regression. Corresponding controller representations exist for input pairs,

.

Any stable observable output pair may be transformed into one of our representations by the following three-step process. First, we transform the output pair to output normal form using the Cholesky factor of the solution of (2.2). Second, we orthogonally transform the output normal pair to any of the three major output forms: Schur form, Hessenberg observer form, and observer triangular system form as defined in Section 3. Finally, we perform a series of Givens rotations to show that the transformed system must be of the form given by (1.2).

The final representations are given in Theorems 6.1 and 7.1. For statistical estimation and numerical implementations, it is highly desirable to eliminate redundancy in the parameterization when possible. We address redundancy in two ways. First, we categorize when two distinct ON pairs in Hessenberg observer form are equivalent. Second, we impose constraints on the parameters in (1.2) to eliminate redundant parameterizations of the same ON pair generically. We repeat this analysis for Schur form and for observer triangular system form.

In Section 2, we give a brief overview of the advantages of input normal and output normal form. In Section 3, we give the basic definitions and show every output pair is similar to an output pair in Schur ON form, to a Hessenberg ON pair and to an ON pair in triangular system form. In Section 4, we show that after standardization these Hessenberg ON systems uniquely parameterize transfer functions for generic systems. When is reducible, we find an orthogonal transformation that preserves the output normal property. To construct the orthogonal product representations of the stack, we need families of orthogonal matrices such as the set of Householder matrices. In Section 5, we give a general definition of orthogonal reduction families that includes Householder and Givens representations. In Section 6 and Section 7, we give explicit orthogonal product representations of Hessenberg output normal pairs.

Notation: The $n \times n$ identity matrix is $I_n$ and is the unit vector in the th coordinate. By , we denote the subblock of from row to row and from column to column . We abbreviate by . The matrix has upper bandwidth if when . A $j \times k$ matrix of zeros is denoted by $0_{j,k}$. The direct sum of matrices is denoted by $\oplus$. We denote the matrix transpose of $A$ by $A^*$ with no complex conjugation since we are interested in the real system case.

We denote the Givens rotation in the th and th coordinate by i.e. , and otherwise, where are the elements of . The symbol denotes a signature matrix: .
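As a minimal sketch of the rotation just described, the following NumPy helper builds a plane rotation in two chosen coordinates. The function name `givens`, the 1-based indices, and in particular the placement of the minus sign are assumptions made for illustration, since the entrywise definition did not survive in the text above.

```python
import numpy as np

def givens(size, j, k, theta):
    """Return the size x size rotation by theta in the (j, k) coordinate plane.

    Indices j and k are 1-based to match the text. The sign convention
    (+sin in entry (j, k), -sin in entry (k, j)) is an assumption.
    """
    g = np.eye(size)
    c, s = np.cos(theta), np.sin(theta)
    g[j - 1, j - 1] = c
    g[k - 1, k - 1] = c
    g[j - 1, k - 1] = s
    g[k - 1, j - 1] = -s
    return g

G = givens(4, 1, 3, 0.3)
print(np.allclose(G.T @ G, np.eye(4)))  # a rotation is orthogonal
```

All entries outside rows and columns `j` and `k` agree with the identity, so multiplying a vector by such a matrix touches only two coordinates.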

Two systems and are similar (equivalent) when , and for some invertible . They are orthogonally equivalent if is a real orthogonal matrix.

## 2 Representations and Condition Numbers

The goal of this paper is to propose system representations that are both well conditioned for system identification and are fast and convenient for numerical computation. We briefly discuss these issues in the context of existing alternative system representations. For more complete analysis of conditioning in system identification, we refer the reader to [16].

Let be stable, observable and controllable. We define the observability Grammian, $P_{A^*,C^*}$, and the controllability Grammian, $P_{A,B}$, by

$$P_{A,B} - A P_{A,B} A^* = B B^* , \tag{2.1}$$

$$P_{A^*,C^*} - A^* P_{A^*,C^*} A = C^* C . \tag{2.2}$$

A popular class of system representations is balanced systems [12, 19, 20], where both the observability Grammian and the controllability Grammian are simultaneously diagonal: . Balanced representations have many desirable theoretical properties. However, existing parameterizations of balanced models require operations to update the state space system.

An alternative to balanced models is output normal (ON) representations [13, 14], where the observability Grammian is required to be the identity matrix, but no structure is imposed on the controllability Grammian.

###### Definition 2.1

An output pair, , is output normal (ON) if and only if (1.1) holds. An input pair, , is input normal (IN) if and only if

$$A A^* = I_n - B B^* . \tag{2.3}$$

If is stable, Definition 2.1 is equivalent to $P_{A^*,C^*} = I_n$ for output normal and $P_{A,B} = I_n$ for input normal. In [19], Ober shows that stability plus a positive definite solution to the dual Stein equation, (2.2), implies that the output pair is observable. By Theorem 2.1 of [2], if the observability Grammian is positive definite and is observable, then the output pair is stable. Thus for ON pairs, stability is equivalent to observability.

ON pairs are not required to be stable or observable. (From (1.1), must be at least marginally stable.) In [12], “output normal” has a more restrictive definition of (1.1) and the additional requirement that the controllability Grammian be diagonal. We do not impose any such condition on the controllability Grammian. In [13], we called condition (1.1) “output balanced”, whereas now we call (1.1) “output normal.” We choose this language so that “normal” denotes restrictions on only one Grammian while “balanced” denotes simultaneous restrictions on both Grammians.

A measure of ill-conditioning in system identification is the condition number of , the largest singular value of divided by the smallest. In [16], we show that solving the Stein equation, is exponentially ill-conditioned in for large classes of pairs; i.e.  for some . To avoid the possibility of ill-conditioning, we prefer to consider representations where either the observability or the controllability Grammian is the identity.

Let be the Grammian of the balanced system equivalent to . In [9], it is shown that

$$\kappa(\Sigma_{A,B,C})^2 \le \kappa(P_{A,B}) \, \kappa(P_{A^*,C^*}) , \tag{2.4}$$

where equality holds for balanced systems, input normal systems and output normal systems. For output normal systems, the ill-conditioning is entirely in the controllability Grammian: . We interpret as the intrinsic conditioning of a linear time invariant (LTI) system and as a measure of the excess ill-conditioning of a system representation.
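The inequality (2.4) can be checked numerically on a small example. The sketch below is illustrative only: the helper `solve_stein`, the random test system, and its dimensions are assumptions, and the Stein equations (2.1) and (2.2) are solved by dense vectorization, which is practical only for small state dimension.

```python
import numpy as np

def solve_stein(a, q):
    """Solve P - a P a^T = q by vectorization (adequate for small n only)."""
    n = a.shape[0]
    lhs = np.eye(n * n) - np.kron(a, a)      # row-major vec of a P a^T
    return np.linalg.solve(lhs, q.reshape(-1)).reshape(n, n)

rng = np.random.default_rng(0)
n, m, d = 3, 2, 2                             # hypothetical dimensions
A = rng.standard_normal((n, n))
A *= 0.8 / np.max(np.abs(np.linalg.eigvals(A)))   # scale spectral radius to 0.8
B = rng.standard_normal((n, m))
C = rng.standard_normal((d, n))

P_ctrl = solve_stein(A, B @ B.T)              # controllability Grammian, (2.1)
P_obs = solve_stein(A.T, C.T @ C)             # observability Grammian, (2.2)

# Squared Hankel singular values are the eigenvalues of P_ctrl @ P_obs.
hankel_sq = np.sort(np.linalg.eigvals(P_ctrl @ P_obs).real)
kappa_sigma_sq = hankel_sq[-1] / hankel_sq[0]
bound = np.linalg.cond(P_ctrl) * np.linalg.cond(P_obs)
print(kappa_sigma_sq, bound)                  # left and right sides of (2.4)
```

For a generic pair that is neither balanced nor normalized, the right-hand side is expected to exceed the left-hand side, and the gap is the excess ill-conditioning of the representation.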

Our representations resemble those based on embedded lossless systems [5, 22, 23]:

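$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} = P_1 \left[ \prod_{i=1}^{f_1} G_{k(i),m(i)}(\theta_i) \right] P_2 \left[ \prod_{j=f_1+1}^{f_2} G_{k(j),m(j)}(\theta_j) \right] P_3 , \tag{2.5}$$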

where number of free parameters, and are projections onto coordinate directions and is a prescribed permutation. In [23], the full system is first embedded in a lossless system (just as we transform the output pair to output normal form). Next, these authors transform to Hessenberg controller form (analogous to our transformation to Hessenberg observer form). We conjecture that there are analogous versions of (2.5), where is in Schur form or is in controller triangular system form. Finally, the authors perform a series of Givens rotations to show that the transformed system must be of the form given by (2.5). Our corresponding representations are given in Theorems 6.1 and 7.1.

The main advantage of (1.2) over (2.5) is that the observability Grammian of ON models does not inflate the product condition number: . A second advantage is that and may be treated as linear parameters in system identification or system synthesis, whereas (2.5) couples the parameterization of and to that of and in a nonlinear fashion. For these reasons, we recommend output normal representations over embedded lossless representations.

Another difference between our treatment and the analyses of [5, 22, 23] is that we try to impose constraints on the parameters to eliminate redundant representations whenever possible and to categorize when redundant representations can occur. If one is satisfied with having representations with a finite multiplicity of equivalent systems (at least generically), this last step may be too detailed. For numerical implementations, we believe that it is highly desirable to eliminate as much of the redundancy in representation as is possible.

Our representations include the banded orthogonal filters of [18] as a special case. Our analysis imposes additional constraints on the representations of [18] to remove multiple representations of the same transfer function generically.

## 3 Definitions and Existence

We now define observer triangular system form, Schur form and Hessenberg observer form and show that any stable observable output pair is equivalent to an output normal pair in any of these three forms. We denote the matrix stack of and by :

$$Q \equiv \begin{pmatrix} C \\ A \end{pmatrix} . \tag{3.1}$$

###### Definition 3.1

The output pair is in observer triangular system (OTS) form if the stack, , satisfies for . The output pair is unreduced if and is reducible if for some . The output pair is in standard OTS form if and in strict OTS form if .

Thus strict is equivalent to unreduced and standard. The real Schur representation is defined and described in [4, 6, 7]. The diagonal subblocks of may be placed in an arbitrary order. To ensure identifiability of our model, we must specify a particular standardization of the diagonal of the Schur form of . Our choice, “ordered qd” Schur form, is defined in Appendix A.

The OTS form includes the banded orthogonal filters of [18] as a special case under the duality map , . Our results correspond to a detailed analysis of the generic identifiability of the representations of [18].

Hessenberg observer (HO) form is a canonical form where is Hessenberg. We impose the additional restriction that , for .

###### Definition 3.2

The output pair is in Hessenberg observer (HO) form if is a Hessenberg matrix and for . A HO output pair is nondegenerate if . A HO output pair is unreduced if for and . A HO output pair is standard if for , . A HO output pair is strict if it is unreduced and standard. A HO output pair is in partial ordered Schur qd block form if implies is in ordered Schur qd block form.

Both Hessenberg observer output pairs and observer triangular system output pairs can always be transformed to a standard output pair using a signature matrix, : . Generically, HO output pairs are unreduced and thereby unaffected by the requirement of partial Schur order. For both OTS form and HO form, the matrix is unspecified. Dual definitions for controller forms reverse the roles of and . In our definitions, for , an OTS output pair has a lower Hessenberg matrix.

An important result in systems representation theory is

###### Theorem 3.3

[10, 24] Any observable output pair is orthogonally equivalent to a system in real Schur form, to a system in observer triangular system form and to a system in Hessenberg observer form. The Hessenberg observer form can be chosen in partial ordered Schur qd block form.

The standard proof of Theorem 3.3 begins by transforming to its desired form and then defines Householder or Givens rotations which zero out particular elements in in successive rows or columns [7].

###### Definition 3.4

An output pair, , is observer triangular system output normal (OTSON) if it is in observer triangular system form and output normal. The output pair is Hessenberg observer output normal (HOON) if it is in Hessenberg observer form and output normal. The output pair is in Schur ON form if it is output normal and is in real Schur form.

###### Theorem 3.5

Every stable, real observable output pair , is similar to a real OTSON pair, to a real HOON pair, and to an ordered real Schur output pair with qd diagonal subblocks.

Proof:  The unique solution, , of the dual Stein equation, (2.2), is strictly positive definite. Let be the unique Cholesky lower triangular factor of with positive diagonal entries: . We set . Let be the orthogonal transformation that takes to the desired form (Schur, OTS or HO) as described in [10, 24]. Then is the desired transformation.

This result applies to any output pair with a positive definite solution to the dual Stein equation, (2.2). Observability and stability of are sufficient but not necessary conditions for a positive definite solution.
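The first step of the proof can be sketched numerically as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the dense vectorized solver, and the random test pair are hypothetical, and the state transformation is taken to be the transpose of the Cholesky factor, which is what the algebra of (2.2) requires for the transformed pair to satisfy (1.1).

```python
import numpy as np

def solve_dual_stein(a, c):
    """Solve P - a^T P a = c^T c by vectorization (a sketch for small n)."""
    n = a.shape[0]
    lhs = np.eye(n * n) - np.kron(a.T, a.T)
    p = np.linalg.solve(lhs, (c.T @ c).reshape(-1)).reshape(n, n)
    return 0.5 * (p + p.T)                 # symmetrize against round-off

def to_output_normal(a, c):
    """Return a similar pair (a_on, c_on) with a_on^T a_on + c_on^T c_on = I."""
    p = solve_dual_stein(a, c)
    low = np.linalg.cholesky(p)            # p = low @ low.T, low lower triangular
    t = low.T                              # assumed state transformation
    t_inv = np.linalg.inv(t)
    return t @ a @ t_inv, c @ t_inv

rng = np.random.default_rng(1)
n, d = 4, 2                                # hypothetical dimensions
A = rng.standard_normal((n, n))
A *= 0.8 / np.max(np.abs(np.linalg.eigvals(A)))   # make A stable
C = rng.standard_normal((d, n))
A_on, C_on = to_output_normal(A, C)

# Any orthogonal change of basis keeps the pair output normal.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A_rot, C_rot = U.T @ A_on @ U, C_on @ U
print(np.allclose(A_rot.T @ A_rot + C_rot.T @ C_rot, np.eye(n)))
```

The last lines illustrate why the second step of the proof is free to use any orthogonal transformation: it changes the form of the pair without disturbing (1.1).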

Degenerate HOON pairs correspond to the direct sum of an identity matrix and a nondegenerate HOON system:

###### Lemma 3.6

Every stable, real observable output pair , is similar to a real HOON pair with a stack of the form for some , where is a nondegenerate HOON stack.

Thus we consider only HOON systems that are nondegenerate. Note that degenerate Hessenberg controller forms are excluded from [23] by their assumptions. If the HO pair is reducible, then it may be further simplified using orthogonal transformations as described in Theorem 4.4.

## 4 Uniqueness of Strict HOON and OTSON Representations

There are two main ways in which one of our system representations can fail to parameterize linear time invariant systems in a bijective fashion. First, there may be a multiplicity of equivalent HOON systems (or OTSON systems or Schur ON systems). Second, Givens product representations such as (1.2) may have multiple (or no) parameterizations of the same output pair.

For Schur ON pairs, the basic result is straightforward. If $A$ has distinct eigenvalues and they are ordered in a unique fashion, then there is a parameterization that is globally bijective.

Each strict OTSON (HOON) pair generates distinct but equivalent OTSON (HOON) pairs using different signature matrices. If the OTS pair or the HO pair is reducible, then it may be further simplified using orthogonal transformations. For the HO pair, these reductions are described in Theorem 4.4. The representations of degenerate HO pairs reduce to a direct sum of a “nondegenerate” HOON system and a trivial system and thus we consider only nondegenerate HOON pairs.

For OTSON pairs and nondegenerate HOON pairs, we find that the set of strict output pairs has a bijective representation in an easy to parameterize subset of Givens product representations. Our precise OTSON result is

###### Theorem 4.1

If is a strict OTSON pair, then there are no other equivalent strict OTSON pairs.

This result, and a generalization to reducible OTSON pairs, are proven in [17]. For HOON representations, our uniqueness results are based on the following lemma that generalizes the Implicit Q theorem [4, 7] to HOON pairs:

###### Lemma 4.2

Let and be equivalent standard nondegenerate HOON pairs (). Let , and for , then , where is an orthogonal matrix. Furthermore, and .

Since , , . The result follows from the Implicit Q theorem [4, 7].

###### Corollary 4.3

If is a strict nondegenerate HOON pair, then there are no other equivalent strict HOON pairs.

For reducible HOON pairs, we place the lower part of in ordered Schur qd block form to remove redundant representations:

###### Theorem 4.4

Let be a nondegenerate HOON pair with for and define . There exists an equivalent HOON pair , (), where is in partial ordered Schur qd block form. If has distinct eigenvalues, is uniquely defined.

Proof:  By results cited in Appendix A, there exists an orthogonal transformation, , such that is in partial ordered Schur qd block form. From Lemma 4.2, is the desired transformation and it is unique when has distinct eigenvalues.

## 5 Orthogonal Families

We rewrite (1.1) as , where is the matrix stack. Thus is the first columns of the product of orthogonal matrices. We parameterize each of the matrices with parameters for a Householder transformation or Givens rotations. We denote the group of orthogonal matrices by .

Our basic building block is a dimensional parameterization of these orthogonal reduction transformations. Here is the -dimensional parameter vector.

###### Definition 5.1

An orthogonal reduction parameterization (ORP) of to is a dimensional family of orthogonal matrices such that for every vector, , there exists a unique such that . A family of orthogonal matrices is an unsigned orthogonal reduction parameterization (ORP) of to if for every vector, , there exists a unique such that is in the direction.

Unsigned ORPs require that while standard ORPs require that . The th column of is equal to . Thus may be determined by the th column of . For OTSON representations, we will use ORPs of to . For HOON representations, we will use ORPs of to and ORPs of to .

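The traditional vector reduction families are the set of Householder transformations and families of Givens transformations. For ORPs from to , the two traditional Givens ORPs are

$$\tilde{\mathcal{Q}}_1 = \left\{ \tilde Q(\theta) = G_{1,d+1}(\theta_d) \, G_{1,d}(\theta_{d-1}) \cdots G_{1,2}(\theta_1) \right\} , \tag{5.1}$$

$$\tilde{\mathcal{Q}}_2 = \left\{ \tilde Q(\theta) = G_{d,d+1}(\theta_d) \, G_{d-1,d}(\theta_{d-1}) \cdots G_{2,3}(\theta_2) \, G_{1,2}(\theta_1) \right\} . \tag{5.2}$$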

For both and , we restrict the Givens angles: for and . The rightmost Givens rotation has twice the angular domain since it is used to make positive.
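To make the role of the angular restrictions concrete, here is a sketch of how the angles of a family like (5.1) could be computed for a given vector. It assumes a particular rotation sign convention and assumes that the reduction target is the first coordinate direction with a positive entry; the function name `reduce_to_e1` is hypothetical.

```python
import numpy as np

def reduce_to_e1(v):
    """Find angles so that rotations in planes (1,2), (1,3), ..., applied in
    that order, map v to ||v|| e_1. Returns (angles, reduced vector)."""
    w = np.array(v, dtype=float)
    thetas = []
    for k in range(1, w.size):
        theta = np.arctan2(w[k], w[0])     # angle that zeroes coordinate k+1
        c, s = np.cos(theta), np.sin(theta)
        w[0], w[k] = c * w[0] + s * w[k], -s * w[0] + c * w[k]
        thetas.append(theta)
    return np.array(thetas), w

thetas, w = reduce_to_e1([-3.0, 4.0, 0.0, 12.0])
print(w)        # all of the norm ends up in the first coordinate
print(thetas)   # only the first angle can leave [-pi/2, pi/2]
```

After the first rotation the leading entry is nonnegative, so every later angle falls in a half-width interval. This matches the remark that the rightmost rotation needs twice the angular domain.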

Let be an ORP from to with the block representation:

$$\tilde Q(\theta) = \begin{pmatrix} \mu & y^* \\ x & \tilde O \end{pmatrix} , \tag{5.3}$$

where is a scalar and and are -vectors. The orthogonality of implies , , and . Thus is invertible if .
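For reference, the block orthogonality relations can be written out directly from the partition (5.3). The following is a worked restatement derived from $\tilde Q^* \tilde Q = I$ and $\tilde Q \tilde Q^* = I$, assuming $\tilde Q$ is $(d+1) \times (d+1)$; it is not quoted from the original.

$$\mu^2 + x^* x = 1 , \qquad \mu\, y + \tilde O^* x = 0 , \qquad y\, y^* + \tilde O^* \tilde O = I_d ,$$

$$\mu^2 + y^* y = 1 , \qquad \mu\, x + \tilde O\, y = 0 , \qquad x\, x^* + \tilde O\, \tilde O^* = I_d .$$

Since $\tilde O^* \tilde O = I_d - y\, y^*$ has eigenvalues $1$ (with multiplicity $d-1$) and $1 - y^* y = \mu^2$, it follows that $\det(\tilde O)^2 = \mu^2$, so $\tilde O$ is invertible exactly when $\mu \neq 0$.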

We embed in the space of matrices.

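$$Q^{(k)}(\theta) = I_{k-1} \oplus \begin{pmatrix} \mu_k & 0_{1,n-k} & y_k^* \\ 0_{n-k,1} & I_{n-k} & 0_{n-k,d} \\ x_k & 0_{d,n-k} & \tilde O_k \end{pmatrix} , \tag{5.4}$$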

where , , and are subblocks of (5.3). For the Givens rotations of class , we have

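$$Q^{(k)}(\theta) = G_{k,n+d}(\theta_d) \, G_{k,n+d-1}(\theta_{d-1}) \cdots G_{k,n+1}(\theta_1) . \tag{5.5}$$

An ORP from to is

$$\tilde{\mathcal{Q}}_3 = \left\{ \tilde Q(\theta) = G_{1,2}(\theta_1) \, G_{2,3}(\theta_2) \cdots G_{d-2,d-1}(\theta_{d-2}) \, G_{d-1,d}(\theta_{d-1}) \right\} , \tag{5.6}$$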

where now the angular restrictions are for and .

## 6 OTSON Representations

The key to our OTSON representation is the recognition that the stack is column orthonormal. These results include the representation of [18] as an important special case. Our fundamental representation for OTSON pairs is

###### Theorem 6.1

Every real OTSON pair has the representation:

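$$\begin{pmatrix} C \\ A \end{pmatrix} = Q^{(n)}(\theta_n) \, Q^{(n-1)}(\theta_{n-1}) \cdots Q^{(2)}(\theta_2) \, Q^{(1)}(\theta_1) \begin{pmatrix} I_n \\ 0_{d,n} \end{pmatrix} \tag{6.1}$$

for some set of -vectors . Here the are given by (5.4), where , , and are subblocks of an ORP of to .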

We successively determine , , . At the th stage, is determined to zero out the of the nonzero entries in the th column. By orthogonality the other entries in the th row must be zero.

Proof:  We determine so that . By orthonormality, . Let be the stack and set

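$$\Omega^{(k)}(\theta_k, \theta_{k+1}, \ldots, \theta_n) \equiv Q^{(k)*}(\theta_k) \, Q^{(k+1)*}(\theta_{k+1}) \cdots Q^{(n)*}(\theta_n) \begin{pmatrix} C \\ A \end{pmatrix} . \tag{6.2}$$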

Assume that has its last columns satisfying . Since has orthonormal columns, . Select such . Then for and therefore the last columns satisfy .

For the Givens rotations of class , we have

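$$\begin{pmatrix} C \\ A \end{pmatrix} = G_{n,n+d}(\theta_{n,d}) \, G_{n,n+d-1}(\theta_{n,d-1}) \cdots G_{n,n+1}(\theta_{n,1}) \cdots G_{1,n+d}(\theta_{1,d}) \cdots G_{1,n+1}(\theta_{1,1}) \begin{pmatrix} I_n \\ 0_{d,n} \end{pmatrix} \tag{6.3}$$

We now show that every matrix of the form given in the right-hand side of (6.1) is an OTSON matrix. We define

$$\Gamma^{(k)} \equiv Q^{(k)}(\theta_k) \, Q^{(k-1)}(\theta_{k-1}) \cdots Q^{(1)}(\theta_1) . \tag{6.4}$$
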
###### Lemma 6.2

Let have the structure given by (5.3) and (5.4), then has the structure:

$$\Gamma^{(k)} \equiv \begin{pmatrix} L_k & 0_{k,n-k} & N_k \\ 0_{n-k,k} & I_{n-k} & 0_{n-k,d} \\ M_k & 0_{d,n-k} & P_k \end{pmatrix} , \tag{6.5}$$

where is a lower triangular matrix and the following recurrence relations hold:

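$$L_k = \begin{pmatrix} L_{k-1} & 0 \\ y_k^* M_{k-1} & \mu_k \end{pmatrix} , \qquad N_k = \begin{pmatrix} N_{k-1} \\ y_k^* P_{k-1} \end{pmatrix} , \tag{6.6}$$

$$M_k = \begin{pmatrix} \tilde O_k M_{k-1} & x_k \end{pmatrix} , \qquad P_k = \tilde O_k P_{k-1} . \tag{6.7}$$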

Proof:  Assume (6.5) for and multiply .

Lemma 6.2 does not use the fact that is orthogonal. Lemma 6.2 is a special case of a more general theory of matrix subblock products [15].

The last rows of may be rewritten as

 (6.8)

Lemma 6.2 implies that is lower triangular and thus corresponds to the stack of an observer Hessenberg system stack:

###### Corollary 6.3

Every stack of the form (6.1) is an OTSON pair when the are orthogonal matrices satisfying (5.4).

A parameterization of state space models is identifiable when only one parameter vector corresponds to each transfer function; i.e. the map from parameters to input-output behavior is injective. We now show that the mapping between standard OTSON pairs and orthogonal product representation given in Theorem 6.1 is one-to-one and onto.
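The corollary can be illustrated numerically by forming the Givens product of (6.3) densely and inspecting the first columns. The helper names, the rotation sign convention, and the angle layout `thetas[k-1, i-1]` are assumptions made for this sketch; a practical implementation would not form the full matrix.

```python
import numpy as np

def givens(size, j, k, theta):
    """Rotation by theta in the (j, k) plane, 1-based indices (sign assumed)."""
    g = np.eye(size)
    c, s = np.cos(theta), np.sin(theta)
    g[j - 1, j - 1] = g[k - 1, k - 1] = c
    g[j - 1, k - 1] = s
    g[k - 1, j - 1] = -s
    return g

def otson_stack(thetas):
    """Build the (n+d) x n stack from an (n, d) array of angles."""
    n, d = thetas.shape
    gamma = np.eye(n + d)
    for k in range(1, n + 1):              # Q^(1) is the rightmost factor
        for i in range(1, d + 1):          # within Q^(k), G_{k,n+1} is rightmost
            gamma = givens(n + d, k, n + i, thetas[k - 1, i - 1]) @ gamma
    return gamma[:, :n]                    # keep the first n columns

rng = np.random.default_rng(3)
n, d = 4, 2                                # hypothetical dimensions
Q = otson_stack(rng.uniform(-np.pi / 2, np.pi / 2, size=(n, d)))
C, A = Q[:d], Q[d:]                        # C is the top d rows of the stack
print(np.allclose(A.T @ A + C.T @ C, np.eye(n)))   # output normal, (1.1)
print(np.allclose(np.triu(Q, 1), 0.0))             # stack is lower triangular
```

Both checks hold for any choice of angles, which is the content of the corollary: the structure comes from the ordering of the rotation planes, not from the angle values.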

###### Theorem 6.4

Let each be an embedding of an ORP of to as given by (5.3). Then there is a one-to-one correspondence between strict OTSON pairs and the orthogonal product parameterization of Lemma 6.2 with . There is a one-to-one correspondence between unreduced OTSON pairs and the orthogonal product parameterization restricted to .

Proof:  Theorem 6.1 shows that every OTSON pair has such a representation. From (6.7), we represent the last rows of as . For , may be determined and inverted and is determinable from . We let vary over . Thus the mapping of OTSON pairs into the product ORP representation is onto.

For our parameterization of output pairs to be truly identifiable, we need to restrict our parameter space, , such that no two output pair representations, and , are equivalent. We prefer to restrict our parameterizations to . This set has redundant representations only when at least one .

## 7 Hessenberg Observer Output Normal Form

In this section, we give representation results for HOON pairs. The first row of satisfies , for . We do not transform this row and treat as a free parameter. We use Givens rotations to zero out the lower diagonal of and the row through row of . For each column of the stack, we use Givens rotations except for the final column which requires only .

We embed orthogonal reduction parameterizations of to into the space of matrices. We define in dimensional matrices :

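$$V^{(k)}(\theta) = \begin{pmatrix} \tilde O_k & 0_{d,k-1} & x_k \\ 0_{k-1,d} & I_{k-1} & 0_{k-1,1} \\ y_k^* & 0_{1,k-1} & \mu_k \end{pmatrix} \oplus I_{n-k-1} , \tag{7.1}$$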

for . Here , are -vectors. Thus alters only the rows and row . We require that the orthogonal matrix,

$$\tilde V(\theta) \equiv \begin{pmatrix} \tilde O_k & x_k \\ y_k^* & \mu_k \end{pmatrix} \tag{7.2}$$

be a member of an ORP from to . For , we define

$$V^{(n)}(\theta) = \tilde V_d(\theta) \oplus I_{n-1} , \tag{7.3}$$

where is an ORP from to . Thus are -vectors while is a -vector. Our parameterization of HOON pairs uses a scalar, and . We denote the bottom rows of by .

###### Theorem 7.1

Every real nondegenerate HOON pair has the representation:

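$$\begin{pmatrix} \hat C \\ A \end{pmatrix} = V^{(1)}(\theta_1) \, V^{(2)}(\theta_2) \cdots V^{(n-1)}(\theta_{n-1}) \, V^{(n)}(\theta_n) \begin{pmatrix} 0_{d-1,n} \\ P(\gamma) \end{pmatrix} \tag{7.4}$$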

for some set of parameters, , with and . Here is the scaled permutation matrix: , for , , and otherwise. The are defined in (7.1)-(7.3) and are members of the appropriate ORPs.

Proof:  Let be the stack and set

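$$\Omega^{(k)}(\theta_1, \theta_2, \ldots, \theta_k) \equiv V^{(k)*}(\theta_k) \, V^{(k-1)*}(\theta_{k-1}) \cdots V^{(1)*}(\theta_1) \begin{pmatrix} \hat C \\ A \end{pmatrix} . \tag{7.5}$$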

Assume that has its first columns satisfying , where and for . Since has orthonormal columns, . Select such . Then for , and therefore the first columns satisfying .

For and , (7.4) is the well-known expression of a unitary Hessenberg matrix as a product of Givens rotations [1]. To show that every matrix of the form given by the right-hand side of (7.4) is a HOON pair, we define

$$X^{(k)}(\theta_1, \theta_2, \ldots, \theta_k) \equiv V^{(1)}(\theta_1) \, V^{(2)}(\theta_2) \cdots V^{(k)}(\theta_k) . \tag{7.6}$$

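The classical factorization referred to here can be illustrated in the real case: a product of rotations in adjacent coordinate planes, taken in increasing order, is an orthogonal upper Hessenberg matrix. The sketch below assumes a rotation sign convention and uses hypothetical function names; it covers only this special case, not the general form (7.4).

```python
import numpy as np

def givens(size, j, k, theta):
    """Rotation by theta in the (j, k) plane, 1-based indices (sign assumed)."""
    g = np.eye(size)
    c, s = np.cos(theta), np.sin(theta)
    g[j - 1, j - 1] = g[k - 1, k - 1] = c
    g[j - 1, k - 1] = s
    g[k - 1, j - 1] = -s
    return g

def orthogonal_hessenberg(thetas):
    """Form G_{1,2}(t_1) G_{2,3}(t_2) ... G_{n-1,n}(t_{n-1})."""
    n = len(thetas) + 1
    h = np.eye(n)
    for k, theta in enumerate(thetas, start=1):
        h = h @ givens(n, k, k + 1, theta)
    return h

H = orthogonal_hessenberg([0.4, -1.1, 0.7, 0.2])
print(np.allclose(H.T @ H, np.eye(5)))      # orthogonal
print(np.allclose(np.tril(H, -2), 0.0))     # zero below the first subdiagonal
```

Each subdiagonal entry of the product depends on a single angle, which is why this family parameterizes Hessenberg matrices with one parameter per column.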
###### Lemma 7.2

Let have the structure given by (7.1) - (7.3), then has the structure:

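$$X^{(k)} \equiv \begin{pmatrix} N_k & H_k \end{pmatrix} \oplus I_{n-k-1} , \tag{7.7}$$

where is and is an upper triangular matrix and the following recurrence relations hold:

$$N_k = \begin{pmatrix} N_{k-1} \tilde O_k \\ y_k^* \end{pmatrix} , \qquad H_k = \begin{pmatrix} H_{k-1} & N_{k-1} x_k \\ 0_{1,k-1} & \mu_k \end{pmatrix} . \tag{7.8}$$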

This result follows from multiplying out the matrix product.

###### Corollary 7.3

Let have the structure given by (7.1) - (7.3) and let be the scaled permutation matrix. The right-hand side of (7.4) defines an HOON pair with and .

###### Theorem 7.4

Under the definitions of Theorem 7.1, there is a one to one correspondence between strict HOON pairs and the parameterization of Theorem 7.1 restricted to .

Proof:  From Lemma 7.2, and for . The first rows of as . Thus we can determine from and . The proof is now identical to the proof of Theorem 6.4.

## 8 Discussion

For each of the three output pairs, Schur ON form, observer triangular system ON and Hessenberg observer ON pairs, we have examined the uniqueness/identifiability of the representation in Section 4. We then express each of these output pairs in terms of an orthogonal product representation (OPR) as the product of orthogonal matrices involving a total of parameters (Theorems 6.1, 7.1). A similar representation is possible for Schur ON form [17]. We have shown how to place restrictions on the parameters such that the orthogonal product representations are in one-to-one correspondence with sets of generic transfer functions. For OTSON and HOON representations, we recommend restricting the Givens rotations in Theorems 6.1 and 7.1 such that . This set has redundant representations only when at least one .

In practice, these orthogonal product representations are implemented with either Givens rotations or Householder transformations. Our definition of ORPs allows us to treat all the standard cases similarly. We do not explicitly store or multiply by or . Instead we store only the Givens or Householder parameters and we perform the matrix multiplication implicitly. For an -vector , we compute and using the orthogonal product representation.
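A minimal sketch of the implicit multiplication for the OTSON Givens form (6.3) follows. Each angle costs one plane rotation on two entries of a work vector, so a full product with the stack costs a number of rotations equal to the number of angles. The function name, the angle layout, and the rotation sign convention are assumptions of this sketch.

```python
import numpy as np

def stack_times_vector(thetas, x):
    """Return (C x, A x) without forming C or A.

    thetas has shape (n, d); thetas[k, i] is the angle of the rotation in
    the plane of coordinates k+1 and n+i+1 (1-based), ordered as in (6.3).
    """
    n, d = thetas.shape
    v = np.concatenate([np.asarray(x, dtype=float), np.zeros(d)])  # (I_n; 0) x
    for k in range(n):                     # rightmost factors act first
        for i in range(d):
            c, s = np.cos(thetas[k, i]), np.sin(thetas[k, i])
            vk, vi = v[k], v[n + i]
            v[k] = c * vk + s * vi
            v[n + i] = -s * vk + c * vi
    return v[:d], v[d:]                    # top d rows give C x, the rest A x

rng = np.random.default_rng(2)
n, d = 5, 2                                # hypothetical dimensions
thetas = rng.uniform(-np.pi / 2, np.pi / 2, size=(n, d))
x = rng.standard_normal(n)
y, x_next = stack_times_vector(thetas, x)
print(np.isclose(y @ y + x_next @ x_next, x @ x))   # norm is preserved
```

In a state update, `y` would be the output contribution and `x_next` the homogeneous part of the next state; the input term would be added separately. Precomputing the cosines and sines would avoid repeated trigonometric evaluations.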

These orthogonal product representations have several advantageous properties:

1) is easy to compute.

2) Vector multiplication by and by require and operations, where is the stack.

3) Observability and stability are equivalent and is automatically satisfied.

4) The controllability matrix, , may be parameterized by its elements, , separately from the parameters of .

5) The observability Grammian is perfectly conditioned.

The final advantage is key for us. Many of the other well-known representations are very ill-conditioned [16]. A measure of the conditioning of a representation is the product of the condition number of the observability Grammian and the condition number of the controllability Grammian. As discussed in [16], balanced, input normal and output normal representations minimize this product of the condition numbers.

The fast filtering methods of [24] may be further sped up when or has the orthogonal product representations of this article. To transform a specific output pair to ON form, the dual Stein equation must be solved. The numerical conditioning of this problem can be quite poor [21, 16].

Which orthogonal product representation is most appropriate for my problem? Schur ON representations naturally display the eigenvalues of while the spectrum of must be numerically calculated when is OTSON or HOON. If the parameterization evolves in time, the form of the Schur representation changes when eigenvalues coalesce and the block structure of changes. Thus, for evolving representations, we prefer the OTSON and HOON representations. It is straightforward to impose the restrictions that in the OTSON form. If the problem requires derivatives of and , the Givens rotation parameterization of ORPs is usually simpler than Householder reflections.

In summary, these orthogonal product representations offer the best possible conditioning while having a convenient representation with fast matrix multiplication. Corresponding controller representations exist for input pairs, , that are input normalized.

## 9 Appendix A: Specifying the Real Schur Form

The real Schur representation is defined and described in [4, 6, 7]. We denote the number of complex conjugate pairs of eigenvalues by and the number of real eigenvalues by . Let for and for with , and define . The Schur form is

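$$\begin{pmatrix} Z_1 & R_{1,2} & R_{1,3} & \ldots & R_{1,M} \\ 0 & Z_2 & R_{2,3} & \ddots & R_{2,M} \\ 0 & \ldots & \ddots & \ddots & \vdots \\ \vdots & \ldots & 0 & Z_{M-1} & R_{M-1,M} \\ 0 & \ldots & 0 & 0 & Z_M \end{pmatrix} , \tag{9.1}$$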

where are matrices for and real scalars for . Here . Thus we explicitly require the complex conjugate eigenvalues to be placed ahead of the real eigenvalues for a matrix to be in Schur form. For identifiability, we need to uniquely specify the order of the blocks and the form of each block. Let be the eigenvalues of with being an eigenvalue of and being an eigenvalue of for .

###### Definition 9.1

Let be in real Schur form, (9.1), with ordered eigenvalues as described above. Then is in ordered Schur form if 1) for and for ; 2) If , then for and for ;

Definition 9.1 can be replaced by any other complete specification of the eigenvalue block order. Note that may be transformed by a product of Givens rotations: and still stay in Schur form. For identifiability, we also need to specify the form of each diagonal subblock. Let denote a diagonal subblock of a Schur :

$$Z_j = \begin{pmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{pmatrix} . \tag{9.2}$$

A common standardization of is to require , with and . We refer to this standardization of the two by two subblocks as block form since , the real part of the eigenvalues. The form is also known as standardized form [3].
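One way to bring a two by two block to a form with equal diagonal entries is a single plane rotation, sketched below. This is an illustration only: the function name is hypothetical, and the sketch equalizes the diagonal without enforcing any sign or ordering conditions on the off-diagonal entries, which the standardization above may additionally require.

```python
import numpy as np

def equalize_diagonal(z):
    """Rotate a 2 x 2 real block so that its two diagonal entries are equal."""
    z = np.asarray(z, dtype=float)
    p = 0.5 * (z[0, 0] - z[1, 1])          # half the diagonal difference
    q = 0.5 * (z[0, 1] + z[1, 0])          # symmetric off-diagonal part
    theta = 0.5 * np.arctan2(-p, q)        # makes p*cos(2t) + q*sin(2t) vanish
    c, s = np.cos(theta), np.sin(theta)
    r = np.array([[c, -s], [s, c]])
    return r.T @ z @ r, r

Z = np.array([[1.0, 2.0], [-3.0, 4.0]])    # complex eigenvalues, real part 2.5
Z_std, R = equalize_diagonal(Z)
print(Z_std)
```

Because the transformation is an orthogonal similarity, the trace is preserved, so the common diagonal value equals half the trace, which is the real part of the eigenvalue pair when the eigenvalues are complex.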

###### Theorem 9.2

Let and be matrices in real Schur form with ordered eigenvalues. Let and be orthogonally similar: with orthogonal. Let be the number of distinct eigenvalue pairs plus the number of distinct real eigenvalues. Partition , and into blocks corresponding to the repeated eigenvalue blocks. Then has block diagonal form: , where is orthogonal.

Proof:  From . If , then and have no common eigenvalues. By Lemma 7.1.5 of [7], . Repeating this argument shows for . By orthogonality, for . We continue this chain showing that for , etc. Proof by finite induction.

When has distinct eigenvalues, the block decomposition is precisely that of (9.1). When has an eigenvalue with multiplicity greater than one, the block decomposition groups the repeated eigenvalue blocks together. In the repeated eigenvalue case, not every block orthogonal transformation, , preserves the Schur form. We use the freedom of the orthogonal blocks to standardize the diagonal of the real Schur form:

###### Corollary 9.3

Let be a matrix with distinct eigenvalues. Then is orthogonally similar to a matrix, , in ordered real Schur form, and is unique up to diagonal unitary similarities: , where