 # A Rank-Preserving Generalized Matrix Inverse for Consistency with Respect to Similarity

There has recently been renewed recognition of the need to understand the consistency properties that must be preserved when a generalized matrix inverse is required. The most widely known generalized inverse, the Moore-Penrose pseudoinverse, provides consistency with respect to orthonormal transformations (e.g., rotations of a coordinate frame), and a recently derived inverse provides consistency with respect to diagonal transformations (e.g., a change of units on state variables). Another well-known and theoretically important generalized inverse is the Drazin inverse, which preserves consistency with respect to similarity transformations. In this paper we note that a limitation of the Drazin inverse is that it does not generally preserve the rank of the linear system of interest. We then introduce an alternative generalized inverse that both preserves rank and provides consistency with respect to similarity transformations. Lastly, we provide an example and discuss experiments which suggest the need for algorithms with improved numerical stability.


## I. Introduction

Many analytical and practical problems require a solution to an underdetermined or overdetermined system of equations in the form of a generalized matrix inverse $A^{\sim\text{-}1}$ of a singular matrix $A$ satisfying

$$A A^{\sim\text{-}1} A = A, \tag{1}$$

$$A^{\sim\text{-}1} A A^{\sim\text{-}1} = A^{\sim\text{-}1}. \tag{2}$$

In addition to these properties, a practical application may require the system to be consistent with respect to a general class of transformations. For example, if the behavior of the system should be invariant with respect to arbitrary rotations of the Cartesian coordinate frame in which it is defined then the appropriate generalized inverse should also satisfy

$$(UAV)^{\sim\text{-}1} = V^{*} A^{\sim\text{-}1} U^{*} \tag{3}$$

where $U$ and $V$ are arbitrary rotation matrices (or, more generally, arbitrary orthonormal/unitary matrices) such that $U^{*}U = V^{*}V = I$, where $U^{*}$ is the conjugate-transpose of $U$. This condition is satisfied by the Moore-Penrose inverse (MP inverse), which we will denote as $A^{\text{-}P}$. In other applications the behavior of the system of interest is expected to be invariant with respect to changes of units defined for its key state variables/parameters. For example, if all variables are defined consistently in imperial units then a change to metric units should have no effect on the behavior of the system, i.e., the system should be invariant to the arbitrary choice of units. This condition can be expressed as

$$(DAE)^{\sim\text{-}1} = E^{-1} A^{\sim\text{-}1} D^{-1} \tag{4}$$

where $D$ and $E$ are arbitrary nonsingular diagonal matrices. This condition is satisfied by a recently derived unit-consistent (UC) generalized inverse, which we will denote as $A^{\text{-}U}$.

An unfortunate fact is that in many scientific and engineering domains the MP inverse is widely used as the default generalized inverse without consideration for the properties it does and does not preserve. Examples include some forms of Channel Estimation and some applications of the Relative Gain Array (RGA). In particular, the RGA is used to measure process variable interactions defined by a matrix $A$ and is defined as

$$\mathrm{RGA}(A) \doteq A \circ (A^{\sim\text{-}1})^{\mathsf{T}}, \tag{5}$$

where $\circ$ represents the elementwise Hadamard matrix product. When $A$ is singular (or non-square) the RGA is often evaluated in practice as

$$A \circ (A^{\text{-}P})^{\mathsf{T}} \tag{6}$$

without regard for whether the relative magnitudes of interactions should be invariant to the choice of units on input/output variables. In many applications involving singular system matrices (likely most), unit consistency is needed, and thus the RGA should be evaluated using the UC inverse as

$$A \circ (A^{\text{-}U})^{\mathsf{T}}. \tag{7}$$

Failure to do so can lead to erroneous pairings of controlled and manipulated variables due to inappropriate dependencies on the choice of units.
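For a nonsingular matrix, Eq. (5) can be evaluated directly with the elementwise product. The following minimal numpy sketch (the matrix values are illustrative, not from the paper) also checks the classical property that each row and column of an RGA sums to one:

```python
import numpy as np

def rga(A):
    """Relative Gain Array of Eq. (5): the elementwise (Hadamard)
    product of A with the transpose of its inverse."""
    return A * np.linalg.inv(A).T

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
R = rga(A)                            # hand-checkable: [[2, -1], [-1, 2]]
print(R.sum(axis=0), R.sum(axis=1))   # each row and column sums to 1
```

When $A$ is singular, the argument above calls for substituting the UC inverse, rather than the MP inverse, into Eq. (5).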

In other applications there may be a need for a generalized inverse that is consistent with respect to an arbitrary matrix similarity transformation:

$$(SAS^{-1})^{\sim\text{-}1} = S A^{\sim\text{-}1} S^{-1}. \tag{8}$$

The need for this condition can be seen from a standard linear model

$$\hat{y} = A \cdot \hat{\theta}, \tag{9}$$

where the objective is to identify a vector $\hat{\theta}$ of parameter values satisfying the above equation for a data matrix $A$ and a known/desired state vector $\hat{y}$ defined in the same state space as $\hat{\theta}$. If $A$ is nonsingular then there exists a unique inverse $A^{-1}$ which gives the solution

$$\hat{\theta} = A^{-1} \cdot \hat{y}. \tag{10}$$

If $A$ is singular, however, then the Moore-Penrose inverse could be applied as

$$\hat{\theta} = A^{\text{-}P} \cdot \hat{y}, \tag{11}$$

or the UC inverse could be applied as

$$\hat{\theta} = A^{\text{-}U} \cdot \hat{y}, \tag{12}$$

to obtain a solution. Now suppose the state space of $\hat{y}$ and $\hat{\theta}$ is linearly transformed by a nonsingular matrix $S$ as

$$\hat{y}' = S\hat{y}, \tag{13}$$

$$\hat{\theta}' = S\hat{\theta}, \tag{14}$$

where the transformation could, for example, define tangent-space coordinates for a robot moving on a non-planar manifold or, similarly, for gradient descent in a given function space. Eq. (9) can then be rewritten in the new coordinates as

$$\hat{y}' = S\hat{y} = (SAS^{-1}) \cdot S\hat{\theta} = (SAS^{-1}) \cdot \hat{\theta}', \tag{15}$$

but for which

$$S\hat{\theta} \neq (SAS^{-1})^{\text{-}P} \cdot \hat{y}', \tag{16}$$

$$S\hat{\theta} \neq (SAS^{-1})^{\text{-}U} \cdot \hat{y}'. \tag{17}$$

In other words, the solution in the transformed space is not the same as the original solution transformed to the new space – a situation that represents failure of a fundamental sanity check. This is because the MP inverse and UC inverse do not satisfy similarity consistency.
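This sanity-check failure is easy to reproduce numerically. In the following sketch (the matrices are our own hand-checkable example, not from the paper), $SAS^{-1}$ happens to equal $A$ exactly, yet transforming the pinv-based solution gives a different result:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])    # singular (rank 1) and non-diagonalizable
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])    # nonsingular, non-orthogonal similarity
S_inv = np.linalg.inv(S)

lhs = np.linalg.pinv(S @ A @ S_inv)   # here S A S^{-1} equals A exactly
rhs = S @ np.linalg.pinv(A) @ S_inv
print(np.allclose(lhs, rhs))          # False: the MP inverse violates Eq. (8)
```

By hand: $\mathrm{pinv}(A) = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$, while $S\,\mathrm{pinv}(A)\,S^{-1} = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}$.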

Although the similarity-consistency (SC) condition of Eq. (8) can be satisfied using the Drazin inverse, $A^{\text{-}D}$, its use is not an option for most engineering-related systems. Specifically, although the Drazin inverse does have some practical applications [3, 10], its use is severely limited because $A^{\text{-}D}$ may have rank less than that of $A$, which is often problematic because it implies subspace information can be progressively and irretrievably lost. In this paper we address this limitation of the Drazin inverse by defining a new generalized inverse which satisfies the condition of Eq. (8) while also preserving the rank of the original matrix.

## II. The SC Generalized Matrix Inverse

One way to derive a generalized inverse that satisfies a particular consistency condition is to identify a normal form relative to that condition. For example, the singular value decomposition (SVD) decomposes an arbitrary matrix as

$$A = UDV^{*}, \tag{18}$$

where $U$ and $V$ are unitary and $D$ is a nonnegative diagonal matrix of the singular values of $A$. Critically, $D$ is unique up to permutation, so all matrices with the same set of singular values are equivalent up to left and right unitary transformations. This allows the MP inverse to be derived as

$$A^{\text{-}P} = V D^{\sim\text{-}1} U^{*}, \tag{19}$$

where $D^{\sim\text{-}1}$ simply inverts the nonzero elements of $D$. From this, the UC inverse can be similarly derived from a decomposition of the form

$$A = DSE, \tag{20}$$

where $D$ and $E$ are nonnegative diagonal matrices and $S$ is unique up to left and right transformations by unitary diagonal matrices. The UC inverse can then be obtained as

$$A^{\text{-}U} = E^{-1} S^{\text{-}P} D^{-1}, \tag{21}$$

where the MP inverse can be applied because $S$ is unique up to unitary diagonal transformations and the MP inverse is consistent with respect to general unitary transformations.
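The thresholded diagonal inversion used in Eqs. (19) and (21) can be sketched with a standard SVD; the relative tolerance below is an assumed tunable parameter, not something specified in the paper:

```python
import numpy as np

def mp_inverse(A, tol=1e-12):
    """Moore-Penrose inverse via the SVD of Eq. (18): invert only the
    singular values judged nonzero, per Eq. (19). The relative
    tolerance `tol` is an assumed zero threshold."""
    U, d, Vh = np.linalg.svd(A, full_matrices=False)
    cutoff = tol * d.max() if d.size else 0.0
    d_inv = np.array([1.0 / s if s > cutoff else 0.0 for s in d])
    return Vh.conj().T @ np.diag(d_inv) @ U.conj().T

# Agrees with numpy's built-in pinv on a singular (rank-1) matrix:
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.allclose(mp_inverse(A), np.linalg.pinv(A)))  # True
```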

Following a similar strategy, it can be recognized that the Jordan normal form of a square matrix is

$$A = PJP^{-1}, \tag{22}$$

where $P$ is an arbitrary nonsingular matrix of the same size as $A$, and $J$ is block-diagonal and unique up to permutation of the blocks. Thus we can derive from this decomposition the following similarity-consistent (SC) inverse:

$$A^{\text{-}S} = P J^{\text{-}P} P^{-1}, \tag{23}$$

where the MP inverse can be applied because $J$ is unique up to block-diagonal permutation transformations, which are of course unitary and thus permit the MP inverse to be consistently applied.
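The construction of Eq. (23) can be prototyped directly by pairing an exact symbolic Jordan decomposition with the MP inverse. The example matrix below is ours, chosen because it is nilpotent (so its Drazin inverse would be the zero matrix):

```python
import sympy as sp

def sc_inverse(A):
    """SC inverse of Eq. (23): A^{-S} = P J^{-P} P^{-1}, computed with
    sympy's exact Jordan decomposition and Moore-Penrose pseudoinverse."""
    P, J = sp.Matrix(A).jordan_form()   # A = P J P^{-1}
    return P * J.pinv() * P.inv()

A = sp.Matrix([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])              # nilpotent single Jordan block
A_S = sc_inverse(A)
print(A * A_S * A == A)                 # True: Eq. (1)
print(A_S * A * A_S == A_S)             # True: Eq. (2)
print(A_S.rank() == A.rank())           # True: rank is preserved
```

Exact rational arithmetic sidesteps the numerical-stability issues discussed in Section III.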

What remains is to formally show that this definition of $A^{\text{-}S}$ is in fact consistent with respect to similarity transformations, i.e.,

$$(SAS^{-1})^{\text{-}S} = S A^{\text{-}S} S^{-1}. \tag{24}$$

The Jordan decomposition of $A$ is

$$A = PJP^{-1}, \tag{25}$$

so the Jordan decomposition of $SAS^{-1}$ must be

$$SAS^{-1} = S(PJP^{-1})S^{-1} \tag{26}$$

$$SAS^{-1} = (SP)J(P^{-1}S^{-1}) \tag{27}$$

$$SAS^{-1} = (SP)J(SP)^{-1}, \tag{28}$$

and thus the SC inverse of $SAS^{-1}$ is

$$(SAS^{-1})^{\text{-}S} = ((SP)J(SP)^{-1})^{\text{-}S} \tag{29}$$

$$(SAS^{-1})^{\text{-}S} = (SP)J^{\text{-}P}(P^{-1}S^{-1}) \tag{30}$$

$$(SAS^{-1})^{\text{-}S} = S(PJ^{\text{-}P}P^{-1})S^{-1} \tag{31}$$

$$(SAS^{-1})^{\text{-}S} = SA^{\text{-}S}S^{-1}. \tag{32}$$

This establishes similarity consistency, and because the Moore-Penrose inverse preserves rank, the SC inverse does as well. The key difference between the SC inverse and the Drazin inverse is that the latter enjoys the analytically important property that:

$$A A^{\text{-}D} = A^{\text{-}D} A, \tag{33}$$

whereas product commutativity of $A$ and $A^{\text{-}S}$ is not a property of the SC inverse because $JJ^{\text{-}P}$ does not generally equal $J^{\text{-}P}J$.
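Both claims can be checked symbolically. The sketch below (example matrices ours) evaluates the SC inverse of the transformed matrix in the transformed Jordan basis of Eqs. (26)-(28) to confirm Eq. (24), and then shows that the commutativity of Eq. (33) does not carry over:

```python
import sympy as sp

def sc_inverse_from(P, J):
    """SC inverse evaluated from a given Jordan decomposition A = P J P^{-1}."""
    return P * J.pinv() * P.inv()

A = sp.Matrix([[0, 1], [0, 0]])   # singular and non-diagonalizable
S = sp.Matrix([[1, 1], [0, 1]])   # nonsingular similarity transform
P, J = A.jordan_form()            # A = P J P^{-1}, Eq. (22)
A_S = sc_inverse_from(P, J)

# Per Eqs. (26)-(28), (SP) J (SP)^{-1} is a Jordan decomposition of S A S^{-1},
# so evaluating the SC inverse in that basis reproduces Eq. (24) exactly:
B_S = sc_inverse_from(S * P, J)
print(B_S == S * A_S * S.inv())       # True: Eq. (24)

# Unlike the Drazin inverse, the SC inverse need not commute with A:
print(A * A_S == A_S * A)             # False
```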

## III. Example and Practical Considerations

Here we consider an illustrative numerical example involving a nilpotent matrix $M$

 (34)

with Jordan form $M = PJP^{-1}$ where

 (35)

It can be verified that

 (36)

where the Drazin inverse of a nilpotent matrix is always zero, which graphically demonstrates complete loss of rank. (The integer matrices in this section were constructed using the method of [7] so that inverses are also integer matrices. This allows the results to be more conveniently verified by direct hand calculations.) We now consider a matrix similar to $M$ with

 (37)

Similarity consistency for the Drazin inverse is trivially maintained because the Drazin inverse of any nilpotent matrix is zero. By contrast, it can be verified that the rank of the SC inverse

 (38)

is the same as that of $M$ and that the similarity consistency condition is satisfied. And of course it can also be verified that the SC inverse satisfies the generalized inverse conditions of Eqs. (1) and (2). When this example is numerically evaluated using the Matlab jordan(M) and pinv(M) functions the SC inverse is found to be accurate to machine precision. However, when applied in two different high data-rate robotic control applications the observed errors using the SC inverse were sufficiently large to cause the control process to diverge rapidly. (These simulation experiments were performed using variations on models for a robot arm and a rover examined in [14]. Unit consistency was the focus of that work, and the two failed tests of the SC inverse were neither analyzed nor reported.)
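The specific integer matrices of Eqs. (34)-(38) are not reproduced above, so the following sympy sketch uses hypothetical stand-ins with the same structure: a nilpotent Jordan matrix $J$ and a unit-triangular integer $P$ whose inverse is also an integer matrix:

```python
import sympy as sp

# Hypothetical stand-ins for the paper's integer matrices:
J = sp.Matrix([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])       # nilpotent Jordan block
P = sp.Matrix([[1, 1, 0],
               [0, 1, 1],
               [0, 0, 1]])       # unit triangular, so P^{-1} is also integer
M = P * J * P.inv()              # nilpotent, rank 2

# The Drazin inverse of any nilpotent matrix is the zero matrix,
# whereas the SC inverse M^{-S} = P J^{-P} P^{-1} keeps rank 2:
M_S = P * J.pinv() * P.inv()
print(M**3 == sp.zeros(3, 3))    # True: M is nilpotent
print(M_S.rank() == M.rank())    # True: no rank is lost
```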

The numerical instability of the Jordan decomposition is well known [6], and this problem is exacerbated when it is combined with the MP inverse because large errors can be introduced into singular values that should be zero, which can lead to huge spike errors when those values are inverted. It is not clear to what extent this issue can be mitigated when the two operations are used in succession to compute the SC inverse. In many practical dynamical-system applications the system matrix changes relatively slowly over time, so it may be possible to track the evolution of its distribution of singular values to actively avoid catastrophic discontinuities when a singular value sits near the threshold between zero and nonzero at the point where the MP inverse is applied during evaluation of the SC inverse. More specifically, a tracked presumed-zero singular value that drifts above the MP threshold can be maintained as zero, while a tracked presumed-nonzero singular value can be maintained as nonzero even if it drifts below the threshold.
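One way to realize this tracking idea is a hysteresis rule on the zero/nonzero classification of singular values. The tolerance, band width, and matching of values by sorted order below are all illustrative assumptions, not from the paper:

```python
import numpy as np

def classify_zero_svals(d, prev_mask=None, tol=1e-8, band=10.0):
    """Hysteresis sketch of the tracking idea: inside an ambiguous band
    around the zero threshold, a singular value keeps the label it had
    at the previous timestep; clearly zero or clearly nonzero values
    are re-labeled by a plain threshold."""
    thresh = tol * d.max()
    clear = (d <= thresh / band) | (d >= thresh * band)
    mask = d <= thresh            # True means "treat as zero" in the MP inverse
    if prev_mask is None:
        return mask
    return np.where(clear, mask, prev_mask)

m = classify_zero_svals(np.array([1.0, 1e-9]))     # second value labeled zero
m = classify_zero_svals(np.array([1.0, 5e-8]), m)  # it drifted above the
print(m)                                           # threshold but keeps its label
```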

A potentially more general approach for mitigating some of the numerical fragility is to either jointly solve the combined Jordan-plus-MP calculation or to relax the form of the decomposition, e.g., so that the central matrix is unique only up to orthonormal similarity to a complex symmetric matrix rather than permutation similarity to a Jordan matrix. For example, any matrix can be decomposed [8] as

$$A = SXS^{-1}, \tag{39}$$

where $X$ is complex symmetric, and thus the SC inverse can be defined as:

$$A^{\text{-}S} = S X^{\text{-}P} S^{-1}. \tag{40}$$

That the decomposition of Eq. (39) is only unique up to unitary/orthonormal similarity can be inferred from the fact that for complex symmetric $Y$ and orthonormal $Q$ (i.e., $Q^{\mathsf{T}}Q = I$), $X = QYQ^{\mathsf{T}}$ implies

$$A^{\text{-}S} = S X^{\text{-}P} S^{-1} \tag{41}$$

$$A^{\text{-}S} = S (QYQ^{\mathsf{T}})^{\text{-}P} S^{-1} \tag{42}$$

$$A^{\text{-}S} = (SQ) Y^{\text{-}P} (SQ)^{-1}, \tag{43}$$

where the last step is obtained by exploiting the unitary consistency property of the MP inverse. It is reasonable to expect that this less rigid definition will be more amenable to numerically-robust evaluation than is possible using a construction requiring an explicit Jordan decomposition.

## IV. Fixed-Rank Assumption

In some applications it is known that the rank of the time-evolving system matrix can potentially change, e.g., during transitions through gimbal-lock configurations, while in others it can be expected to remain constant. In the latter case a fixed-rank assumption can be exploited to mitigate the problem of spurious singular values when applying the MP inverse to the Jordan matrix during evaluation of the SC inverse. Specifically, if a fixed rank $r$ is known/assumed then the MP inverse can be evaluated explicitly using the SVD with all but the $r$ largest singular values set identically to zero, instead of having to discriminate with a threshold.
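Under the fixed-rank assumption the threshold decision disappears entirely; a minimal sketch (the function name and test matrix are ours) is:

```python
import numpy as np

def fixed_rank_pinv(A, r):
    """MP inverse under a known/assumed fixed rank r: invert only the r
    largest singular values and force the rest to exactly zero, so no
    zero/nonzero threshold decision is ever made."""
    U, d, Vh = np.linalg.svd(A, full_matrices=False)  # d sorted descending
    d_inv = np.zeros_like(d)
    d_inv[:r] = 1.0 / d[:r]
    return Vh.conj().T @ np.diag(d_inv) @ U.conj().T

# A spurious near-zero singular value (1e-20) is suppressed exactly:
A = np.diag([3.0, 2.0, 1e-20])
print(np.allclose(fixed_rank_pinv(A, 2), np.diag([1/3, 1/2, 0.0])))  # True
```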

The efficacy of exploiting the fixed-rank assumption can be demonstrated in a carefully controlled experiment involving a dynamical 3-dimensional system that is confined to a 2-dimensional subspace, such that the system matrix can be assumed to have a fixed rank of 2. The testing model can be further tailored so that the system matrix is guaranteed to never be diagonalizable and thus is maximally susceptible to the numerical issues discussed in the previous section. We achieve this by defining the Jordan form of the system as a function of time as

$$A(t) = (Q(t)S)\,J\,(Q(t)S)^{-1} \tag{44}$$

with

$$J = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad Q(t) = \begin{bmatrix} \cos(t) & -\sin(t) & 0 \\ \sin(t) & \cos(t) & 0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{45}$$

and $S$ is an arbitrary fixed nonsingular matrix. Thus the system matrix is singular and has an orbit period of $2\pi$ during which it is never diagonalizable. The plot of Figure 1 shows the maximum absolute elementwise deviation between the computed SC inverse and the ground-truth known SC inverse for each of 300 uniform timesteps over one period. It is seen that spike errors occur at a significant fraction of the timesteps. These spikes are actually of magnitude $\approx 10^{14}$ and result from inverting spurious singular values that exceed pinv's zero threshold. In tests of many randomly-generated matrices $S$ the number of spike errors over the entire 300-timestep simulation ranged from zero to a substantial fraction of the timesteps, with Figure 1 showing an example of the latter.

Fig. 1: Error magnitudes (vertical axis) resulting from standard evaluation of the SC inverse from the Jordan form over 300 timesteps (horizontal axis). Clipped magnitudes are actually of size $\approx 10^{14}$.

Figure 2 shows the results of the same scenario but with the smallest singular value of the Jordan matrix always treated as identically zero for computation of the SC inverse at each timestep. The results shown in Figure 2 are typical of those observed for the fixed-rank method in the battery of tests over many randomly-generated matrices. By design it is likely that the model used for these tests overestimates the frequency of spike errors that can be expected from standard evaluation of the SC inverse in real-world applications, but because any such errors will be time-correlated it likely will not be practical to simply identify and discard them (e.g., apply some sort of heuristic outlier filter) on the assumption that they will only occur sporadically. In other words, the intrinsic numerical instability of the SC inverse cannot be mitigated in the general case. However, the results of Figure 2 suggest that the problem may be effectively mitigated in a large class of applications.

Fig. 2: Error magnitudes (vertical axis) resulting from fixed-rank evaluation of the SC inverse from the Jordan form over 300 timesteps (horizontal axis). The mean error magnitude is 1.4420e-07 and the maximum magnitude is 2.3101e-06.

## V. Discussion

Similarity transformations capture the most general linear transformations that can be applied to the common coordinate frame of a set of input and output variables, e.g., as can arise in machine learning applications in which little is known about the best space in which to represent an input-output mapping. In this paper we defined a new similarity-consistent generalized matrix inverse that preserves the rank of the original matrix, a property which is not preserved by the Drazin inverse. It has been noted that the numerical evaluation of the SC inverse can be a challenging practical limitation. One possible approach for addressing this problem is to relax the structure of the decomposition from requiring similarity to a Jordan matrix to similarity to a complex symmetric matrix for which uniqueness is only required up to orthonormal similarity, rather than permutation similarity, so the MP inverse is still applicable. This less-rigid formulation may be more amenable to numerically-robust evaluation, especially in conjunction with the fixed-rank method. Regardless of the choice of decomposition, it would be interesting to examine whether recent methods developed for evaluating the Drazin inverse can be applied to the SC inverse [11, 9].

Despite its computational challenges, the SC inverse is the only general option available for applications that demand consistency with respect to arbitrary linear transformations of a common coordinate frame. We have provided evidence that these challenges can be effectively addressed in applications in which the rank of the system can be assumed fixed during its time evolution, but we have also emphasized that intrinsic numerical instabilities represent a fundamental obstacle in the general case. This motivates continued work toward identifying special properties of particular applications that can be exploited to permit the SC inverse to be practically applied.

## References

•  Ben-Israel, Adi; Greville, Thomas N.E., "Generalized Inverses: Theory and Applications," New York: Springer, 2003.
•  E. Bristol, “On a new measure of interaction for multivariable process control,” IEEE Transactions on Automatic Control, vol. 11, no. 1, pp. 133-134, 1966.
•  S.L. Campbell, C.D. Meyer, and N.J. Rose, "Applications of the Drazin Inverse to Linear Systems of Differential Equations with Singular Constant Coefficients," SIAM Journal on Applied Mathematics, Vol. 31, No. 3, 1976.
•  J. Chang and C. Yu, "The relative gain for non-square multivariable systems," Chemical Engineering Science, vol. 45, no. 5, pp. 1309-1323, 1990.
•  Drazin, M. P., “Pseudo-inverses in associative rings and semigroups”, The American Mathematical Monthly, 65 (7): 506-514, 1958.
•  G. Golub and C. Van Loan, Matrix computations, Baltimore: Johns Hopkins, 2013.
•  Robert Hanson, "Matrices Whose Inverse Contain Only Integers," The Two-Year College Mathematics Journal, Vol. 13, No. 1, 1982.
•  R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, 1999.
•  V.Y. Pan; F. Soleymani; L. Zhao; “Highly Efficient Computation of Generalized Inverse of a Matrix,” Applied Mathematics and Computation, 316, 2016.
•  B. Simeon, C. Fuhrer, P. Rentrop, “The Drazin inverse in multibody system dynamics”, Numer. Math, 64, pp. 521-539, 1993.
•  F. Toutounian and R. Buzhabadi, “New Methods for Computing the Drazin-Inverse Solution of Singular Linear Systems,” Applied Mathematics and Computation, 294:343–352, 2017.
•  Jeffrey Uhlmann, “Unit Consistency, Generalized Inverses, and Effective System Design Methods,” arXiv:1604.08476v2 [cs.NA] 11 Jul 2017 (2015).
•  Jeffrey Uhlmann, “A Generalized Matrix Inverse that is Consistent with Respect to Diagonal Transformations,” SIAM Journal on Matrix Analysis (SIMAX), Accepted 2018.
•  Bo Zhang and Jeffrey Uhlmann, “Applying a New Generalized Matrix Inverse For Stable Control of Robotic Systems,” (submitted to IEEE Trans. on Robotic Systems, 2018).