## 1 Introduction

Spectral decomposition of real-valued matrices (eigendecomposition) is a task of utmost importance for mathematicians, physicists and engineers. Specifically, the decomposition of matrices plays a central role in three dimensional space as it is characteristic to many real-world application contexts. This special spectral decomposition problem was studied for centuries, especially due to its close relation to the roots of the cubic equation. Closed-form solutions found increasingly more use with the advent of computers and powerful symbolic systems as Mathematica [18] or SymPy [10]

. The symbolic approach represents an efficient tool for computation of eigenvalues and eigenvectors while preserving their functional dependence on the matrix elements. It has the potential to be exploited together with automatic differentiation (AD) in problems where derivatives of spectral decomposition are required (e.g. non-linear problems in principal space, sensitivity analysis).

Unfortunately, when results of such eigendecomposition are implemented and evaluated in
computer software with finite precision arithmetic, not all closed-form approaches are
equivalent^{1}^{1}1It might sound confusing to talk about a symbolic algorithm having finite precision issues.
What is meant here is that when a symbolic expression gets evaluated for finite precision
inputs the result will contain rounding error.
It is implicitly assumed that the symbolic engine which evaluates expressions does not
perform any aggressive simplifications or optimisations on the expression tree..
Very few existing papers address accuracy and sensitivity of this decomposition in the
context of a symbolic algorithm, see e.g. [8, 4].
Papers that study finite precision accuracy of closed-form roots to the cubic equation are more
common, but less so in the context of eigenvalues and eigenvectors, [7]

. Heuristic approaches were developed by the engineering community in order to overcome rounding issues, but none effectively improves the accuracy of results, see

[15, 11, 6].While well-established iterative methods can provide eigenvalues and eigenvectors up to very high precision, their inherent nature (presence of loops, non-predetermined number of required iterations, stopping criteria and conditionals) renders them not suitable for use within symbolic algorithms. Common implementations of iterative schemes can be found in LAPACK [1], Numerical Recipes in C [14] or GNU Scientific Library [2].

The aim of this paper is to study rounding errors in closed-form solution to the spectral decomposition. Alternative – but mathematically equivalent – expressions are sought such that rounding effects are diminished. An ideal set of symbolic expressions must not require the evaluation of loops or taking advantage of any finite precision specific tricks, such that the resulting mathematical expression is robust and ready for direct use within a symbolic framework.

## 2 Spectral decomposition

For a diagonalisable matrix the multiplicative spectral decomposition over the real numbers

(1) |

is given in terms of the real-valued matrices and (which, respectively, contain the right eigenvectors column-wise and the left eigenvectors row-wise) and the diagonal matrix , holding the real-valued eigenvalues . The (scaled) eigenvectors in and fulfil the requirement . The equivalent additive real-valued spectral decomposition

(2) |

can be written in terms of the product of the eigenvalue and its associated eigenprojector . Independent of eigenvalue multiplicities, distinct eigenprojectors could always be found such that they have then the following properties

(3) |

### 2.1 Eigenvalues and discriminant

The formulation of the eigenvalue problems

(4a) | ||||

(4b) |

(or, alternatively, and ) leads to the characteristic polynomial

(5) |

of matrix . The discriminant of the characteristic polynomial is defined as the product of the squared distances of the roots

(6) |

and is zero in the case of repeated eigenvalues. The discriminant associated with matrix is a function of the matrix elements and it has been shown by Parlett [13] that the discriminant can be expressed as the determinant of a symmetric matrix

(7) |

with elements for and a factorisation into the matrix and the matrix , which are – similar to the Vandermonde matrix – constructed from powers of as follows (the operator represents column-stacking)

(8) |

Using the Cauchy-Binet formula for the determinant of a matrix product, the discriminant is identified as a sum-of-products

(9) |

where the determinants of the minors, and

, are collected in the vectors

and , respectively. The expansion (9) has nonzero terms of which several may occur repeatedly. Thus, a condensed form of the sum-of-products discriminant representation can be established(10) |

in which and contain only unique factors and the diagonal matrix holds the respective product multiplier.

#### 2.1.1 Sub-discriminants

The expression for the discriminant based on the determinant of matrix suggests a generalisation into certain invariants (in this paper called sub-discriminants). The matrix is factored into two rectangular matrices which represent column and row-stacked powers . The sub-discriminant corresponding to multi-indices and is defined based on subsets and of powers, i.e.

(11) |

where

(12) |

It is important to note, that all sub-discriminants are proper invariants of matrix (they are invariant under similarity transformations). Moreover, each component of the matrix is an invariant, since for all and all .

Following its definition simple identities are identified,

(13) | ||||

(14) |

### 2.2 Eigenprojectors

The eigenvalues – as roots of the characteristic polynomial – are nonlinear functions of the elements of . A relation for the eigenprojector is obtained by forming the inner product of the total differential of (4b) with and by using equation (4a) together with identities (3) as follows

(15a) | ||||

(15b) | ||||

(15c) | ||||

(15d) | ||||

(15e) | ||||

(15f) |

from which, for arbitrary , one extracts an expression for the eigenprojector

(16) |

in terms of differentiation through the eigenvalue. If is available as differentiable symbolic expression (as it is the object of this paper), then Equation (16

) presents advantages over the direct computation of the Frobenius covariants as Lagrange interpolants

(17) |

obtained as a consequence of Equations (2) and (3) for strictly distinct eigenvalues . Alternatively, the eigenprojectors can be computed from

(18) |

where is a canonical unit vector (having in the -th component and elsewhere) and the operator mapping vectors into diagonal matrices. In this case matrices and must be otherwise known, so this approach is useful only for testing purposes.

## 3 Spectral decomposition of matrices

In the following we consider the case of a matrix with real-valued elements and under the (above stated) assumption that is diagonalisable over the real numbers.

### 3.1 Eigenvalues

The characteristic polynomial of a matrix

(19) |

can be expressed in terms of the principal invariants of

(20) |

or in terms of its three roots , and .

#### 3.1.1 Roots of the characteristic polynomial

In order to facilitate the identification of the roots of the cubic equation the substitution

(21) |

is introduced, with to be determined. This leads, in a first step, to

and is transformed to

using the triple-angle trigonometric identity . In the above expression for the factors to and become zero for suitable choices of and . The non-trivial solution of the associated system of equations

(22) |

is given by

(23) |

and allows to eliminate and in . This provides

and with

(24) |

the relation to determine the cosine of the triple angle

(25) |

with .

The three eigenvalues of matrix

(26) |

are then highly nonlinear functions of the principal matrix invariants.

#### 3.1.2 Discriminant and sub-discriminants

The discriminant of the (cubic) polynomial can be expressed in terms of its coefficients (referred to as “naive” expression in this paper)

(27a) | ||||

(27b) |

where invariants and are defined as

(28a) | ||||

(28b) |

Alternatively, for the condensed sum-of-products representation of the discriminant, see Equation (10), is given by 14 products with

(29a) | ||||

(29b) | ||||

(29c) |

It should be noted that *symmetric* matrices allow for further reduction of the number
of products and presentation of the discriminant as a sum-of-squares, as demonstrated
by Kummer [9] (7 squares) and Watson [17] (5 squares).

With the notion of sub-discriminants it can be shown that

(30) |

It follows from

(31) |

With similar factorisation into sum-of-products an alternative expression for the invariant reads

(32a) | ||||

(32b) | ||||

(32c) |

A similar expression cannot be identified for , however, this invariant can partially be expressed on the basis of sum-of-products as follows

(33) |

## 4 Finite precision and rounding errors

The above formulas for the computation of eigenvalues and eigenprojectors are not all equivalent when finite precision floating point arithmetic is involved. This is, of course, the case when these expressions are implemented in a computer program. The discrepancies will be explained and addressed below.

Machine epsilon, , is here defined as the difference between 1 and the next larger floating point number. Let be the radix (base) and a precision of a floating point number representation, then . For example, the standard double precision floating point representation used in this paper has . Only the magnitude of the rounding error is considered in this paper, so there is no distinction made between units in the last place and relative error in terms of machine epsilon.

### 4.1 Rounding errors in principal invariants

An inherent property of the analytical approach to spectral decomposition follows as a consequence of its use of matrix invariants and . As noted in [7], this approach is not suitable when the distances between eigenvalues are large. As a simple example, consider a matrix with and . Any computation which then makes use of matrix trace will suffer from rounding error (in standard double precision), to the effect that the presence of the two small eigenvalues is not measurable in the matrix trace.

Due to non-associativity of the floating point arithmetic the case with eigenvalues and would also introduce large relative error into the computation of the trace . Here, denotes floating point representation of number . Such issues could be solved using techniques as the Kahan summation algorithm (or the improved Kahan-Babuška algorithm [12]). Unfortunately, tricks based on benign cancellation are not applicable when developing a symbolic algorithm, since the actual execution order of floating point operations is not strictly preserved.

In [7] the alternative use of iterative methods is recommended (Jacobi/QR and QL). These methods, however, are not relevant in the context of symbolic computation.

### 4.2 Catastrophic cancellation in the discriminant

Catastrophic cancellation is a numerical phenomenon where the subtraction of the good approximation to two close numbers results in a bad approximation in the result. The formula for discriminant based on principal invariants (Equation (27a) and (27b)) suffers from this effect.

Let be a diagonalisable traceless matrix with real eigenvalues and . The parameter is a distance between two largest eigenvalues and for there is . For this matrix

If is smaller than machine epsilon, then the computation and storage of and introduces a rounding error, such that

The discriminant computed using formula based on principal invariants (27a) then leads to

This situation occurs for standard double precision when . In other words, the discriminant computed from the naive expression (27a) becomes insignificant when the distance between two eigenvalues is smaller than half precision. The loss of precision could be avoided if a different expression for the discriminant is used, which motivates the following discussion.

Fortunately, the sum-of-products expression for the discriminant (Equation (10)) does not suffer from catastrophic cancellation when the discriminant goes to zero. This observation is the foundation for computation of eigenvalues (and determination of eigenvalues multiplicity) with improved accuracy and is proved in the following.

For the special case of *symmetric* matrix, sum-of-products reduces to sum-of-squares,
i.e. the vectors and are equal.
Assuming that the distance between the two closest eigenvalues is some , the
discriminant is proportional to (since the discriminant is the square of eigenvalue
distances) and every (positive) summand in the scalar product
is thus either zero or a small number
proportional to .
Every non-zero element in the vector is therefore proportional to and
could be computed up to machine precision. Finally, the sum-of-squares of small numbers of the same
magnitude does not introduce a significant rounding error.

In the general *non-symmetric* case, it must be shown that each non-zero element in
and approaches zero dominated by terms linear in .
First, the limit case for (i.e. ) is proven.
Equivalently, it is required that

(34) |

where runs through all square sub-matrices of and . This is a stronger statement than which follows easily from Equation (7).

It could be shown, that for a diagonalisable matrix , which has zero discriminant, the following matrices

(35) |

are linearly dependent^{2}^{2}2Matrix is a root of its minimal polynomial, which is of order .
For a diagonalisable matrix every eigenvalue has its algebraic multiplicity equal to
geometric multiplicity.
Therefore, a zero discriminant implies that the degree of minimal polynomial is strictly smaller
than the matrix size ..
In other words, there exist coefficients such that

(36) |

Alternatively, (36) could be written using the operator as

(37) |

This equation states that the rows in matrix are linearly dependent, thus the rows in all sub-matrices are linearly dependent too. Hence, for each . The same observation applies to the transpose of (36) and columns of , so that equally, . Moreover, since the discriminant is a continuous function of distances between eigenvalues – follows from its definition (6) –, if then each and . Finally, since discriminant is proportional to , each non-zero element in and is proportional to .

##### Note:

### 4.3 Catastrophic cancellation in the invariant

The invariant plays a foremost role in the computation of eigenvalues. For being a diagonalisable matrix with eigenvalues and one obtains . For this matrix

Similarly to arguments for cancellation in the discriminant, when is computed and stored for smaller than machine epsilon, one observes that

Thankfully, the sum-of-products expression for , see Equation (32), does not suffer from catastrophic cancellation by similar arguments as used in the proof for the discriminant. It could be shown that for proportional to every non-zero term in vectors and is proportional to . It is a consequence of the linear dependence of matrices and , which follows from that fact that degree of minimal polynomial in the case of matrix with .

### 4.4 Rounding error in the triple angle

The evaluation of the triple angle in Equation (25) suffers from dangerous rounding errors in the vicinity of . In order to demonstrate these effects, let again consider a traceless matrix with eigenvalues and . Direct inversion of (25) gives

(38) |

and series expansion of the inverse cosine argument around the critical point (i.e. ) for the chosen example matrix leads to

Similar to the discussion above, when the argument is evaluated and stored in floating point representation rounding error becomes significant if the eigenvalue distance is smaller than square root of machine epsilon. This effect could be avoided in two ways.

The first approach is based on the generalised (Puiseux) series approximation to the inverse cosine

(39) |

around point . The presence of the square root in the expansion suggests a possible series approximation of the angle which would contain linear terms in the eigenvalue distance (or equivalently, terms proportional to ). Indeed, the expansion of the angle around reads

Comments

There are no comments yet.